I am new to python and have until now written only a few programs to help with my job (I'm a sysadmin). I am writing this script now which will write the output of a MySQL query to a file. While in-between looping, I want to check for an extra condition and if the condition does not match, I want to close the file that I am writing to without saving what it has already written to the file. Like 'exit without saving'. I wrote this simple code to see if not closing the file with a close() will exit without saving, but it is creating the file with the content after I run and exit this code. So, is there a legal way in Python to exit a file without saving?
#/usr/bin/python
fo=open('tempfile.txt','a')
fo.write('This content\n')
P.S:- Python version is 2.4.3 (sorry, cannot upgrade)
There is no such concept in programming.
For the vast majority of the programming languages out there, the write command will attempt to put data directly in the file. This may or may not occur instantly for various reasons so many languages also introduce the concept of flush which will guarantee that your data is written to the file
What you want to do instead is to write all your data to a huge buffer (a string) then conditionally write or skip writing to the file.
Use the tempfile module to create your temporary, then if you need to save it you can do so explicitly using shutil.copyfileobj.
See Quick way to save a python TempFile?
Note that this is only if you absolutely need a temporary file (large amounts of data, etc.); if your contents are small then just using a stringbuffer and only writing it if you need to is a better approach.
Check for the condition before opening the file:
#/usr/bin/python
if condition():
fo=open('tempfile.txt','a')
fo.write('This content\n')
The safest way to do this is not to write to, or even open, the file until you know you want to save it. You could, for example, save what you might eventually write to a string, or list of same, so you can write them once you've decided to.
Related
This has been asked, and it has been answered. Though, the answers do not favor my situation or what I'm trying to achieve.
I have 100's of text files I need to update. I need to update the same one line in each file. I want to open up a text file and only modify a single line. I do not want to re-write the text file using write() or writelines(). I also do not want to use fileinput.input() because that too is re-writing the text file using print statements.
The files contain thousands of lines of critical data and I cannot trust that Python will recreate everything correctly in the files (I understand this will probably never happen). There are several lines that pertain to footer data which is non-critical and that's what I am updating.
How can one update a single line in a text file, without recreating the file. Appending one line in a text must be possible.
Thanks in advance
I don't think this is possible in any programming language at the file level. Files simply don't work that way -- especially text-based files which you are likely replacing with different-length data in the middle. You would need raw disk level access (making this a stupidly difficult problem).
If you really want to pursue it, check the raw disk question here:
Is it possible to get writing access to raw devices using python with windows?
EVEN THEN: I'm pretty sure at some level AT LEAST an entire block of data will be read from the drive and re-written (this is physically how drives work if I recall).
I have a python script which starts by reading a few large files and then does something else. Since I want to run this script multiple times and change some of the code until I am happy with the result, it would be nice if the script did not have to read the files every time anew, because they will not change. So I mainly want to use this for debugging.
It happens to often, that I run scripts with bugs in them, but I only see the error message after minutes, because the reading took so long.
Are there any tricks to do something like this?
(If it is feasible, I create smaller test files)
I'm not good at Python, but it seems to be able to dynamically reload code from a changed module: How to re import an updated package while in Python Interpreter?
Some other suggestions not directly related to Python.
Firstly, try to create a smaller test file. Is the whole file required to demonstrate the bug you are observing? Most probably it is only a small part of your input file that is relevant.
Secondly, are these particular files required, or the problem will show up on any big amount of data? If it shows only on particular files, then once again most probably it is related to some feature of these files and will show also on a smaller file with the same feature. If the main reason is just big amount of data, you might be able to avoid reading it by generating some random data directly in a script.
Thirdly, what is a bottleneck of your reading the file? Is it just hard drive performance issue, or do you do some heavy processing of the read data in your script before actually coming to the part that generates problems? In the latter case, you might be able to do that processing once and write the results to a new file, and then modify your script to load this processed data instead of doing the processing each time anew.
If the hard drive performance is the issue, consider a faster filesystem. On Linux, for example, you might be able to use /dev/shm.
I have a Python program that had some kind of error that prevents it from saving my data. The program is still running, but I cannot save anything. Unfortunately, I really need to save this data and there seems to be no other way to access it.
Does the DMP file created for the process through the task manager contain the data my program collected, and if so, how do I access it?
Thanks.
Does it contain some or all of the current execution state of your program? Yes. Is it in a form that you could easily extract the information in the user-level format you are probably looking for from it? Probably not. It will dump the state of the entire Python interpreter, including the data as represented in memory for the specific Python program that is running. To reconstruct that data, I'm pretty sure you'd need to run the Python interpreter itself in debug mode, then try to reconstruct your data from whatever your C debugger can piece together. If this sounds very difficult or impossible to you, then you probably have some understanding of what it entails.
I have a file that I want to read. The file may at any time be overwritten by another process. I do not want to block that writing. I am prepared to manage corruption to the data that I read, but do not want my reading to be in any way change the behaviour of the writing process.
The process that is writing the file is a delphi program running locally on the server. It opens the file using fmCreate. fmCreate tries to open the file exclusively and fails if there are any other handles on the file.
I am reading the file from a python script that accesses the file remotely across our network.
I am interested in whether there is a solution, independent of whether it is supported by python or delphi. I want to know if there is any way of achieving this under windows without modifying the writing program.
Edit: To reiterate, this is not a duplicate. The other question was trying to get read access to a file that is being written to. I want to the writer to have access to a file that I have open for reading. These are different questions (although I fear the answer will be similar, that it can't be done.)
I think the real answer here, all of these years later, is to use opportunistic locks. With this, you can open the file for read access, while telling the OS that you want to be notified if another program wants to access the file. Basically, you can use the file as long as you like, and then back off if someone else needs it. This avoids the sharing/access violation that the other program would normally get, if you had just opened the file "normally".
There is an MSDN article on Opportunistic Locks. Raymond Chen also has a blog article about this, complete with sample code: Using opportunistic locks to get out of the way if somebody wants the file
The key is calling the DeviceIoControl function, with the FSCTL_REQUEST_OPLOCK flag, and passing it the handle to an event that you previously created by calling CreateEvent.
It should be straightforward to use this from Delphi, since it supports calling Windows API functions. I am not so sure about Python. But, given the arrangement in the question, it should not be necessary to modify the Python code. Just make your Delphi code use the opportunistic lock when it opens the file, and let it get out of the way when the Python script needs the file.
Also much easier and lighter weight than a filter driver or the Volume Shadow Copy service.
You can setup a filter driver which can act in two ways: (1) modify the flags when the file is opened, and (2) it can capture the data when it's written to the file and save a copy of the data elsewhere.
This approach is much more lightweight and efficient than volume shadow copy service, mentioned in comments, however it requires having a filter driver. There exist several drivers on the market (i.e. those are products which include a driver and let you write business logic in user mode) yet they are costly and can be an overkill in your case. Still, if you need the thing for private use only, contact me privately for a license for our CallbackFilter.
Update: if you want to let the writer open the file which has been already opened, then a filter which will modify flags when the file is being opened is your only option.
I am writing a script that will be polling a directory looking for new files.
In this scenario, is it necessary to do some sort of error checking to make sure the files are completely written prior to accessing them?
I don't want to work with a file before it has been written completely to disk, but because the info I want from the file is near the beginning, it seems like it could be possible to pull the data I need without realizing the file isn't done being written.
Is that something I should worry about, or will the file be locked because the OS is writing to the hard drive?
This is on a Linux system.
Typically on Linux, unless you're using locking of some kind, two processes can quite happily have the same file open at once, even for writing. There are three ways of avoiding problems with this:
Locking
By having the writer apply a lock to the file, it is possible to prevent the reader from reading the file partially. However, most locks are advisory so it is still entirely possible to see partial results anyway. (Mandatory locks exist, but a strongly not recommended on the grounds that they're far too fragile.) It's relatively difficult to write correct locking code, and it is normal to delegate such tasks to a specialist library (i.e., to a database engine!) In particular, you don't want to use locking on networked filesystems; it's a source of colossal trouble when it works and can often go thoroughly wrong.
Convention
A file can instead be created in the same directory with another name that you don't automatically look for on the reading side (e.g., .foobar.txt.tmp) and then renamed atomically to the right name (e.g., foobar.txt) once the writing is done. This can work quite well, so long as you take care to deal with the possibility of previous runs failing to correctly write the file. If there should only ever be one writer at a time, this is fairly simple to implement.
Not Worrying About It
The most common type of file that is frequently written is a log file. These can be easily written in such a way that information is strictly only ever appended to the file, so any reader can safely look at the beginning of the file without having to worry about anything changing under its feet. This works very well in practice.
There's nothing special about Python in any of this. All programs running on Linux have the same issues.
On Unix, unless the writing application goes out of its way, the file won't be locked and you'll be able to read from it.
The reader will, of course, have to be prepared to deal with an incomplete file (bearing in mind that there may be I/O buffering happening on the writer's side).
If that's a non-starter, you'll have to think of some scheme to synchronize the writer and the reader, for example:
explicitly lock the file;
write the data to a temporary location and only move it into its final place when the file is complete (the move operation can be done atomically, provided both the source and the destination reside on the same file system).
If you have some control over the writing program, have it write the file somewhere else (like the /tmp directory) and then when it's done move it to the directory being watched.
If you don't have control of the program doing the writing (and by 'control' I mean 'edit the source code'), you probably won't be able to make it do file locking either, so that's probably out. In which case you'll likely need to know something about the file format to know when the writer is done. For instance, if the writer always writes "DONE" as the last four characters in the file, you could open the file, seek to the end, and read the last four characters.
Yes it will.
I prefer the "file naming convention" and renaming solution described by Donal.