open('output1.txt', 'w').write("Hello guys") versus openvar.write("Hello guys") - python

When I do
open('output1.txt', 'w').write("Hello guys")
A file called output1.txt is immediately created and contains the string "Hello guys".
But when I do
openvar = open('output2.txt', 'w')
openvar.write("Hello guys")
Then only the file output2.txt is created. The text "Hello guys" only appears in output2.txt once I call openvar.close().
Why is this behaviour different only because of an extra variable assignment?

Python detects that the file object is not referenced anymore in your first case so the garbage collector will collect it and call its destructor which closes the file.
In the second case the file object still exists so it's not closed automatically.
You should always close your files when you are done. The with statement makes this pretty easy:
with open('output.txt', 'w') as f:
    f.write('Hello')
As soon as the block is left, the file is closed again - even if the code inside the block raises an exception.
If you need to keep the file open for some reason (e.g. because you are going to write more data), you can .flush() it to force the system to empty the write buffer and actually write the data to the file.
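As a rough illustration of the flush() point, the sketch below writes to a throwaway file, flushes, and checks what a second reader sees before the writer is closed. The helper name write_then_peek and the temporary path are made up for this example.

```python
import os
import tempfile

def write_then_peek(text):
    """Return what a second reader sees after flush() and after close()."""
    # Hypothetical throwaway location, not a path from the question.
    path = os.path.join(tempfile.mkdtemp(), "output2.txt")
    writer = open(path, "w", encoding="utf-8")
    try:
        writer.write(text)   # may still be sitting in the write buffer
        writer.flush()       # push the buffered data out to the file
        with open(path, encoding="utf-8") as reader:
            seen_after_flush = reader.read()
    finally:
        writer.close()       # close() flushes as well, then releases the handle
    with open(path, encoding="utf-8") as reader:
        seen_after_close = reader.read()
    return seen_after_flush, seen_after_close

after_flush, after_close = write_then_peek("Hello guys")
print(after_flush == after_close == "Hello guys")
```

The file stays open between flush() and close(), so more data could be written in between.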

In the first case the garbage collector closes the file for you, because there are no references left to the file object. In the second case you have created a reference to the file object. You have to close it manually, or it will be closed by the garbage collector once that reference is destroyed.
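One way to spot code that leans on the garbage collector like this is to turn on ResourceWarning, which CPython emits when a file object is destroyed without having been closed. The sketch below is only an illustration: the helper name and the temporary path are invented, and whether (and when) a warning shows up depends on the Python implementation.

```python
import gc
import os
import tempfile
import warnings

def warnings_from_unclosed_write(text):
    """Write without closing and report which warning categories were raised."""
    # Hypothetical throwaway location, not a path from the question.
    path = os.path.join(tempfile.mkdtemp(), "output1.txt")
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")   # ResourceWarning is hidden by default
        open(path, "w", encoding="utf-8").write(text)  # never closed explicitly
        gc.collect()                      # nudge implementations without refcounting
    return [w.category.__name__ for w in caught]

print(warnings_from_unclosed_write("Hello guys"))
```

Running a script with `python -W default` surfaces the same warnings without changing the code.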

Related

Can relying on python3 to automatically close the files result in an unexpected behaviour?

Can something wrong happen with the following implementation?
def ReadFromFile(file_name):
    return [line for line in open(file_name)]
def AppendToFile(new_line, file_name):
    open(file_name, 'a').write(new_line)
I am not explicitly calling close() method after reading / writing to the file. My understanding is that semantically the program has to behave as if the file is always closed at the end of each function.
Can the following use of these functions give unexpected results, e.g.
original_lines = ReadFromFile("file.txt")
for line in original_lines:
    AppendToFile(line, "file.txt")
modified_lines = ReadFromFile("file.txt")
I would expect e.g. len(modified_lines) == len(original_lines) * 2. Can that ever not be the case?
When we write to a file using any of the write functions, Python holds the data in a buffer and pushes it to the actual file on the storage device when the buffer fills up, when flush() or close() is called, or when the program exits normally.
Also, if we rebind the same variable to another file, the first file object is no longer referenced and CPython closes it. For example:
file_object1 = open(file1, 'r')
file_object1 = open(file2, 'r')
In this scenario file1 is closed automatically as well.
So if the program terminates abnormally in between, the buffered data is never stored in the file. I would therefore suggest two options:
use with because as soon as you get out of the block or encounter any exception it closes the file,
with open(filename, file_mode) as file_object:
    ...  # do the file manipulations here
or you can use the flush() function if you want to force python to write contents of buffer onto storage without closing the file.
file_object.flush()
For Reference: https://lerner.co.il/2015/01/18/dont-use-python-close-files-answer-depends/
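For comparison, here is a minimal sketch of the two functions from the question rewritten with explicit with blocks, so the doubling of the line count no longer depends on when the interpreter decides to close the files. The snake_case names, the temporary path and the sample contents are assumptions made for this example.

```python
import os
import tempfile

def read_from_file(file_name):
    # The file is closed as soon as the with block ends.
    with open(file_name, encoding="utf-8") as f:
        return f.readlines()

def append_to_file(new_line, file_name):
    # Buffered data is flushed and the file closed before the function returns.
    with open(file_name, "a", encoding="utf-8") as f:
        f.write(new_line)

# Hypothetical sample file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first\nsecond\n")

original_lines = read_from_file(path)
for line in original_lines:
    append_to_file(line, path)
modified_lines = read_from_file(path)
print(len(modified_lines) == len(original_lines) * 2)
```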

Is there a more concise way to read csv files in Python?

with open(file, 'rb') as readerfile:
    reader = csv.reader(readerfile)
In the above syntax, can I perform the first and second line together? It seems unnecessary to use 2 variables ('readerfile' and 'reader' above) if I only need to use the latter.
Is the former variable ('readerfile') ever used?
Can I use the same variable name for both, or is that bad form?
You can do:
reader = csv.reader(open(file, 'rb'))
but that would mean you are not closing your file explicitly.
with open(file, 'rb') as readerfile:
The first line opens the file and stores the file object in readerfile. The with statement ensures that the file is closed when you exit the block by any means, including exceptions.
reader = csv.reader(readerfile)
The second line creates a CSV reader object using the file object. It needs the file object (otherwise where would it read the data from?). Of course you could conceivably store it in the same variable
readerfile = csv.reader(readerfile)
if you wanted to (and don't plan on using the file object again), but this will likely lead to confusion for readers of your code.
Note that you haven't read anything yet! You still need to iterate over the reader object in order to get the data that you're interested in, and if you close the file before that happens then the reader object won't work. The file object is used behind the scenes by the reader object, even if you "hide" it by overwriting the readerfile variable.
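The laziness of the reader can be seen in a small experiment. This is a sketch written for Python 3, where csv files are opened in text mode with newline='' rather than 'rb' as in the question; the file name and the sample rows are invented.

```python
import csv
import os
import tempfile

# Hypothetical sample file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([["a", "1"], ["b", "2"]])

# Reader created inside the block but consumed after it:
# by then the underlying file is already closed.
with open(path, newline="", encoding="utf-8") as readerfile:
    lazy_reader = csv.reader(readerfile)
try:
    rows_too_late = list(lazy_reader)
except ValueError:             # I/O operation on closed file
    rows_too_late = None

# Consuming the reader while the file is still open works as expected.
with open(path, newline="", encoding="utf-8") as readerfile:
    rows = list(csv.reader(readerfile))

print(rows_too_late, rows)
```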
Lastly, if you really want to do everything on one line, you could conceivably define a function that abstracts the with statement:
def with1(context, func):
    with context as x:
        return func(x)
Now you can write this as one line:
data = with1(open(file, 'rb'), lambda readerfile: list(csv.reader(readerfile)))
It's by no means clearer, however.
This is not recommended at all
Why is it important to use one line?
Most Python programmers know the benefits of the with statement well. Keep in mind that readers can be lazy (that is, they read line by line) in some cases. You want to handle the file with the correct statement, ensuring it is closed correctly even if errors arise.
Nevertheless, you can use a one liner for this, as stated in other answers:
reader = csv.reader(open(file, 'rb'))
So basically you want a one-liner?
reader = csv.reader(open(file, 'rb'))
As said before, the problem is that with open() takes care of the following steps in one go:
Open the file
Do what you want with the file (inside your open block)
Close the file (that is implicit and you don't have to specify it)
If you don't use with open but call open directly, your file stays open until the object is garbage collected, and that could lead to unpredictable behaviour in some cases.
Plus, your original code (two lines) is much more readable than a one-liner.
If you put them together, then the file won't be closed automatically -- but that often doesn't really matter, since it will be closed automatically when the script terminates.
It's not common to need to reference the raw file once a csv.reader instance has been created from it (except possibly to explicitly close it if you're not using a with statement).
If you use the same variable name for both, it will probably work because the csv.reader instance will still hold a reference to the file object, so it won't be garbage collected until the program ends. It's not a common idiom, however.
Since csv files are often processed sequentially, the following can be a fairly concise way to do it, since the csv.reader instance frequently doesn't really need to be given a variable name, and it will close the file properly even if an exception occurs:
with open(file, 'rb') as readerfile:
    for row in csv.reader(readerfile):
        pass  # process the data...

opening & closing file without file object in python

Opening & closing file using file object:
fp=open("ram.txt","w")
fp.close()
If we want to open and close a file without using a file object, i.e.:
open("ram.txt","w")
Do we need to write close("poem.txt"), or is writing close() fine?
Neither of them gives any error...
If we only write close(), how would it know which file we are referring to?
For every object in memory, Python keeps a reference count. As soon as there are no more references to an object, it will be garbage collected.
The open() function returns a file object.
f = open("myfile.txt", "w")
And in the line above, you keep a reference to the object around in the variable f, and therefore the file object keeps existing. If you do
del f
Then the file object has no references anymore, and will be cleaned up. It'll be closed in the process, but that can take a little while which is why it's better to use the with construct.
However, if you just do:
open("myfile.txt")
Then the file object is created and immediately discarded again, because there are no references to it. It's gone, and closed. You can't close it anymore, because you can't say what exactly you want to close.
open("myfile.txt", "r").readlines()
To evaluate this whole expression, first open is called, which returns a file object, and then the method readlines is called on that. Then the result of that is returned. As there are now no references to the file object, it is immediately discarded again.
I would use with open(...), if I understand the question correctly.
This answer might help you: What is the python keyword "with" used for?
In answer to your actual question... a file object (what you get back when you call open) has the reference to the file in it. So when you do something like:
fp = open(myfile, 'w')
fp.write(...)
fp.close()
Everything in the above, including both write and close, knows it refers to myfile because that's the file fp is associated with. I'm not sure what fp.close(myfile) actually does, but close certainly doesn't need the filename once the file is open.
Better constructions like
with open(myfile,'w') as fp:
    fp.write(...)
don't change this; in this case, fp is also a context manager, but still contains the pointer to myfile; there's no need to remind it.
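A small experiment makes this concrete: the file object carries its own name and open/closed state, and in Python 3 its close() method takes no arguments, so passing a filename fails with a TypeError. The temporary path is made up for this sketch.

```python
import os
import tempfile

# Hypothetical throwaway location standing in for "ram.txt".
path = os.path.join(tempfile.mkdtemp(), "ram.txt")
fp = open(path, "w", encoding="utf-8")
name_known_to_object = fp.name    # the path that was passed to open()
open_before = not fp.closed

try:
    fp.close(path)                # close() accepts no arguments
    extra_argument_rejected = False
except TypeError:
    extra_argument_rejected = True

fp.close()                        # the object already knows which file to close
print(name_known_to_object == path, open_before, fp.closed, extra_argument_rejected)
```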

Does reading an entire file leave the file handle open?

If you read an entire file with content = open('Path/to/file', 'r').read(), is the file handle left open until the script exits? Is there a more concise method to read a whole file?
The answer to that question depends somewhat on the particular Python implementation.
To understand what this is all about, pay particular attention to the actual file object. In your code, that object is mentioned only once, in an expression, and becomes inaccessible immediately after the read() call returns.
This means that the file object is garbage. The only remaining question is "When will the garbage collector collect the file object?".
In CPython, which uses reference counting, this kind of garbage is noticed immediately, and so it will be collected immediately. This is not generally true of other Python implementations.
A better solution, to make sure that the file is closed, is this pattern:
with open('Path/to/file', 'r') as content_file:
    content = content_file.read()
which will always close the file immediately after the block ends; even if an exception occurs.
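The claim about exceptions can be checked with the closed attribute of the file object. In this sketch the temporary path, the sample text and the simulated RuntimeError are all invented for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("some text")

try:
    with open(path, "r", encoding="utf-8") as content_file:
        content = content_file.read()
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass  # the with statement has already closed the file at this point

print(content_file.closed)
```

The name content_file is still bound after the block, which is what makes the check possible; only the underlying file has been closed.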
Edit: To put a finer point on it:
Other than file.__exit__(), which is "automatically" called in a with context manager setting, the only other way that file.close() is automatically called (that is, other than explicitly calling it yourself,) is via file.__del__(). This leads us to the question of when does __del__() get called?
A correctly-written program cannot assume that finalizers will ever run at any point prior to program termination.
-- https://devblogs.microsoft.com/oldnewthing/20100809-00/?p=13203
In particular:
Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether — it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
[...]
CPython currently uses a reference-counting scheme with (optional) delayed detection of cyclically linked garbage, which collects most objects as soon as they become unreachable, but is not guaranteed to collect garbage containing circular references.
-- https://docs.python.org/3.5/reference/datamodel.html#objects-values-and-types
(Emphasis mine)
but as it suggests, other implementations may have other behavior. As an example, PyPy has 6 different garbage collection implementations!
You can use pathlib.
For Python 3.5 and above:
from pathlib import Path
contents = Path(file_path).read_text()
For older versions of Python use pathlib2:
$ pip install pathlib2
Then:
from pathlib2 import Path
contents = Path(file_path).read_text()
This is the actual read_text implementation:
def read_text(self, encoding=None, errors=None):
    """
    Open the file in text mode, read it, and close the file.
    """
    with self.open(mode='r', encoding=encoding, errors=errors) as f:
        return f.read()
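A short round trip shows read_text in use together with its counterpart write_text, which likewise opens and closes the file internally. The file name and contents below are made up, and passing encoding explicitly is a choice made for this sketch rather than a requirement.

```python
import tempfile
from pathlib import Path

# Hypothetical throwaway file for the demonstration.
file_path = Path(tempfile.mkdtemp()) / "notes.txt"
file_path.write_text("line one\nline two\n", encoding="utf-8")

# No file object is ever exposed, so there is nothing left to close.
contents = file_path.read_text(encoding="utf-8")
print(contents == "line one\nline two\n")
```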
Well, if you have to read file line by line to work with each line, you can use
with open('Path/to/file', 'r') as f:
    s = f.readline()
    while s:
        # do whatever you want to
        s = f.readline()
Or, even better:
with open('Path/to/file') as f:
    for line in f:
        pass  # do whatever you want to
Instead of retrieving the file content as a single string, it can be handy to store the content as a list of all the lines the file comprises:
with open('Path/to/file', 'r') as content_file:
    content_list = content_file.read().strip().split("\n")
As can be seen, one only needs to chain the methods .strip().split("\n") onto the read() call from the main answer in this thread.
Here, .strip() removes whitespace and newline characters at both ends of the entire file string, and .split("\n") produces the actual list by splitting the entire file string at every newline character \n.
Moreover, this way the entire file content can be stored in a variable, which might be desired in some cases, instead of looping over the file line by line as pointed out in a previous answer.
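One caveat with .strip() is that it also drops leading whitespace on the first line and any blank lines at the end. The built-in str.splitlines() is an alternative that leaves the content untouched. The sample string below is invented to show the difference; with a real file the same calls would be applied to content_file.read().

```python
# Hypothetical file content: indented first line, blank line at the end.
text = "  indented first line\nsecond line\n\n"

stripped_split = text.strip().split("\n")   # trims both ends of the whole string first
split_lines = text.splitlines()             # splits on line boundaries only

print(stripped_split)
print(split_lines)
```

Which of the two is preferable depends on whether leading indentation and trailing blank lines carry meaning in the file.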

Closing a file in python opened with a shortcut

I am just beginning with python with lpthw and had a specific question for closing a file.
I can open a file with:
input = open(from_file)
indata = input.read()
#Do something
input.close()
However, if I try to simplify the code into a single line:
indata = open(from_file).read()
How do I close the file I opened, or is it already automatically closed?
Thanks in advance for the help!
You simply have to use more than one line; however, a more pythonic way to do it would be:
with open(path_to_file, 'r') as f:
contents = f.read()
Note that with what you were doing before, you could miss closing the file if an exception was thrown. The 'with' statement here will cause it to be closed even if an exception is propagated out of the 'with' block.
Files are automatically closed when the file object is no longer referenced. This is taken care of by Python's garbage collection.
In this case, the call to open() creates a file object, whose read() method is then run. After the method has executed, no reference to the object exists and the file is closed (at the latest by the end of script execution).
Although this works, it is not good practice. It is always better to explicitly close a file, or (even better) to follow the with suggestion of the other answer.
