I want to open a file that may be gzipped or not. To open the file, I use either
with open(myfile, 'r') as f:
    some_func(f)  # arbitrary function
or
import gzip
with gzip.open(myfile, 'r') as f:
    some_func(f)
I want to check if myfile has a gz extension or not, and then from there decide which with statement to use. Here's what I have:
# myfile_gzipped is a Boolean variable that tells me whether it's gzipped or not
if myfile_gzipped:
    with gzip.open(myfile, 'rb') as f:
        some_func(f)
else:
    with open(myfile, 'r') as f:
        some_func(f)
How should I go about it, without having to repeat some_func(f)?
if myfile_gzipped:
    f = gzip.open(myfile, 'rb')
else:
    f = open(myfile, 'r')
with f:
    some_func(f)
Both open and gzip.open return a context manager. The with statement simply calls the __enter__ and __exit__ methods of whatever object it is given, so there is nothing special about calling those functions inside the with statement itself.
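Because the file object is just a value, the choice of opener can also be moved into a small helper. The following is a minimal sketch rather than anything from the answers here: open_maybe_gzipped is a hypothetical name, it assumes Python 3, and it assumes a .gz suffix reliably indicates a gzipped file. Text mode ('rt') is used for both branches so that the caller sees str lines either way.

```python
import gzip


def open_maybe_gzipped(path, mode="rt"):
    """Open path with gzip.open if it ends in .gz, otherwise with open.

    Hypothetical helper; relying on the file extension is an assumption
    about how the files are named.
    """
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, mode)
```

It would be used as with open_maybe_gzipped(myfile) as f: followed by some_func(f), so some_func(f) appears only once.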
I find that ExitStacks can be helpful in these cases:
from contextlib import ExitStack

with ExitStack() as stack:
    if myfile_gzipped:
        f = stack.enter_context(gzip.open(myfile, 'rb'))
    else:
        f = stack.enter_context(open(myfile, 'r'))
    some_func(f)
You can use the ternary operator to evaluate two different expressions based on a condition:
with gzip.open(myfile, 'rb') if myfile_gzipped else open(myfile, 'r') as f:
    some_func(f)
You don't have to put open and with on the same line.
You can open the file as one step, and then do with f later.
if myfile_gzipped:
    f = gzip.open(myfile, 'rb')
else:
    f = open(myfile, 'r')
with f:
    some_func(f)
Related
I am trying to process several files into a single, merged csv file using python. So far, I have
files = ["file1.txt", "file2.txt", "file3.txt"]

def doSomething(oldfile):
    content = []
    with open oldfile as file:
        content = file.read().splitlines()
    file.close()
    return content.reverse()

with open("newfile.txt", "w") as file:
    w = csv.writer(file, dialect = "excel-tab")
    for i in range(0, len(files)):
        w. writerows(doSomething(files[i])
    file.close()
The new file is being created, but there is nothing in it. I am curious about what is going on.
Thanks!
For starters, list.reverse() reverses the list in place and doesn't return anything so you're essentially returning None from your doSomething() function. You'll actually want to split that into two lines:
content.reverse()
return content
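The difference between the in-place method and the built-in function is easy to check in isolation. This is a small illustrative sketch with made-up data, not code from the question:

```python
content = ["a", "b", "c"]

# list.reverse() mutates the list and returns None.
result = content.reverse()
print(result)     # None
print(content)    # ['c', 'b', 'a']

# reversed() leaves the list untouched and returns an iterator.
again = list(reversed(content))
print(again)      # ['a', 'b', 'c']
```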
If you want to streamline your code, here's a suggestion:
import csv

def doSomething(oldfile):
    with open(oldfile, "r") as f:
        return reversed(f.read().splitlines())

files = ["file1.txt", "file2.txt", "file3.txt"]
# Python 3: open csv output in text mode with newline="" (Python 2 used "wb").
with open("newfile.txt", "w", newline="") as file:
    w = csv.writer(file, dialect = "excel-tab")
    for current_file in files:
        w.writerows(doSomething(current_file))
I think your program crashes for several reasons:
open(..) is a function, so you cannot write:
with open oldfile as file:
A with statement for files ensures the file is closed, so file.close() is not actually necessary.
.reverse() works in place and returns None; use reversed(..) to get the reversed sequence instead.
You can fix it with:
import csv

files = ["file1.txt", "file2.txt", "file3.txt"]

def doSomething(oldfile):
    with open(oldfile, 'r') as file:
        # A file object is not reversible, so read the lines into a list first.
        return list(reversed(file.read().splitlines()))

with open("newfile.txt", "w") as file:
    w = csv.writer(file, dialect = "excel-tab")
    for oldfile in files:
        w.writerows(doSomething(oldfile))
I also looped over the list directly instead of over the indices, since that is more "pythonic". Note that a file is iterable over its lines but is not a sequence, so reversed(file) raises a TypeError; the lines have to be read into a list before they can be reversed.
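One detail worth checking in the snippets in this thread: csv.writer treats each row as a sequence of fields, so a plain string passed as a row is written one character per field. The sketch below uses an in-memory buffer and made-up lines rather than the files from the question. Wrapping each line in a one-element list is only one possible choice and assumes each line should become a single field.

```python
import csv
import io

lines = ["abc", "de"]

# Strings passed as rows: every character becomes its own field.
buf = io.StringIO()
csv.writer(buf, dialect="excel-tab").writerows(lines)
per_char = buf.getvalue()    # 'a\tb\tc\r\nd\te\r\n'

# Each line wrapped in a list: one field per line.
buf = io.StringIO()
csv.writer(buf, dialect="excel-tab").writerows([line] for line in lines)
per_line = buf.getvalue()    # 'abc\r\nde\r\n'
```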
Currently I'm using this:
f = open(filename, 'r+')
text = f.read()
text = re.sub('foobar', 'bar', text)
f.seek(0)
f.write(text)
f.close()
But the problem is that the old file is larger than the new file. So I end up with a new file that has a part of the old file on the end of it.
If you don't want to close and reopen the file, to avoid race conditions, you could truncate it:
f = open(filename, 'r+')
text = f.read()
text = re.sub('foobar', 'bar', text)
f.seek(0)
f.write(text)
f.truncate()
f.close()
This is likely cleaner and safer with open used as a context manager, which closes the file handle even if an error occurs:
with open(filename, 'r+') as f:
    text = f.read()
    text = re.sub('foobar', 'bar', text)
    f.seek(0)
    f.write(text)
    f.truncate()
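To see the effect of truncate() end to end, here is a self-contained sketch that runs the same steps against a throwaway temporary file. The function name replace_in_file and the sample content are illustrative assumptions, not part of the original answer.

```python
import os
import re
import tempfile


def replace_in_file(path, pattern, repl):
    """Rewrite path in place, replacing regex pattern with repl."""
    with open(path, "r+") as f:
        text = re.sub(pattern, repl, f.read())
        f.seek(0)        # go back to the start before writing
        f.write(text)
        f.truncate()     # cut off what is left of the longer original


# Demonstration on a temporary file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("foobar foobar\n")

replace_in_file(path, "foobar", "bar")

with open(path) as f:
    result = f.read()    # 'bar bar\n'
os.remove(path)
```

Without the truncate() call, the tail of the original, longer content would still follow the new text.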
The fileinput module has an inplace mode for writing changes to the file you are processing without using temporary files etc. The module nicely encapsulates the common operation of looping over the lines in a list of files, via an object which transparently keeps track of the file name, line number etc if you should want to inspect them inside the loop.
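from fileinput import FileInput

# With inplace=True, anything printed inside the loop is written back to the file.
for line in FileInput("file", inplace=True):
    line = line.replace("foobar", "bar")
    print(line, end="")  # line already ends with a newline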
Probably it would be easier and neater to close the file after text = re.sub('foobar', 'bar', text), re-open it for writing (thus clearing old contents), and write your updated text to it.
I find it easier to remember to just read it and then write it.
For example:
with open('file') as f:
    data = f.read()
with open('file', 'w') as f:
    f.write('hello')
To anyone who wants to read and overwrite by line, refer to this answer.
https://stackoverflow.com/a/71285415/11442980
filename = input("Enter filename: ")
with open(filename, 'r+') as file:
    lines = file.readlines()
    file.seek(0)
    for line in lines:
        value = int(line)
        # Write the newline back, otherwise all numbers run together on one line.
        file.write(str(value + 1) + '\n')
    file.truncate()
Honestly you can take a look at this class that I built which does basic file operations. The write method overwrites and append keeps old data.
class IO:
    def read(self, filename):
        toRead = open(filename, "rb")
        out = toRead.read()
        toRead.close()
        return out

    def write(self, filename, data):
        toWrite = open(filename, "wb")
        out = toWrite.write(data)
        toWrite.close()

    def append(self, filename, data):
        append = self.read(filename)
        self.write(filename, append+data)
Try writing it in a new file..
f = open(filename, 'r+')
f2 = open(filename2, 'a+')
text = f.read()
text = re.sub('foobar', 'bar', text)
f.close()
f2.write(text)
f2.close()
The purpose of this program is to backspace three times in a data file:
ofile = open(myfile, 'r')
file = open(myfile, 'r')
with open(myfile, 'rb+') as filehandle:
    filehandle.seek(-3, os.SEEK_END)
    filehandle.truncate()
I then attempted to add additional text after this by switching to the "write" function:
ofile = open(myfile, 'r')
file = open(myfile, 'r')
with open(myfile, 'rb+') as filehandle:
    filehandle.seek(-3, os.SEEK_END)
    filehandle.truncate()
ofile = open(myfile, 'w')
ofile.write('*')
This, however, overrides the entire data set and writes only "*" on a blank document. How do I add to the file without removing the rest of the content?
You need to use the append flag, instead of write. So, ofile = open(myfile, 'a')
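Putting the two steps together, a minimal sketch might look like the following. The helper name drop_tail_and_append and the sample data are assumptions made for the demonstration, and a temporary file stands in for myfile.

```python
import os
import tempfile


def drop_tail_and_append(path, n, text):
    """Remove the last n bytes of path, then append text to it."""
    with open(path, "rb+") as fh:
        fh.seek(-n, os.SEEK_END)
        fh.truncate()
    # Mode 'a' keeps the existing content and writes at the end,
    # whereas 'w' would empty the file first.
    with open(path, "a") as fh:
        fh.write(text)


# Demonstration on a temporary file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("1,2,3,,,")

drop_tail_and_append(path, 3, "*")

with open(path) as f:
    result = f.read()    # '1,2,3*'
os.remove(path)
```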
E.g., I wrote some data to the file and then try to read them:
mocked_open = mock_open()
with patch('__builtin__.open', mocked_open, create=True):
    with open('file', 'w') as f:
        f.write('text')
    with open('file', 'r') as f:
        res = f.read()
But after this, res is empty. How can I get the data that was written to this file?
I have to define a function: save_file(filename, new_list) which takes a file name and a new list and writes that list to the file in the correct format.
So, for example,
save_file('file.txt', load_file('file.txt'))
(load_file is a predefined function which opens and reads the file)
should overwrite the file with exactly the same content.
I have no clue how to go about this, any ideas?
The load_file function seems to work but can't seem to get the save_file function working.
This is what I have so far:
def load_file(filename):
    f = open(filename, 'Ur')
    for line in f:
        print line
    f.close()

def save_file(filename, new_list):
    with open(new_list, 'Ur') as f1:
        with open(filename, 'w') as f2:
            f2.write(f1.read())
Since new_list is clearly a list of lines, not a filename, you don't need all the stuff with opening and reading it. And you also can't do saving in a single write.
But you can do it almost that simply.
You didn't specify whether the lines in new_list still have their newlines. Let's first assume they do. So, all you have to do is:
def save_file(filename, new_list):
    with open(filename, 'w') as f:
        f.write(''.join(new_list))
… or …:
def save_file(filename, new_list):
    with open(filename, 'w') as f:
        f.writelines(new_list)
But your teacher may be expecting something like this:
def save_file(filename, new_list):
    with open(filename, 'w') as f:
        for line in new_list:
            f.write(line)
What if the newlines were stripped off, so we have to add them back? Then things are a bit more complicated the first two ways, but still very easy the third way:
def save_file(filename, new_list):
    with open(filename, 'w') as f:
        f.write('\n'.join(new_list) + '\n')

def save_file(filename, new_list):
    with open(filename, 'w') as f:
        f.writelines(line + '\n' for line in new_list)

def save_file(filename, new_list):
    with open(filename, 'w') as f:
        for line in new_list:
            f.write(line + '\n')
Meanwhile, you have not gotten load_file to work. It's supposed to return a list of lines, but it doesn't return anything (or, rather, it returns None). printing something just prints it out for the user to see, it doesn't store anything for later use.
You want something like this:
def load_file(filename):
    lines = []
    with open(filename, 'Ur') as f:
        for line in f:
            lines.append(line)
    return lines
However, there's a much simpler way to write this. If you can do for line in f:, then f is some kind of iterable. It's almost the same thing as a list—and if you want to make it into an actual list, that's trivial:
def load_file(filename):
    with open(filename, 'Ur') as f:
        return list(f)
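A quick round-trip check ties the two functions together. This is a sketch under a few assumptions: it targets Python 3, it uses mode 'r' instead of 'Ur' (the 'U' flag was removed in Python 3.11, and universal newlines are the default there anyway), it assumes the lines keep their trailing newlines, and it uses a temporary file with made-up content in place of file.txt.

```python
import os
import tempfile


def load_file(filename):
    """Return the lines of filename as a list, newlines included."""
    with open(filename, "r") as f:
        return list(f)


def save_file(filename, new_list):
    """Write the lines in new_list (newlines included) to filename."""
    with open(filename, "w") as f:
        f.writelines(new_list)


# Demonstration on a temporary file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first\nsecond\n")

before = load_file(path)
save_file(path, load_file(path))   # should leave the content unchanged
after = load_file(path)
os.remove(path)
```

The inner load_file call finishes and closes the file before save_file opens it for writing, so nothing is lost.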
def save_file(filename, new_list):
    with open(new_list, 'r') as a:
        with open(filename, 'w') as b:
            b.write(a.read())
Just a small adjustment to SaltChicken's answer.
Use print >> to make it simpler (Python 2 syntax):
>>> with open('/src/file', 'r') as f1, open('/dst/file', 'w') as f2:
...     print >> f2, f1.read()
Inspired by What does this code mean: "print >> sys.stderr".
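The print >> form only exists in Python 2. Under Python 3 the same idea is expressed with the file= keyword of print. The sketch below assumes Python 3; copy_with_print is a hypothetical name. Like the snippet above it copies one file to another, and because print normally appends a newline, end="" is passed so the copy matches the source exactly.

```python
def copy_with_print(src, dst):
    """Copy the text of src to dst using print(..., file=...)."""
    with open(src, "r") as f1, open(dst, "w") as f2:
        # end="" stops print from adding an extra trailing newline.
        print(f1.read(), end="", file=f2)
```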