When I try to run this script, I get this error:
ValueError: I/O operation on closed file.
I checked some similar questions and the docs, but with no success. And while the error is clear enough, I haven't been able to figure it out. Apparently I am missing something.
# -*- coding: utf-8 -*-
import os
import re

dirpath = 'path\\to\\dir'
filenames = os.listdir(dirpath)
nb = 0
with open('path\\to\\dir\\file.txt', 'w') as outfile:
    for fname in filenames:
        nb = nb + 1
        print fname
        print nb
        currentfile = os.path.join(dirpath, fname)

with open(currentfile) as infile:
    for line in infile:
        outfile.write(line)
Edit: Since I removed the `with` from `open`, the error message changed. For

`open(C:\\path\\to\\\\file.txt, 'w') as outfile:`

it now reports `SyntaxError: invalid syntax`, with a pointer underneath `as`.
Edit: There was much confusion with this question. In the end, I restored the `with` and fixed the indentation a bit, and it is working just fine!
It looks like your outfile is at the same level as infile - which means at the end of the first with block, outfile is closed, so it can't be written to. Indent your infile block to be inside your outfile block:
with open('output', 'w') as outfile:
    for a in b:
        with open('input') as infile:
            ...
        ...
You can also simplify your code here by using the fileinput module, which makes the code clearer and less prone to wrong results:
import fileinput
from contextlib import closing
import os

with closing(fileinput.input(os.listdir(dirpath))) as fin, open('output', 'w') as fout:
    fout.writelines(fin)
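One thing to watch with this approach: os.listdir() returns bare filenames, not full paths, so unless dirpath happens to be the current directory, the names should be joined with dirpath first. Below is a minimal, self-contained Python 3 sketch (the directory and file contents are invented for illustration; note that in Python 3, fileinput.input() is itself a context manager, so contextlib.closing is not needed):

```python
import fileinput
import os
import tempfile

# Build a small directory of input files to concatenate (demo data).
dirpath = tempfile.mkdtemp()
for name, text in [("a.txt", "first\n"), ("b.txt", "second\n")]:
    with open(os.path.join(dirpath, name), "w") as f:
        f.write(text)

# os.listdir() yields bare names, so join them with dirpath
# before handing them to fileinput.input().
paths = [os.path.join(dirpath, name) for name in sorted(os.listdir(dirpath))]

outpath = os.path.join(dirpath, "combined.txt")
with fileinput.input(paths) as fin, open(outpath, "w") as fout:
    fout.writelines(fin)

with open(outpath) as f:
    combined = f.read()
```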
You use the context manager with, which means the file will be closed when you exit the with block. So the outfile is apparently already closed when you write to it.
with open('path\\to\\dir\\file.txt', 'w') as outfile:
    for fname in filenames:
        nb = nb + 1
        print fname
        print nb
        currentfile = os.path.join(dirpath, fname)
        with open(currentfile) as infile:
            for line in infile:
                outfile.write(line)
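For reference, here is a self-contained Python 3 version of the same corrected structure (directory and file names are invented for the sketch), showing that outfile stays open for the whole loop:

```python
import os
import tempfile

# Demo setup: a directory with a couple of files to merge.
dirpath = tempfile.mkdtemp()
for name, text in [("one.txt", "alpha\n"), ("two.txt", "beta\n")]:
    with open(os.path.join(dirpath, name), "w") as f:
        f.write(text)

outpath = os.path.join(dirpath, "merged.txt")
filenames = sorted(os.listdir(dirpath))

nb = 0
with open(outpath, "w") as outfile:          # outfile stays open for the whole loop
    for fname in filenames:
        nb += 1
        currentfile = os.path.join(dirpath, fname)
        with open(currentfile) as infile:    # inner with is nested inside the outer one
            for line in infile:
                outfile.write(line)

with open(outpath) as f:
    merged = f.read()
```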
Related
I have the following code:
import re
#open the xml file for reading:
file = open('path/test.xml','r+')
#convert to string:
data = file.read()
file.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>",data))
file.close()
where I'd like to replace the old content that's in the file with the new content. However, when I execute my code, the file "test.xml" is appended to, i.e. I have the old content followed by the new "replaced" content. What can I do in order to delete the old stuff and only keep the new?
You need to seek to the beginning of the file before writing, and then use file.truncate() if you want to do an in-place replace:
import re

myfile = "path/test.xml"

with open(myfile, "r+") as f:
    data = f.read()
    f.seek(0)
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", data))
    f.truncate()
The other way is to read the file, then open it again with open(myfile, 'w'):

with open(myfile, "r") as f:
    data = f.read()

with open(myfile, "w") as f:
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", data))
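To make the first variant concrete, here is a runnable Python 3 sketch (the file path and tag contents are invented for the demo, and replace_inplace is a hypothetical helper wrapping the r+/seek/truncate pattern):

```python
import os
import re
import tempfile

def replace_inplace(path, pattern, repl):
    """r+ variant: read everything, seek back, write, then truncate."""
    with open(path, "r+") as f:
        data = f.read()
        f.seek(0)
        f.write(re.sub(pattern, repl, data))
        f.truncate()   # drop leftover bytes when the new text is shorter

# Demo file; the tag names mirror the question.
path = os.path.join(tempfile.mkdtemp(), "test.xml")
with open(path, "w") as f:
    f.write("<string>ABC</string>\n<string>x</string>\n")

replace_inplace(path, r"<string>(.*)</string>", r"<xyz>\1</xyz>")

with open(path) as f:
    result = f.read()
```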
Neither truncate nor open(..., 'w') will change the inode number of the file (I tested this twice, once with NFS on Ubuntu 12.04 and once with ext4).
By the way, this is not really specific to Python; the interpreter just calls the corresponding low-level API. The truncate() system call works the same way in the C programming language: see http://man7.org/linux/man-pages/man2/truncate.2.html
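That claim is easy to check from Python with os.stat(); here is a minimal sketch (the temporary file path is created just for the demo):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("some content")

inode_before = os.stat(path).st_ino

# Rewrite in place via r+ and truncate() ...
with open(path, "r+") as f:
    f.write("new")
    f.truncate()

inode_after_truncate = os.stat(path).st_ino

# ... and via reopening with 'w' (which truncates on open).
with open(path, "w") as f:
    f.write("newer")

inode_after_w = os.stat(path).st_ino
```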
file = 'path/test.xml'

with open(file, 'w') as filetowrite:
    filetowrite.write('new content')
Open the file in 'w' mode; you will be able to replace its current text and save the file with the new contents.
Using truncate(), the solution could be
import re

# open the xml file for reading:
with open('path/test.xml', 'r+') as f:
    # convert to string:
    data = f.read()
    f.seek(0)
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", data))
    f.truncate()
import os  # must import this library

if os.path.exists('TwitterDB.csv'):
    os.remove('TwitterDB.csv')  # this deletes the file
else:
    print("The file does not exist")  # add this to prevent errors
I had a similar problem, and instead of overwriting my existing file using the different 'modes', I just deleted the file before using it again, so that it would be as if I was appending to a new file on each run of my code.
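A self-contained sketch of that delete-before-reuse approach (the CSV name is borrowed from the snippet above; the temporary directory is just for the demo):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "TwitterDB.csv")

# Simulate output left over from a previous run.
with open(path, "w") as f:
    f.write("old data\n")

# Delete the previous run's output, if any, so the next open() starts fresh.
if os.path.exists(path):
    os.remove(path)

exists_after_remove = os.path.exists(path)

# A later append then behaves like writing to a brand-new file.
with open(path, "a") as f:
    f.write("fresh data\n")

with open(path) as f:
    content = f.read()
```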
See How to Replace String in File, which works in a simple way and is an answer that works with replace:
fin = open("data.txt", "rt")
fout = open("out.txt", "wt")

for line in fin:
    fout.write(line.replace('pyton', 'python'))

fin.close()
fout.close()
In my case, the following code did the trick:

import json

# Using w+ mode to create the file if it does not exist and overwrite the existing content.
with open("output.json", "w+") as outfile:
    json.dump(result_plot, outfile)
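A self-contained sketch of that pattern (result_plot here is stand-in data and the output path is a temporary one), showing that a second "w+" open fully replaces the earlier content:

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.json")

# First write.
with open(path, "w+") as outfile:
    json.dump({"old": True}, outfile)

# Second write: "w+" truncates on open, so the old content is fully replaced.
result_plot = {"points": [1, 2, 3]}   # stand-in for the answer's data
with open(path, "w+") as outfile:
    json.dump(result_plot, outfile)

with open(path) as f:
    loaded = json.load(f)
```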
Using the Python 3 pathlib library:

import re
import shutil
from pathlib import Path

shutil.copy2("/tmp/test.xml", "/tmp/test.xml.bak")  # create backup

filepath = Path("/tmp/test.xml")
content = filepath.read_text()
filepath.write_text(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", content))
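Here is a runnable sketch of that read_text()/write_text() round trip (the temporary path and tag contents are invented for the demo); pathlib opens and closes the file for you, so there is no seek/truncate dance:

```python
import re
import tempfile
from pathlib import Path

filepath = Path(tempfile.mkdtemp()) / "test.xml"
filepath.write_text("<string>ABC</string>\n<string>keep</string>\n")

# read_text()/write_text() each open and close the file internally,
# and write_text() truncates before writing.
content = filepath.read_text()
filepath.write_text(re.sub(r"<string>(.*)</string>", r"<xyz>\1</xyz>", content))

result = filepath.read_text()
```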
A similar method, using a different approach to backups (note that the contents must be read before the rename, since rename() moves the original file out of the way):

import re
from pathlib import Path

filepath = Path("/tmp/test.xml")
content = filepath.read_text()
filepath.rename(filepath.with_suffix('.bak'))  # different approach to backups
filepath.write_text(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", content))
I'm trying to write a create_python_script function that creates a new Python script in the current working directory, adds the line of comments to it declared by the comments variable, and returns the size of the new file. The output I get is 0, but it should be 31. I'm not sure what I'm doing wrong.
import os

def create_python_script(filename):
    comments = "# Start of a new Python program"
    with open("program.py", "w") as file:
        filesize = os.path.getsize("/home/program.py")
    return(filesize)

print(create_python_script("program.py"))
You forgot to actually write to the file, so it won't contain anything. Another important thing to keep in mind is that the file is closed automatically after the with statement. In other words, nothing is written out to the file until the with statement ends, so the file size is still zero when you check it inside the with block.
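The buffering behavior described above can be observed directly; here is a minimal sketch (the path is a temporary one created for the demo):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "program.py")
comments = "# Start of a new Python program"   # 31 characters

with open(path, "w") as f:
    f.write(comments)
    size_inside = os.path.getsize(path)   # still 0: the write sits in the buffer

size_outside = os.path.getsize(path)      # after the with block, data is flushed
```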
This should work:
import os

def create_python_script(filename):
    comments = "# Start of a new Python program"
    with open(filename, "w") as f:
        f.write(comments)
    filesize = os.path.getsize(filename)
    return(filesize)

print(create_python_script("program.py"))
Note that the input argument was unused previously and has now been changed.
def create_python_script(filename):
    comments = "# Start of a new Python program"
    with open(filename, 'w') as file:
        filesize = file.write(comments)
    return(filesize)

print(create_python_script("program.py"))
There is a rogue indent in the exercise:

import os

def create_python_script(filename):
    comments = "# Start of a new Python program"
    with open(filename, "a") as newprogram:
        newprogram.write(comments)
        filesize = os.path.getsize(filename)  # Error over here: still indented inside the with block
    return(filesize)

print(create_python_script("program.py"))

It should be:

import os

def create_python_script(filename):
    comments = "# Start of a new Python program"
    with open(filename, "a") as newprogram:
        newprogram.write(comments)
    filesize = os.path.getsize(filename)
    return(filesize)

print(create_python_script("program.py"))
I just finished this exercise myself.
I am trying to loop through all CSV files in a directory, do a find/replace, and save the results to the same file (same name). It seems like this should be easy, but I seem to be missing something here. Here is the code that I'm working with.
import glob

path = 'C:\\Users\\ryans\\OneDrive\\Desktop\\downloads\\Products\\*.csv'

for fname in glob.glob(path):
    print(str(fname))
    with open(str(fname), "w") as f:
        newText = f.read().replace('|', ',').replace(' ', '')
        f.write(newText)
I came across the link below, and tried the concepts listed there, but nothing has worked so far.
How to open a file for both reading and writing?
You need to open the file using 'r+' instead of 'w'. See below:
import glob

path = 'C:\\Users\\ryans\\OneDrive\\Desktop\\downloads\\Products\\*.csv'

for fname in glob.glob(path):
    print(str(fname))
    with open(str(fname), "r+") as f:
        newText = f.read().replace('|', ',').replace(' ', '')
        f.write(newText)
Here is the final (working) solution.
import glob
import fileinput

path = 'C:\\Users\\ryans\\OneDrive\\Desktop\\downloads\\Products\\*.csv'

for fname in glob.glob(path):
    #print(str(fname))
    with open(fname, 'r+') as f:
        text = f.read().replace(' ', '')
        f.seek(0)
        f.write(text)
        f.truncate()
Thanks for the tip, agaidis!!
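For anyone checking the pattern, here is a self-contained version of that working loop run against a throwaway CSV (directory and contents are invented for the demo; the pipe-to-comma replacement from the question is included alongside the space removal):

```python
import glob
import os
import tempfile

dirpath = tempfile.mkdtemp()
with open(os.path.join(dirpath, "products.csv"), "w") as f:
    f.write("a | b | c\n1 | 2 | 3\n")

for fname in glob.glob(os.path.join(dirpath, "*.csv")):
    with open(fname, "r+") as f:
        text = f.read().replace(" ", "").replace("|", ",")
        f.seek(0)
        f.write(text)
        f.truncate()   # drop leftover bytes: the new text is shorter than the old

with open(os.path.join(dirpath, "products.csv")) as f:
    result = f.read()
```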
I am currently reproducing the following Unix command:
cat command.info fort.13 > command.fort.13
in Python with the following:
with open('command.fort.13', 'w') as outFile:
    with open('fort.13', 'r') as fort13, open('command.info', 'r') as com:
        for line in com.read().split('\n'):
            if line.strip() != '':
                print >>outFile, line
        for line in fort13.read().split('\n'):
            if line.strip() != '':
                print >>outFile, line
which works, but there has to be a better way. Any suggestions?
Edit (2016):
This question has started getting attention again after four years. I wrote up some thoughts in a longer Jupyter Notebook here.
The crux of the issue is that my question was about the (unexpected by me) behavior of readlines. The question I was aiming toward could have been asked better, and that question would have been better answered with read().splitlines().
The easiest way might be simply to forget about the lines, and just read in the entire file, then write it to the output:
with open('command.fort.13', 'wb') as outFile:
    with open('command.info', 'rb') as com, open('fort.13', 'rb') as fort13:
        outFile.write(com.read())
        outFile.write(fort13.read())
As pointed out in a comment, this can cause high memory usage if either of the inputs is large (as it copies the entire file into memory first). If this might be an issue, the following will work just as well (by copying the input files in chunks):
import shutil

with open('command.fort.13', 'wb') as outFile:
    with open('command.info', 'rb') as com, open('fort.13', 'rb') as fort13:
        shutil.copyfileobj(com, outFile)
        shutil.copyfileobj(fort13, outFile)
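A runnable sketch of the chunked copy (input file names and contents are invented for the demo); copyfileobj streams in fixed-size chunks rather than loading whole files, so memory use stays bounded:

```python
import os
import shutil
import tempfile

dirpath = tempfile.mkdtemp()
paths = {}
for name, text in [("command.info", b"info line\n"), ("fort.13", b"fort line\n")]:
    paths[name] = os.path.join(dirpath, name)
    with open(paths[name], "wb") as f:
        f.write(text)

outpath = os.path.join(dirpath, "command.fort.13")
with open(outpath, "wb") as outFile:
    for name in ("command.info", "fort.13"):    # same order as the cat command
        with open(paths[name], "rb") as src:
            shutil.copyfileobj(src, outFile)    # copies in chunks, not all at once

with open(outpath, "rb") as f:
    result = f.read()
```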
def cat(outfilename, *infilenames):
    with open(outfilename, 'w') as outfile:
        for infilename in infilenames:
            with open(infilename) as infile:
                for line in infile:
                    if line.strip():
                        outfile.write(line)

cat('command.fort.13', 'command.info', 'fort.13')
#!/usr/bin/env python
import fileinput

for line in fileinput.input():
    print line,
Usage:
$ python cat.py command.info fort.13 > command.fort.13
Or, to allow arbitrarily large lines:
#!/usr/bin/env python
import sys
from shutil import copyfileobj as copy

for filename in sys.argv[1:] or ["-"]:
    if filename == "-":
        copy(sys.stdin, sys.stdout)
    else:
        with open(filename, 'rb') as file:
            copy(file, sys.stdout)
The usage is the same.
Or on Python 3.3 using os.sendfile():
#!/usr/bin/env python3.3
import os
import sys

output_fd = sys.stdout.buffer.fileno()

for filename in sys.argv[1:]:
    with open(filename, 'rb') as file:
        while os.sendfile(output_fd, file.fileno(), None, 1 << 30) != 0:
            pass
The above sendfile() call is written for Linux > 2.6.33. In principle, sendfile() can be more efficient than a combination of read/write used by other approaches.
Iterating over a file yields lines.
for line in infile:
    outfile.write(line)
You can simplify this in a few ways:
with open('command.fort.13', 'w') as outFile:
    with open('fort.13', 'r') as fort13, open('command.info', 'r') as com:
        for line in com:
            if line.strip():
                print >>outFile, line
        for line in fort13:
            if line.strip():
                print >>outFile, line
More importantly, the shutil module has the copyfileobj function (note that each file handle must be used inside its own with block):

import shutil

with open('command.fort.13', 'w') as outFile:
    with open('command.info', 'r') as com:
        shutil.copyfileobj(com, outFile)
    with open('fort.13', 'r') as fort13:
        shutil.copyfileobj(fort13, outFile)
This doesn't skip the blank lines, but cat doesn't do that either, so I'm not sure you really want to.
List comprehensions are awesome for things like this:
with open('command.fort.13', 'w') as output:
    for f in ['fort.13', 'command.info']:
        output.write(''.join([line for line in open(f).readlines() if line.strip()]))
I have a Python script which modifies a CSV file to add the filename as the last column:
import sys
import glob

for filename in glob.glob(sys.argv[1]):
    file = open(filename)
    data = [line.rstrip() + "," + filename for line in file]
    file.close()
    file = open(filename, "w")
    file.write("\n".join(data))
    file.close()
Unfortunately, it also adds the filename to the header (first) row of the file. I would like the string "ID" added to the header instead. Can anybody suggest how I could do this?
Have a look at the official csv module.
Here are a few minor notes on your current code:
It's a bad idea to use file as a variable name, since that shadows the built-in type.
You can close the file objects automatically by using the with syntax.
Don't you want to add an extra column in the header line, called something like Filename, rather than just omitting a column in the first row?
If your filenames have commas (or, less probably, newlines) in them, you'll need to make sure that the filename is quoted - just appending it won't do.
That last consideration would incline me to use the csv module instead, which will deal with the quoting and unquoting for you. For example, you could try something like the following code:
import glob
import csv
import sys

for filename in glob.glob(sys.argv[1]):
    data = []
    with open(filename) as finput:
        for i, row in enumerate(csv.reader(finput)):
            to_append = "Filename" if i == 0 else filename
            data.append(row + [to_append])
    with open(filename, 'wb') as foutput:
        writer = csv.writer(foutput)
        for row in data:
            writer.writerow(row)
That may quote the data slightly differently from your input file, so you might want to play with the quoting options for csv.reader and csv.writer described in the documentation for the csv module.
As a further point, you might have good reasons for taking a glob as a parameter rather than just the files on the command line, but it's a bit surprising - you'll have to call your script as ./whatever.py '*.csv' rather than just ./whatever.py *.csv. Instead, you could just do:
for filename in sys.argv[1:]:
... and let the shell expand your glob before the script knows anything about it.
One last thing - the current approach you're taking is slightly dangerous, in that if anything fails when writing back to the same filename, you'll lose data. The standard way of avoiding this is to instead write to a temporary file, and, if that was successful, rename the temporary file over the original. So, you might rewrite the whole thing as:
import csv
import sys
import tempfile
import shutil

for filename in sys.argv[1:]:
    tmp = tempfile.NamedTemporaryFile(delete=False)
    with open(filename) as finput:
        with open(tmp.name, 'wb') as ftmp:
            writer = csv.writer(ftmp)
            for i, row in enumerate(csv.reader(finput)):
                to_append = "Filename" if i == 0 else filename
                writer.writerow(row + [to_append])
    shutil.move(tmp.name, filename)
You can try:
data = [file.readline().rstrip() + ",id"]
data += [line.rstrip() + "," + filename for line in file]
You can try changing your code, but using the csv module is recommended. This should give you the result you want:

import sys
import glob
import csv

filename = glob.glob(sys.argv[1])[0]
yourfile = csv.reader(open(filename, 'r'))

csv_output = []
for row in yourfile:
    if len(csv_output) != 0:
        row.append(filename)
    else:
        row.append('ID')  # the header row gets "ID" instead of the filename
    csv_output.append(row)

yourfile = csv.writer(open(filename, 'w'), delimiter=',')
yourfile.writerows(csv_output)
Use the CSV module that comes with Python.
import csv
import sys

def process_file(filename):
    # Read the contents of the file into a list of lines.
    f = open(filename, 'r')
    contents = f.readlines()
    f.close()

    # Use a CSV reader to parse the contents.
    reader = csv.reader(contents)

    # Open the output and create a CSV writer for it.
    f = open(filename, 'wb')
    writer = csv.writer(f)

    # Process the header.
    header = reader.next()
    header.append('ID')
    writer.writerow(header)

    # Process each row of the body.
    for row in reader:
        row.append(filename)
        writer.writerow(row)

    # Close the file and we're done.
    f.close()

# Run the function on all command-line arguments. Note that this does no
# checking for things such as file existence or permissions.
map(process_file, sys.argv[1:])
You can run this as follows:
blair#blair-eeepc:~$ python csv_add_filename.py file1.csv file2.csv
You can use fileinput to do in-place editing:

import sys
import glob
import fileinput

for filename in glob.glob(sys.argv[1]):
    for line in fileinput.FileInput(filename, inplace=1):
        if fileinput.lineno() == 1:
            print line.rstrip() + ",ID"
        else:
            print line.rstrip() + "," + filename
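A Python 3 version of the same idea as a self-contained sketch (the file path and CSV contents are invented for the demo); note that with inplace=True, anything print()ed inside the loop is redirected into the file being edited:

```python
import fileinput
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "w") as f:
    f.write("col1,col2\nx,y\n")

# inplace=True swaps stdout so that print() writes into the edited file.
with fileinput.FileInput(path, inplace=True) as fin:
    for line in fin:
        if fin.lineno() == 1:
            print(line.rstrip() + ",ID")          # header gets the literal "ID"
        else:
            print(line.rstrip() + "," + path)     # data rows get the filename

with open(path) as f:
    result = f.read()
```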