Is there a way to create a text file without opening it in "w" or "a" mode? For instance, if I open a file in "r" mode and the file does not exist, I want a new file to be created when I catch the IOError.
e.g.:
while flag == True:
    try:
        # opening src in a+ mode will allow me to read and append to file
        with open("Class {0} data.txt".format(classNo),"r") as src:
            # list containing all data from file, one line is one item in list
            data = src.readlines()
            for ind,line in enumerate(data):
                if surname.lower() and firstName.lower() in line.lower():
                    # overwrite the relevant item in data with the updated score
                    data[ind] = "{0} {1}\n".format(line.rstrip(),score)
                    rewrite = True
                else:
                    with open("Class {0} data.txt".format(classNo),"a") as src:
                        src.write("{0},{1} : {2}{3} ".format(surname, firstName, score,"\n"))
        if rewrite == True:
            # reopen src in write mode and overwrite all the records with the items in data
            with open("Class {} data.txt".format(classNo),"w") as src:
                src.writelines(data)
        flag = False
    except IOError:
        print("New data file created")
        # Here I want a new file to be created and assigned to the variable src so when the
        # while loop iterates for the second time the file should successfully open
At the beginning just check if the file exists and create it if it doesn't:
import os
# fill in the class number, as the question's code does
filename = "Class {0} data.txt".format(classNo)
if not os.path.isfile(filename):
    open(filename, 'w').close()
From this point on you can assume the file exists, which will greatly simplify your code.
No operating system will allow you to create a file without actually writing to it. You can encapsulate this in a library so that the creation is not visible, but it is impossible to avoid writing to the file system if you really want to modify the file system.
Here is a quick and dirty open replacement which does what you propose.
def open_for_reading_create_if_missing(filename):
    try:
        handle = open(filename, 'r')
    except IOError:
        with open(filename, 'w') as f:
            pass
        handle = open(filename, 'r')
    return handle
It would be better to create the file if it doesn't exist, with something like:
import os
def ensure_file_exists(file_name):
    """ Make sure that a file with the given name exists """
    (the_dir, fname) = os.path.split(file_name)
    # the_dir is empty when file_name has no directory part
    if the_dir and not os.path.exists(the_dir):
        os.makedirs(the_dir) # This may give an exception if the directory cannot be made.
    if not os.path.exists(file_name):
        open(file_name, 'w').close()
You could even have a safe_open function that did something similar prior to opening for read and returning the file handle.
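A minimal sketch of such a safe_open function might look like the following. The name comes from the sentence above; the body is only an assumption about what it could do, not a definitive implementation.

```python
import os

def safe_open(file_name):
    """Hypothetical helper: make sure file_name exists, then open it for reading."""
    the_dir = os.path.dirname(file_name)
    if the_dir:
        # exist_ok=True makes this a no-op when the directory is already there
        os.makedirs(the_dir, exist_ok=True)
    # Mode "a" creates the file if it is missing and never truncates existing content
    open(file_name, "a").close()
    return open(file_name, "r")
```

Because the returned object is an ordinary file handle, it can be used as with safe_open(path) as src: so that it is closed automatically.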
The sample code provided in the question is not very clear, especially because it uses multiple variables that are not defined anywhere. But based on it, here is my suggestion. You can create a function that combines touch and file open but is platform agnostic.
def touch_open( filename):
    try:
        connect = open( filename, "r")
    except IOError:
        connect = open( filename, "a")
        connect.close()
        connect = open( filename, "r")
    return connect
This function will open the file for you if it exists. If the file doesn't exist, it will create a blank file with that name and then open it. An additional bonus compared with import os; os.system('touch test.txt') is that it does not create a child process in the shell, making it faster.
Since it doesn't use the with open(filename) as src syntax you should either remember to close the connection at the end with connection = touch_open( filename); connection.close() or preferably you could open it in a for loop. Example:
file2open = "test.txt"
for i, row in enumerate( touch_open( file2open)):
    print(i, row, end="") # print the line number and content
This option should be preferred to data = src.readlines() followed by enumerate( data), found in your code, because it avoids looping twice through the file.
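Since the object returned by open is itself a context manager, the result of a touch_open-style function can also be wrapped in a with statement, which takes care of closing it. The sketch below uses a compact variant of touch_open and a hypothetical read_numbered_lines helper, purely for illustration.

```python
def touch_open(filename):
    # Compact variant of the function above: mode "a" creates a missing file
    # without truncating an existing one
    open(filename, "a").close()
    return open(filename, "r")

def read_numbered_lines(filename):
    """Hypothetical helper: return (index, line) pairs and close the file afterwards."""
    with touch_open(filename) as src:
        # The file is iterated once; it is closed when the with block ends
        return list(enumerate(src))
```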
I am trying to create some temporary files and perform some operations on them inside a loop. Then I will access the information in all of the temporary files and do some operations with it. For simplicity, the following code reproduces my issue:
import tempfile
tmp_files = []
for i in range(40):
    tmp = tempfile.NamedTemporaryFile(suffix=".txt")
    with open(tmp.name, "w") as f:
        f.write(str(i))
    tmp_files.append(tmp.name)
string = ""
for tmp_file in tmp_files:
    with open(tmp_file, "r") as f:
        data = f.read()
        string += data
print(string)
ERROR:
with open(tmp_file, "r") as f: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpynh0kbnw.txt'
When I look at the /tmp directory (with a time.sleep(2) in the loop), I see that each file is deleted and only one is preserved, hence the error.
Of course, I could keep all the files by passing delete=False, as in tempfile.NamedTemporaryFile(suffix=".txt", delete=False), but that is not the idea: I would like to keep the temporary files only for the running time of the script. I could also delete the files with os.remove. My question is more about why this happens, because I expected the files to last until the end of the run, since I don't close them during execution (or do I?).
Thanks a lot in advance.
tdelaney has already answered your actual question.
I would just like to offer you an alternative to NamedTemporaryFile. Why not create a temporary folder which is removed (with all files in it) at the end of the script?
Instead of using a NamedTemporaryFile, you could use tempfile.TemporaryDirectory. The directory will be deleted when closed.
The example below uses the with statement which closes the file handle automatically when the block ends (see John Gordon's comment).
import os
import tempfile
with tempfile.TemporaryDirectory() as temp_folder:
    tmp_files = []
    for i in range(40):
        tmp_file = os.path.join(temp_folder, f"{i}.txt")
        with open(tmp_file, "w") as f:
            f.write(str(i))
        tmp_files.append(tmp_file)
    string = ""
    for tmp_file in tmp_files:
        with open(tmp_file, "r") as f:
            data = f.read()
            string += data
    print(string)
By default, a NamedTemporaryFile deletes its file when closed. It's a bit subtle, but tmp = tempfile.NamedTemporaryFile(suffix=".txt") in the loop causes the previous file to be deleted when tmp is reassigned. One option is to use the delete=False parameter. Another is to keep the file open and seek to the beginning after the write.
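The delete-on-close behaviour can be observed directly. The short sketch below closes the file explicitly instead of relying on reassignment, which makes the moment of deletion easier to see; the variable names are illustrative only.

```python
import os
import tempfile

tmp = tempfile.NamedTemporaryFile(suffix=".txt")
name = tmp.name
exists_while_open = os.path.exists(name)

tmp.close()  # with the default delete=True, closing removes the file
exists_after_close = os.path.exists(name)

print(exists_while_open, exists_after_close)
```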
NamedTemporaryFile is already a file object, so you can write to it directly without reopening. Just make sure the mode is "write plus" and in text, not binary, mode. Put the code in a try/finally block to make sure the files are really deleted at the end.
import tempfile
tmp_files = []
try:
    for i in range(40):
        tmp = tempfile.NamedTemporaryFile(suffix=".txt", mode="w+")
        tmp.write(str(i))
        tmp.seek(0)
        tmp_files.append(tmp)
    string = ""
    for tmp_file in tmp_files:
        data = tmp_file.read()
        string += data
finally:
    for tmp_file in tmp_files:
        tmp_file.close()
print(string)
I made this program, which takes user input and writes it to a new text file.
output = input('Insert your text')
f = open("text.txt", "a")
f.write(output)
This code takes a user's input and writes it to a text file. But if the file already exists in the path, the code just appends to it. I want the code to create a new file every time the program is run: the first run should create text.txt, the second run should create text(1).txt, and so on.
Start by checking if text.txt exists. If it does, use a loop to check for text(n).txt, with n being a positive integer starting at 1.
from os.path import isfile
output = input('Insert your text')
newFileName = "text.txt"
i = 1
while isfile(newFileName):
    newFileName = "text({}).txt".format(i)
    i += 1
f = open(newFileName, "w")
f.write(output)
f.close()
Eventually, the loop will reach some n for which text(n).txt doesn't exist, and the file will be saved with that name.
Check if the file you are trying to create already exists. If yes, then change the file name, else write text to the file.
import os
output = input('Insert your text ')
filename = 'text.txt'
i = 1
while os.path.exists(filename):
    # no space before the parenthesis, to match the text(1).txt naming in the question
    filename = 'text(' + str(i) + ').txt'
    i += 1
f = open(filename, "a")
f.write(output)
f.close()
Check if file already exists
import os.path
os.path.exists('filename-here.txt')
If the file exists, create the file with another filename (e.g. by appending the date and time, or a number, to the filename).
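As a rough sketch of the date-and-time variant, the hypothetical helper below builds a new name by inserting a timestamp before the extension. The function name, the optional now parameter, and the stamp format are all assumptions made for illustration.

```python
import os
from datetime import datetime

def timestamped_name(filename, now=None):
    """Hypothetical helper: insert a date-time stamp before the extension."""
    name, ext = os.path.splitext(filename)
    # 'now' can be passed in for reproducible results; defaults to the current time
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"{name}_{stamp}{ext}"

print(timestamped_name("text.txt", datetime(2024, 1, 2, 3, 4, 5)))
# text_20240102-030405.txt
```

Note that two runs within the same second would still produce the same name, so this does not replace an existence check.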
A problem with checking for existence is that there can be a race condition if two processes try to create the same file:
process 1: does file exist? (no)
process 2: does file exist? (no)
process 2: create file for writing ('w', which truncates if it exists)
process 2: write file.
process 2: close file.
process 1: create same file for writing ('w', which truncates process 2's file).
A way around this is mode 'x' (open for exclusive creation, failing if the file already exists), but in the scenario above that would just make process 1 get an error instead of truncating process 2's file.
To open the file with an incrementing filename as the OP described, this can be used:
import os
def unique_open(filename):
    # "name" contains everything up to the extension.
    # "ext" contains the last dot(.) and extension, if any
    name,ext = os.path.splitext(filename)
    n = 0
    while True:
        try:
            return open(filename,'x')
        except FileExistsError:
            n += 1
            # build new filename with incrementing number
            filename = f'{name}({n}){ext}'
file = unique_open('test.txt')
file.write('content')
file.close()
To make the function work with a context manager ("with" statement), a contextlib.contextmanager can be used to decorate the function and provide automatic .close() of the file:
import os
import contextlib
@contextlib.contextmanager
def unique_open(filename):
    n = 0
    name,ext = os.path.splitext(filename)
    try:
        while True:
            try:
                file = open(filename,'x')
            except FileExistsError:
                n += 1
                filename = f'{name}({n}){ext}'
            else:
                print(f'opened {filename}') # for debugging
                yield file # value of with's "as".
                break # open succeeded, so exit while
    finally:
        file.close() # cleanup when with block exits
with unique_open('test.txt') as f:
    f.write('content')
Demo:
C:\>test.py
opened test.txt
C:\>test
opened test(1).txt
C:\>test
opened test(2).txt
I'm using Python, and would like to insert a string into a text file without deleting or copying the file. How can I do that?
Unfortunately there is no way to insert into the middle of a file without re-writing it. As previous posters have indicated, you can append to a file or overwrite part of it using seek but if you want to add stuff at the beginning or the middle, you'll have to rewrite it.
This is an operating system thing, not a Python thing. It is the same in all languages.
What I usually do is read from the file, make the modifications and write it out to a new file called myfile.txt.tmp or something like that. This is better than reading the whole file into memory because the file may be too large for that. Once the temporary file is completed, I rename it the same as the original file.
This is a good, safe way to do it because if the file write crashes or aborts for any reason, you still have your untouched original file.
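The write-to-a-temporary-file-then-rename approach described above could be sketched roughly as follows. The helper name insert_line and the .tmp suffix are assumptions for illustration; the file is processed line by line, so it is never held in memory as a whole.

```python
import os

def insert_line(path, line_no, text):
    """Hypothetical helper: insert text before the 0-based line line_no of path."""
    tmp_path = path + ".tmp"
    inserted = False
    with open(path, "r") as src, open(tmp_path, "w") as dst:
        for i, line in enumerate(src):
            if i == line_no:
                dst.write(text + "\n")
                inserted = True
            dst.write(line)
        if not inserted:
            # line_no is past the end of the file: append instead
            dst.write(text + "\n")
    # The original is only replaced once the temporary file is complete.
    # os.replace overwrites an existing destination on every platform.
    os.replace(tmp_path, path)
```

If the write fails partway through, the original file is still untouched and only the .tmp file is left behind.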
Depends on what you want to do. To append you can open it with "a":
with open("foo.txt", "a") as f:
    f.write("new line\n")
If you want to prepend something you have to read from the file first:
with open("foo.txt", "r+") as f:
    old = f.read() # read everything in the file
    f.seek(0) # rewind
    f.write("new line\n" + old) # write the new line before
The fileinput module of the Python standard library will rewrite a file inplace if you use the inplace=1 parameter:
import sys
import fileinput
# replace all occurrences of 'sit' with 'SIT' and insert a line after the 5th
for i, line in enumerate(fileinput.input('lorem_ipsum.txt', inplace=1)):
    sys.stdout.write(line.replace('sit', 'SIT')) # replace 'sit' and write
    if i == 4: sys.stdout.write('\n') # write a blank line after the 5th line
Rewriting a file in place is often done by saving the old copy with a modified name. Unix folks add a ~ to mark the old one. Windows folks do all kinds of things -- add .bak or .old -- or rename the file entirely or put the ~ on the front of the name.
import shutil
shutil.move(aFile, aFile + "~")
destination = open(aFile, "w")
source = open(aFile + "~", "r")
for line in source:
    destination.write(line)
    if <some condition>:
        destination.write(<some additional line> + "\n")
source.close()
destination.close()
Instead of shutil, you can use the following.
import os
os.rename(aFile, aFile + "~")
Python's mmap module will allow you to insert into a file. The following sample shows how it can be done in Unix (Windows mmap may be different). Note that this does not handle all error conditions and you might corrupt or lose the original file. Also, this won't handle unicode strings.
import os
from mmap import mmap
def insert(filename, str, pos):
    if len(str) < 1:
        # nothing to insert
        return
    f = open(filename, 'r+')
    m = mmap(f.fileno(), os.path.getsize(filename))
    origSize = m.size()
    # or this could be an error
    if pos > origSize:
        pos = origSize
    elif pos < 0:
        pos = 0
    m.resize(origSize + len(str))
    m[pos+len(str):] = m[pos:origSize]
    m[pos:pos+len(str)] = str
    m.close()
    f.close()
It is also possible to do this without mmap with files opened in 'r+' mode, but it is less convenient and less efficient as you'd have to read and temporarily store the contents of the file from the insertion position to EOF - which might be huge.
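For comparison, a minimal sketch of the 'r+' approach without mmap is shown below. The helper name insert_text is hypothetical, and the function works on bytes; as noted above, everything from the insertion point to EOF is held in memory.

```python
def insert_text(filename, text, pos):
    """Hypothetical helper: insert the bytes `text` at byte offset `pos`."""
    with open(filename, "r+b") as f:
        f.seek(0, 2)                   # jump to the end to learn the file size
        size = f.tell()
        pos = max(0, min(pos, size))   # clamp pos into the valid range
        f.seek(pos)
        tail = f.read()                # contents from pos to EOF, kept in memory
        f.seek(pos)
        f.write(text + tail)           # new bytes first, then the saved tail
```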
As mentioned by Adam, you have to take your system limitations into consideration before deciding on an approach: check whether you have enough memory to read the whole file into memory, replace parts of it, and rewrite it.
If you're dealing with a small file or have no memory issues this might help:
Option 1)
Read entire file into memory, do a regex substitution on the entire or part of the line and replace it with that line plus the extra line. You will need to make sure that the 'middle line' is unique in the file or if you have timestamps on each line this should be pretty reliable.
import re
# open file with r+ (read and write, text mode so the str pattern below matches)
f = open("file.log", 'r+')
# read entire content of file into memory
f_content = f.read()
# basically match middle line and replace it with itself and the extra line
f_content = re.sub(r'(middle line)', r'\1\nnew line', f_content)
# return pointer to top of file so we can re-write the content with replaced string
f.seek(0)
# clear file content
f.truncate()
# re-write the content with the updated content
f.write(f_content)
# close file
f.close()
Option 2)
Figure out middle line, and replace it with that line plus the extra line.
# open file with r+ (read and write, text mode)
f = open("file.log" , 'r+')
# get list of lines
f_content = f.readlines()
# get index of middle line (integer division, so it is a valid list index)
middle_line = len(f_content) // 2
# add the new line right after the middle line, which already ends with "\n"
f_content[middle_line] += "new line\n"
# return pointer to top of file so we can re-write the content with replaced string
f.seek(0)
# clear file content
f.truncate()
# re-write the content with the updated content
f.write(''.join(f_content))
# close file
f.close()
Wrote a small class for doing this cleanly.
import tempfile
class FileModifierError(Exception):
    pass
class FileModifier(object):
    def __init__(self, fname):
        self.__write_dict = {}
        self.__filename = fname
        # text mode, so the lines can be written back to a text-mode file
        self.__tempfile = tempfile.TemporaryFile(mode='w+')
        with open(fname, 'r') as fp:
            for line in fp:
                self.__tempfile.write(line)
        self.__tempfile.seek(0)
    def write(self, s, line_number = 'END'):
        if line_number != 'END' and not isinstance(line_number, (int, float)):
            raise FileModifierError("Line number %s is not a valid number" % line_number)
        try:
            self.__write_dict[line_number].append(s)
        except KeyError:
            self.__write_dict[line_number] = [s]
    def writeline(self, s, line_number = 'END'):
        self.write('%s\n' % s, line_number)
    def writelines(self, s, line_number = 'END'):
        for ln in s:
            self.writeline(ln, line_number)
    def __popline(self, index, fp):
        try:
            ilines = self.__write_dict.pop(index)
            for line in ilines:
                fp.write(line)
        except KeyError:
            pass
    def close(self):
        self.__exit__(None, None, None)
    def __enter__(self):
        return self
    def __exit__(self, type, value, traceback):
        with open(self.__filename,'w') as fp:
            for index, line in enumerate(self.__tempfile.readlines()):
                self.__popline(index, fp)
                fp.write(line)
            # leftover numeric keys (past the end of the file) first, 'END' last
            for index in sorted(self.__write_dict, key=lambda k: float('inf') if k == 'END' else k):
                for line in self.__write_dict[index]:
                    fp.write(line)
        self.__tempfile.close()
Then you can use it this way:
with FileModifier(filename) as fp:
    fp.writeline("String 1", 0)
    fp.writeline("String 2", 20)
    fp.writeline("String 3") # To write at the end of the file
If you know some unix you could try the following:
Notes: $ means the command prompt
Say you have a file my_data.txt with content as such:
$ cat my_data.txt
This is a data file
with all of my data in it.
Then using the os module you can use the usual sed commands
import os
# Identifiers used are:
my_data_file = "my_data.txt"
command = "sed -i 's/all/none/' my_data.txt"
# Execute the command
os.system(command)
If you aren't aware of sed, check it out, it is extremely useful.
I'm trying to modify a lot of PDF files, so I have to open a lot of files. I call open too many times, and Python raises a "too many open files" error.
Here is the relevant code; the writer calls are all very similar:
readerbanner = PyPDF2.pdf.PdfFileReader(open('transafe.pdf', 'rb'))
readertestpages = PyPDF2.pdf.PdfFileReader(open(os.path.join(Cache_path, cache_file_name), 'rb'))
writeroutput.write(open(os.path.join(Output_path,cache_file_name), 'wb'))
or
writer_output.write(open(os.path.join(Cache_path, str(NumPage) + "_" + pdf_file_name), 'wb'))
reader_page_x = PyPDF2.pdf.PdfFileReader(open(os.path.join(PDF_path, pdf_file_name), 'rb'))
None of the open calls assign the file to a name, as in f_name = open("path","r"),
because each file is only needed for a short period. I know where the files are opened, but I don't know how to close all of them.
To close a file just call close() on it.
You can also use a context manager which handles file closing for you:
with open('file.txt') as myfile:
    # do something with myfile here
    pass
# here myfile is automatically closed
As far as I know, this code should not open too many files, unless it is run many times.
Regardless, the problem consists of you calling:
PyPDF2.pdf.PdfFileReader(open('transafe.pdf', 'rb'))
and similar. This creates a file object, but saves no reference to it.
What you need to do for all open calls is:
file = open('transafe.pdf', 'rb')
PyPDF2.pdf.PdfFileReader(file)
And then:
file.close()
when you do not use the file anymore.
If you want to close many files at the same time, put them in a list.
with statement
with open("abc.txt", "r") as file1, open("123.txt", "r") as file2:
    # use files
    foo = file1.readlines()
# they are closed automatically
print(file1.closed)
# -> True
print(file2.closed)
# -> True
wrapper function
files = []
def my_open(*args):
    f = open(*args)
    files.append(f)
    return f
# use my_open
foo = my_open("text.txt", "r")
# close all files
list(map(lambda f: f.close(), files))
wrapper class
class OpenFiles():
    def __init__(self):
        self.files = []
    def open(self, *args):
        f = open(*args)
        self.files.append(f)
        return f
    def close(self):
        list(map(lambda f: f.close(), self.files))
files = OpenFiles()
# use open method
foo = files.open("text.txt", "r")
# close all files
files.close()
ExitStack can be useful:
https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack
from contextlib import ExitStack
with ExitStack() as stack:
    files = [stack.enter_context(open(fname)) for fname in filenames]
    # All opened files will automatically be closed at the end of
    # the with statement, even if attempts to open files later
    # in the list raise an exception
I am creating a log file with line by line records.
1- If the file does not exist, it should create the file and append the header row and the record
2- If it exists, check for the text Timestamp in the first line. If it is there, append the record; otherwise add the header columns and the record itself
I tried w, a and r+; nothing worked for me. Below is my code:
logFile = open('Dump.log', 'r+')
datalogFile = logFile.readline()
if 'Timestamp' in datalogFile:
    logFile.write('%s\t%s\t%s\t%s\t\n'%(timestamp,logread,logwrite,log_skipped_noweight))
    logFile.flush()
else:
    logFile.write('Timestamp\t#Read\t#Write\t#e\n')
    logFile.flush()
    logFile.write('%s\t%s\t%s\t%s\t\n'%(timestamp,logread,logwrite,log_skipped))
    logFile.flush()
The code fails if the file doesn't exist.
Use 'a+' mode:
logFile = open('Dump.log', 'a+')
description:
a+
Open for reading and writing. The file is created if it does not
exist. The stream is positioned at the end of the file. Subsequent
writes to the file will always end up at the then current
end of file, irrespective of any intervening fseek(3) or similar
Following code would work:
import os
f = open('myfile', 'ab+') #you can use a+ if it's not binary
f.seek(0, os.SEEK_SET)
print(f.readline()) #print the first line
f.close()
Try this:
import os
if os.path.exists(my_file):
    print('file exists')
    # some processing
else:
    print('file does not exist')
    # some processing
You're opening the file in r+ mode, which assumes the file exists. Also, if you intend to write to the file, you should open it in a+ mode (unashamedly stealing ndpu's explanation).
Your code would become:
logFileDetails = []
with open("Dump.log","a+") as logFile:
    logFile.seek(0) # a+ positions the stream at the end, so rewind before reading
    logFileDetails = logFile.readlines()
    if logFileDetails and "Timestamp" in logFileDetails[0]:
        pass # Header is present, write your record here
    else:
        pass # Log file is new or has no header, write the header and record here
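Putting the pieces together, the logging requirement from the question could be sketched as below. The append_record name and the HEADER constant are assumptions for illustration; the header text is taken from the question's code.

```python
HEADER = "Timestamp\t#Read\t#Write\t#e\n"

def append_record(path, record):
    """Hypothetical helper: append record, writing the header first if it is missing."""
    with open(path, "a+") as log_file:
        log_file.seek(0)                  # a+ starts at the end; rewind to read
        first_line = log_file.readline()
        if "Timestamp" not in first_line:
            log_file.write(HEADER)        # append mode: writes always go to the end
        log_file.write(record + "\n")
```

For an existing file whose first line lacks the header, this appends the header at the end rather than inserting it at the top, which matches the behaviour described in the question.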
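Checking for a file's existence introduces a race condition: another process can create or delete the file after the check returns false or true, respectively, which causes serious bugs. You should instead open the file in 'a+' mode, which creates it if needed, and inspect the open handle:
with open(r'path\to.filename', 'a+') as f:
    if f.tell() > 0: # a+ starts at the end, so a non-zero offset means existing content
        stuff_if_exists
    else:
        stuff_if_not_exists # new or empty file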