Find text with regular expression and replace in file


I would like to find text in a file with a regular expression and then replace it with another name. I have to read the file line by line first, because otherwise re.match(...) can't find the text.
My test file, where I would like to make the modifications, is (not all of it; I removed some code):
//...
#include <boost/test/included/unit_test.hpp>
#ifndef FUNCTIONS_TESTSUITE_H
#define FUNCTIONS_TESTSUITE_H
//...
BOOST_AUTO_TEST_SUITE(FunctionsTS)
BOOST_AUTO_TEST_CASE(test)
{
std::string l_dbConfigDataFileName = "../../Config/configDB.cfg";
DB::FUNCTIONS::DBConfigData l_dbConfigData;
//...
}
BOOST_AUTO_TEST_SUITE_END()
//...
Now the Python code that replaces the configDB name with another one. I have to find the configDB.cfg name with a regular expression because the name changes all the time. Only the name is needed, not the extension.
Code:
import fileinput
import re
myfile = "Tset.cpp"
# First search: this works and prints configDB
with open(myfile) as f:
    for line in f:
        matchObj = re.match( r'(.*)../Config/(.*).cfg(.*)', line, re.M|re.I)
        if matchObj:
            print "Search : ", matchObj.group(2)
# Now find the expression again and replace it with another name.
# This does not work: after running this code the file is empty!
for line in fileinput.FileInput(myfile, inplace=1):
    matchObj = re.match( r'(.*)../Config/(.*).cfg(.*)', line, re.M|re.I)
    if matchObj:
        line = line.replace("Config","AnotherConfig")

From docs:
Optional in-place filtering: if the keyword argument inplace=1 is passed to fileinput.input() or to the FileInput constructor, the file is moved to a backup file and standard output is directed to the input file (if a file of the same name as the backup file already exists, it will be replaced silently).
What you need to do is print line on every iteration of the loop. Each line already ends with a newline, so you need to print it without adding another one; sys.stdout.write from the sys module does that. As a result:
import fileinput
import re
import sys
...
for line in fileinput.FileInput(myfile, inplace=1):
    matchObj = re.match( r'(.*)../Config/(.*).cfg(.*)', line, re.M|re.I)
    if matchObj:
        line = line.replace("Config","AnotherConfig")
    sys.stdout.write(line)
ADDED:
Also, I assume that you need to replace configDB.cfg with AnotherConfig.cfg. In this case, you can do something like this:
import fileinput
import re
import sys
myfile = "Tset.cpp"
# Flags belong in re.compile(); the second argument of a compiled
# pattern's match() is a start position, not a flags value.
regx = re.compile(r'(\.\./Config/)(.*?)(\.cfg)', re.I)
for line in fileinput.FileInput(myfile, inplace=1):
    # sub() returns the line unchanged when nothing matches
    sys.stdout.write(regx.sub(r'\1AnotherConfig\3', line))
You can read about the sub function in the Python docs for the re module.
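To look at the substitution on its own, separate from the file handling, here is a minimal Python 3 sketch. The function name rename_config and the sample line are made up for illustration, and the pattern assumes the file name always sits between ../Config/ and .cfg.

```python
import re

# Assumption: the name to replace always sits between "../Config/" and ".cfg".
CONFIG_RE = re.compile(r'(\.\./Config/)([^/"]*?)(\.cfg)', re.IGNORECASE)


def rename_config(line, new_name):
    """Return line with the config file name (without extension) replaced."""
    # A function replacement avoids surprises if new_name contains
    # backslashes or starts with a digit.
    return CONFIG_RE.sub(lambda m: m.group(1) + new_name + m.group(3), line)


sample = 'std::string l_dbConfigDataFileName = "../../Config/configDB.cfg";'
print(rename_config(sample, "AnotherConfig"))
```

Lines that do not contain the pattern come back unchanged, so the function can be applied to every line of the file without a separate match test.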

If I understand you correctly, in the line:
std::string l_dbConfigDataFileName = "../../Config/configDB.cfg";
you want to change just the file name 'configDB' to some other file name and rewrite the file.
First, I would suggest writing to a new file and swapping the file names afterwards, in case something goes wrong. Rather than re.match, use re.sub: if there is a match it returns the altered line, and if not it returns the line unaltered. Either way, just write the result to a new file. Then rename the files: the old file to .bck and the new file to the old file name.
import re
import os
# Escape the dots so they match literally; (config.*?) captures the file name
regex = re.compile(r'(\.\./config/)(config.*?)(\.cfg)', re.IGNORECASE)
oldF = 'find_config.cfg'
nwF = 'n_find_config.cfg'
bckF = 'find_config.cfg.bck'
with open ( oldF, 'r' ) as f, open ( nwF, 'w' ) as nf :
    for ln in f:
        nln = regex.sub(r'\1new_config\3', ln )
        nf.write ( nln )
# Both files are closed at this point, so they can be renamed
os.rename ( oldF, bckF )
os.rename ( nwF, oldF )

Related

Replace a text in File with python [duplicate]

I want to loop over the contents of a text file and do a search and replace on some lines and write the result back to the file. I could first load the whole file in memory and then write it back, but that probably is not the best way to do it.
What is the best way to do this, within the following code?
f = open(file)
for line in f:
    if 'foo' in line:
        newline = line.replace('foo', 'bar')
        # how to write this newline back to the file

The shortest way would probably be to use the fileinput module. For example, the following adds line numbers to a file, in-place:
import fileinput
for line in fileinput.input("test.txt", inplace=True):
    print('{} {}'.format(fileinput.filelineno(), line), end='') # for Python 3
    # print "%d: %s" % (fileinput.filelineno(), line), # for Python 2
What happens here is:
The original file is moved to a backup file
The standard output is redirected to the original file within the loop
Thus any print statements write back into the original file
fileinput has more bells and whistles. For example, it can be used to automatically operate on all files in sys.argv[1:], without your having to iterate over them explicitly. Starting with Python 3.2 it also provides a convenient context manager for use in a with statement.
While fileinput is great for throwaway scripts, I would be wary of using it in real code because admittedly it's not very readable or familiar. In real (production) code it's worthwhile to spend just a few more lines of code to make the process explicit and thus make the code readable.
There are two options:
The file is not overly large, and you can just read it wholly to memory. Then close the file, reopen it in writing mode and write the modified contents back.
The file is too large to be stored in memory; you can move it over to a temporary file and open that, reading it line by line, writing back into the original file. Note that this requires twice the storage.
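The first option can be sketched as follows. This is only an illustration in Python 3: the function name replace_in_memory and the utf-8 default are assumptions, and the file is rewritten only when the text actually changed.

```python
def replace_in_memory(path, old, new, encoding="utf-8"):
    """Read the whole file, replace old with new, and write it back.

    Returns True if the file was modified.
    """
    with open(path, encoding=encoding) as f:
        text = f.read()
    updated = text.replace(old, new)
    if updated != text:
        # Reopening in "w" mode truncates the file before writing.
        with open(path, "w", encoding=encoding) as f:
            f.write(updated)
    return updated != text
```

Because the file is closed between the read and the write, a crash during the write can leave it truncated; the temporary-file option avoids that at the cost of extra disk space.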

I guess something like this should do it. It basically writes the content to a new file and replaces the old file with the new file:
from tempfile import mkstemp
from shutil import move, copymode
from os import fdopen, remove
def replace(file_path, pattern, subst):
    #Create temp file
    fh, abs_path = mkstemp()
    with fdopen(fh,'w') as new_file:
        with open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
    #Copy the file permissions from the old file to the new file
    copymode(file_path, abs_path)
    #Remove original file
    remove(file_path)
    #Move new file
    move(abs_path, file_path)

Here's another example that was tested. Note that it matches and replaces plain substrings, not regular expression patterns:
import fileinput
import sys
def replaceAll(file,searchExp,replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp,replaceExp)
        sys.stdout.write(line)
Example use:
replaceAll("/fooBar.txt","Hello World!","Goodbye World.")

This should work: (inplace editing)
import fileinput
# Does a list of files, and
# redirects STDOUT to the file in question
for line in fileinput.input(files, inplace = 1):
    print line.replace("foo", "bar"),

Based on the answer by Thomas Watnedal.
This does not answer the line-by-line part of the original question exactly, although the function can still replace text within individual lines.
This implementation replaces the file contents without using temporary files; as a consequence, the file permissions remain unchanged.
Using re.sub instead of replace allows regex replacement rather than plain text replacement only.
Reading the file as a single string instead of line by line allows for multiline matches and replacements.
import re
def replace(file, pattern, subst):
    # Read contents from file as a single string
    file_handle = open(file, 'r')
    file_string = file_handle.read()
    file_handle.close()
    # Use RE package to allow for replacement (also allowing for (multiline) REGEX)
    file_string = (re.sub(pattern, subst, file_string))
    # Write contents to file.
    # Using mode 'w' truncates the file.
    file_handle = open(file, 'w')
    file_handle.write(file_string)
    file_handle.close()

As lassevk suggests, write out the new file as you go, here is some example code:
fin = open("a.txt")
fout = open("b.txt", "wt")
for line in fin:
    fout.write( line.replace('foo', 'bar') )
fin.close()
fout.close()

If you want a generic function that replaces any text with some other text, this is likely the best way to go, particularly if you're a fan of regexes. Note that re.escape makes the search text match literally; the flags (for example re.IGNORECASE) still apply:
import re
def replace( filePath, text, subs, flags=0 ):
    with open( filePath, "r+" ) as file:
        fileContents = file.read()
        textPattern = re.compile( re.escape( text ), flags )
        fileContents = textPattern.sub( subs, fileContents )
        file.seek( 0 )
        file.truncate()
        file.write( fileContents )

A more pythonic way would be to use context managers like the code below:
from tempfile import mkstemp
from shutil import move
from os import fdopen, remove
def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    # Wrap the descriptor returned by mkstemp so that it gets closed
    with fdopen(fh, 'w') as target_file:
        with open(source_file_path, 'r') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)

fileinput is quite straightforward, as mentioned in previous answers:
import fileinput
def replace_in_file(file_path, search_text, new_text):
    with fileinput.input(file_path, inplace=True) as file:
        for line in file:
            new_line = line.replace(search_text, new_text)
            print(new_line, end='')
Explanation:
fileinput can accept multiple files, but I prefer to close each file as soon as it has been processed, so I pass a single file_path in the with statement.
The print statement does not print anything to the console when inplace=True, because STDOUT is redirected to the original file.
end='' in the print statement avoids inserting blank lines between the original lines.
You can use it as follows:
file_path = '/path/to/my/file'
replace_in_file(file_path, 'old-text', 'new-text')

Create a new file, copy lines from the old to the new, and do the replacing before you write the lines to the new file.
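A sketch of that approach in Python 3 might look like this. The function name replace_via_temp_file is hypothetical, and creating the temporary file in the same directory as the original is a design choice made here so that the final os.replace does not have to cross filesystems.

```python
import os
import shutil
import tempfile


def replace_via_temp_file(path, old, new, encoding="utf-8"):
    """Copy path line by line into a temp file, replacing old with new, then swap the files."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as dst, \
                open(path, encoding=encoding) as src:
            for line in src:
                dst.write(line.replace(old, new))
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)       # overwrite the original in one step
    except BaseException:
        os.remove(tmp_path)  # do not leave a stray temp file behind
        raise
```

Only one line is held in memory at a time, and the original file stays intact until the new copy has been written completely.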

Expanding on @Kiran's answer, which I agree is more succinct and Pythonic, this adds codecs to support the reading and writing of UTF-8:
import codecs
from tempfile import mkstemp
from shutil import move
from os import close, remove
def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    close(fh)  # codecs.open reopens the file by path, so release the descriptor
    with codecs.open(target_file_path, 'w', 'utf-8') as target_file:
        with codecs.open(source_file_path, 'r', 'utf-8') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)

Using hamishmcn's answer as a template, I was able to search for strings in a file that match my regex and replace them with an empty string.
import re
p = re.compile('[-][0-9]*[.][0-9]*[,]|[-][0-9]*[,]') # pattern, compiled once
fin = open("in.txt", 'r') # in file
fout = open("out.txt", 'w') # out file
for line in fin:
    newline = p.sub('',line) # replace matching strings with empty string
    print newline
    fout.write(newline)
fin.close()
fout.close()

If you remove the indent as in the code below, it will search and replace across multiple lines.
See the example below.
from os import close, remove
from shutil import move
from tempfile import mkstemp
def replace(file, pattern, subst):
    #Create temp file
    fh, abs_path = mkstemp()
    print fh, abs_path
    new_file = open(abs_path,'w')
    old_file = open(file)
    for line in old_file:
        new_file.write(line.replace(pattern, subst))
    #close temp file
    new_file.close()
    close(fh)
    old_file.close()
    #Remove original file
    remove(file)
    #Move new file
    move(abs_path, file)

How to edit a file in python 2.7.10?

I am trying to edit a file as follows in Python 2.7.10 and am running into the error below. Can anyone provide guidance on what the issue is and how to edit files?
import fileinput,re
filename = 'epivers.h'
text_to_search = re.compile("#define EPI_VERSION_STR \"(\d+\.\d+) (TOB) (r(\d+) ASSRT)\"")
replacement_text = "#define EPI_VERSION_STR \"9.130.27.50.1.2.3 (r749679 ASSRT)\""
with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(text_to_search, replacement_text))
    file.close()
Error:
Traceback (most recent call last):
  File "pythonfiledit.py", line 5, in <module>
    with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
AttributeError: FileInput instance has no attribute '__exit__'
UPDATE:
import fileinput,re
import os
import shutil
import sys
import tempfile
filename = 'epivers.h'
text_to_search = re.compile("#define EPI_VERSION_STR \"(\d+\.\d+) (TOB) (r(\d+) ASSRT)\"")
replacement_text = "#define EPI_VERSION_STR \"9.130.27.50.1.2.3 (r749679 ASSRT)\""
with open(filename) as src, tempfile.NamedTemporaryFile(
        'w', dir=os.path.dirname(filename), delete=False) as dst:
    # Discard first line
    for line in src:
        if text_to_search.search(line):
            # Save the new first line
            line = text_to_search .sub(replacement_text,line)
            dst.write(line + '\n')
        dst.write(line)
# remove old version
os.unlink(filename)
# rename new version
os.rename(dst.name,filename)
I am trying to match the line #define EPI_VERSION_STR "9.130.27 (TOB) (r749679 ASSRT)"

If r is a compiled regular expression and line is a line of text, the way to apply the regex is
r.match(line)
to find a match at the beginning of line, or
r.search(line)
to find a match anywhere. In your particular case, you simply need
line = r.sub(replacement, line)
though in addition, you'll need to add a backslash before the round parentheses in your regex in order to match them literally (except in a few places where you apparently put in grouping parentheses around the \d+ for no particular reason; maybe just take those out).
Your example input string contains three digits, and the replacement string contains six digits, so \d+\.\d+ will never match either of those. I'm guessing you want something like \d+(?:\.\d+)+ or perhaps very bluntly [\d.]+ if the periods can be adjacent.
Furthermore, a single backslash in a string will be interpreted by Python, before it gets passed to the regex engine. You'll want to use raw strings around regexes, nearly always. For improved legibility, perhaps also prefer single quotes or triple double quotes over regular double quotes, so you don't have to backslash the double quotes within the regex.
Finally, your usage of fileinput is wrong. In Python 2.7 you can't use it as a context manager (that support arrived in Python 3.2). Just loop over the lines which fileinput.input() returns.
import fileinput, re, sys
filename = 'epivers.h'
text_to_search = re.compile(r'#define EPI_VERSION_STR "\d+(?:\.\d+)+ \(TOB\) \(r\d+ ASSRT\)"')
replacement_text = '#define EPI_VERSION_STR "9.130.27.50.1.2.3 (r749679 ASSRT)"'
for line in fileinput.input(filename, inplace=True, backup='.bak'):
    # line already ends with a newline, so write it without adding another
    sys.stdout.write(text_to_search.sub(replacement_text, line))
In your first attempt, line.replace() was a good start, but it doesn't accept a regex argument (and of course, you don't close() a file you opened with with ...). In your second attempt, you are checking whether the line is identical to the regex, which of course it isn't (just like the string "two" isn't equivalent to the numeric constant 2).
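The difference between match, search and sub can be seen in a small Python 3 sketch. The sample line, including its two leading spaces, is made up for illustration; the pattern assumes a version of two or more dot-separated numbers followed by (TOB).

```python
import re

# Assumption: the version is two or more dot-separated numbers, followed by "(TOB)".
VERSION_RE = re.compile(
    r'#define EPI_VERSION_STR "\d+(?:\.\d+)+ \(TOB\) \(r\d+ ASSRT\)"'
)
REPLACEMENT = '#define EPI_VERSION_STR "9.130.27.50.1.2.3 (r749679 ASSRT)"'

# Hypothetical input line; the leading spaces are there to show match() vs search().
line = '  #define EPI_VERSION_STR "9.130.27 (TOB) (r749679 ASSRT)"\n'

anchored = VERSION_RE.match(line)    # None: the line starts with two spaces
anywhere = VERSION_RE.search(line)   # a match object: the pattern occurs later in the line
new_line = VERSION_RE.sub(REPLACEMENT, line)  # replaces the match, keeps the rest
print(repr(new_line))
```

Since sub leaves a line untouched when the pattern does not occur in it, there is no need to call search first.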

Read the file, use re.sub to substitute, then write the new contents back:
import re
with open(filename) as f:
    text = f.read()
new_text = re.sub(r'#define EPI_VERSION_STR "\d+(?:\.\d+)+ \(TOB\) \(r\d+ ASSRT\)"',
                  '#define EPI_VERSION_STR "9.130.27.50.1.2.3 (r749679 ASSRT)"',
                  text)
with open(filename, 'w') as f:
    f.write(new_text)

search replace the string from number of .txt files in python

There are multiple files in a directory with the extensions .txt, .dox, .qcr, etc.
I need to list the .txt files and search and replace text in those files only.
I need to search for $$\d, where \d stands for a number: 1, 2, 3, ... 100,
and replace it with xxx.
Please let me know a Python script for this.
Thanks in advance.
-Shrinivas
# I created the following script. It works for a single .txt file, but not when there is more than one .txt file in the directory.
-----
def replaceAll(file,searchExp,replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp,replaceExp)
        sys.stdout.write(line)
# The following code is not working. I expect it to list the files matching "um_*.txt", open each file, and replace "$$\d" using the replaceAll function.
for um_file in glob.glob('*.txt'):
    t = open(um_file, 'r')
    replaceAll("t.read","$$\d","xxx")
    t.close()

fileinput.input(...) is meant to process a whole list of files, and must be ended with a corresponding fileinput.close(). So you can either process them all in one single call:
def replaceAll(file,searchExp,replaceExp):
    for line in fileinput.input(file, inplace=True):
        if searchExp in line:
            line = line.replace(searchExp,replaceExp)
        dummy = sys.stdout.write(line) # to avoid a possible output of the size
    fileinput.close() # to close everything in an orderly way
replaceAll(glob.glob('*.txt'), "$$\d","xxx")
or consistently close fileinput after processing each file, though that rather ignores the main feature of fileinput.
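The code above searches for the literal text $$\d, because the in operator and str.replace do not interpret regular expressions. If the intent is "two dollar signs followed by a number", a regex version could look like the following Python 3 sketch. The function name replace_markers and the utf-8 encoding are assumptions.

```python
import glob
import os
import re

# "$" is a regex metacharacter, so it has to be escaped to match a literal "$$".
MARKER_RE = re.compile(r'\$\$\d+')


def replace_markers(directory, replacement="xxx"):
    """Replace every $$<digits> marker in the .txt files of directory.

    Returns the list of paths that were changed.
    """
    changed = []
    for path in sorted(glob.glob(os.path.join(directory, "*.txt"))):
        with open(path, encoding="utf-8") as f:
            text = f.read()
        new_text = MARKER_RE.sub(replacement, text)
        if new_text != text:
            with open(path, "w", encoding="utf-8") as f:
                f.write(new_text)
            changed.append(path)
    return changed
```

Files with other extensions, such as .dox or .qcr, are never opened because the glob pattern only selects *.txt.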

Try this out.
import glob
import re
import sys
def replaceAll(file,searchExp,replaceExp):
    for line in file.readlines():
        match = re.search(searchExp, line)
        if match:
            line = line.replace(match.group(0), replaceExp)
        sys.stdout.write(line)
for um_file in glob.glob('*.txt'):
    t = open(um_file, 'r')
    replaceAll(t, r"\d+", "xxx")
    t.close()
Here we pass the file handle to the replaceAll function rather than a string. Note that the result is written to standard output, not back to the file.

You can try this:
import os
import re
folder = "foldername"
# os.listdir returns bare names, so join them with the folder before opening
the_files = [os.path.join(folder, i) for i in os.listdir(folder) if i.endswith(".txt")]
for file in the_files:
    with open(file) as f:
        new_data = re.sub(r"\d+", "xxx", f.read())
    with open(file, 'w') as final_file:
        final_file.write(new_data)

How to use the regex to parse the entire file and determine the matches were found , rather then reading each line by line?

Instead of reading each and every line, can't we just search for the string in the file and replace it? I am trying, but I can't figure out how to do that.
import re
file = open("C:\\path.txt","r+")
lines = file.readlines()
replaceDone=0
file.seek(0)
newString="set:: windows32\n"
for l in lines:
    if re.search("(address_num)",l,re.I) and replaceDone==0:
        try:
            file.write(l.replace(l,newString))
            replaceDone=1
        except IOError:
            file.close()

Here's an example you can adapt that replaces every sequence of '(address_num)' with 'set:: windows32' for a file:
import fileinput
import re
for line in fileinput.input('/home/jon/data.txt', inplace=True):
    print re.sub('address_num', 'set:: windows32', line, flags=re.I),

This is not very memory efficient but I guess it is what you are looking for:
import re
text = open(file_path, 'r').read()
open(file_path, 'w').write(re.sub(old_string, new_string, text))
Read the whole file, replace and write back the whole file.
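To also find out whether anything matched, re.subn can be used in place of re.sub; it returns the new text together with the number of replacements. A minimal Python 3 sketch, with a made-up helper name replace_in_text and sample content:

```python
import re


def replace_in_text(text, pattern, replacement, flags=0):
    """Apply pattern to the whole text at once; return (new_text, number_of_matches)."""
    return re.subn(pattern, replacement, text, flags=flags)


# Hypothetical file content, held in a single string.
content = "Address_Num = 1\nother\naddress_num = 2\n"
new_content, hits = replace_in_text(content, r'address_num', 'set:: windows32', flags=re.I)
print(hits)
```

A count of zero means nothing matched, so writing the file back can be skipped.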
