I am trying to find a line that starts with a specific string and replace the entire line with a new string.
I tried this code:
filename = "settings.txt"
for line in fileinput.input(filename, inplace=True):
    print line.replace('BASE_URI =', 'BASE_URI = "http://example.net"')
This does not replace the entire line, only the matching string. What is the best way to replace an entire line that starts with a given string?
You don't need to know what old is; just redefine the entire line:
import sys
import fileinput
for line in fileinput.input([filename], inplace=True):
    if line.strip().startswith('BASE_URI ='):
        line = 'BASE_URI = "http://example.net"\n'
    sys.stdout.write(line)
Are you using Python 2 syntax? Since Python 2 is discontinued, I will solve this in Python 3 syntax.
Suppose you need to replace lines that start with "Hello" with "Not Found"; then you can do this:
with open("settings.txt") as fh:
    lines = fh.readlines()
newlines = []
for line in lines:
    if not line.startswith("Hello"):
        newlines.append(line)
    else:
        # readlines() keeps each line's "\n", so the replacement needs one too
        newlines.append("Not Found\n")
with open("settings.txt", "w") as fh:
    for line in newlines:
        fh.write(line)
This should do the trick:
def replace_line(source, destination, starts_with, replacement):
    # Open the source file
    with open(source) as s_file:
        # Store all file lines in lines
        lines = s_file.readlines()
    # Iterate over the lines
    for i in range(len(lines)):
        # If a line starts with the given string
        if lines[i].startswith(starts_with):
            # Replace the whole line, keeping its newline if it had one
            newline = "\n" if lines[i].endswith("\n") else ""
            lines[i] = replacement + newline
    # Open the destination file and write the modified lines into it
    with open(destination, "w") as d_file:
        d_file.writelines(lines)
Call it using these parameters:
replace_line("settings.txt", "settings.txt", 'BASE_URI =', 'BASE_URI = "http://example.net"')
Cheers!
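For comparison, the same idea can be sketched as a pure function that works on a string instead of a file, which makes it easy to try out. The name replace_lines_starting_with and the sample settings text are hypothetical, made up for illustration. The sketch uses lstrip() so that indented lines also match, which is an assumption about the desired behaviour.

```python
def replace_lines_starting_with(text, prefix, replacement):
    """Return text with every line that starts with prefix replaced entirely.

    Hypothetical helper for illustration; each replaced line keeps its
    original line ending.
    """
    result = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith(prefix):
            # Keep whatever line ending the original line had (may be empty).
            body = line.rstrip("\r\n")
            ending = line[len(body):]
            line = replacement + ending
        result.append(line)
    return "".join(result)


# Sample data (assumed, not from a real settings file).
settings = 'DEBUG = True\nBASE_URI = "http://localhost"\nNAME = "x"\n'
updated = replace_lines_starting_with(
    settings, "BASE_URI =", 'BASE_URI = "http://example.net"'
)
print(updated)
```

Reading the file into a string, calling the function, and writing the result back gives the same effect as the file-based versions above.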
Related
I want to read a text file that contains a list of file names and append them to a list.
I have written the code, but the output is repeated twice.
For example, there is a text file named filelist.txt that contains the list of files (e.g. mango.txt).
When I run the code, it prints mango.txt twice.
def im_ru(filelist1):
    with open(filelist1,"r") as fp:
        lines = fp.read().splitlines()
        for line in lines:
            print(lines)
Print line instead:
for line in lines:
    print(line)
It's a one-liner:
from pathlib import Path
filenames = Path("filelist.txt").read_text().splitlines()
Your error is printing lines instead of line in your for loop.
Use:
def im_ru(filelist1):
    with open(filelist1, 'r') as fp:
        # Strip the end-of-line character from each line manually
        lines = [line.strip() for line in fp]
    return lines
Usage:
>>> im_ru('data.txt')
['mango.txt', 'hello.csv']
Content of data.txt:
mango.txt
hello.csv
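If the list file may contain blank lines, a small variation can skip them. This is only a sketch: read_filenames is a hypothetical name, and the temporary file stands in for filelist.txt so the example is self-contained.

```python
import tempfile
from pathlib import Path


def read_filenames(list_path):
    """Return the non-empty, stripped lines of list_path as a list."""
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Demo with a temporary file standing in for filelist.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "filelist.txt"
    demo.write_text("mango.txt\n\nhello.csv\n", encoding="utf-8")
    names = read_filenames(demo)

print(names)
```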
I have an input.txt file and an output.txt file, which are passed as arguments to a Python script. I am reading the input file content using the readline() function. Before I update the current line and write it to the output file, I want to check some conditions on upcoming lines, as described below. Could you please provide some guidance? Thank you.
I want to replace part of the current line, starting at the 11th position, with an internal_account value (a random 16-digit number) if the line starts with 01065008 and the following conditions are met:
The 5th upcoming line starts with 06, and
the line that starts with 06 has the value USD starting at the 6th character.
input.txt
01065008200520P629658405456454
02BRYAN ANGUS 56425555643
0300000000000000000HUTS7858863
04PROSPECTUS ENCLOSYUSS574U623
05AS OF 05/13/20 45452366753
06Q47USDTFT 87845566765
The input.txt file has this pattern:
1st line will start with 01065008
2nd line will start with 02
...
6th line will start with 06
1st line will start with 01065008
...
What I have tried?
import random
import sys
infile=open(sys.argv[1], 'r')
lines=infile.readlines()
outfile=open(sys.argv[2], 'w')
internal_account = random.randint(1000000000000000,9999999999999999)
formattedStr = ''
for line in lines:
    if line[0:8] == '01065008':
        formattedStr='%s%s%s'%(line[0:10],internal_account,line[26:])
        outfile.write(formattedStr)
    else:
        outfile.write(line)
outfile.close()
To look ahead in the text file, read all the lines into a list, then use the line index to check later lines. Use the enumerate function to track the line index.
ss = '''
01065008200520P629658405456454
02BRYAN ANGUS 56425555643
0300000000000000000HUTS7858863
04PROSPECTUS ENCLOSYUSS574U623
05AS OF 05/13/20 45452366753
06Q47USDTFT 87845566765
'''.strip()
with open('input.txt','w') as f: f.write(ss)  # write data file
###############################
import random
import sys
infile=open('input.txt')  #open(sys.argv[1], 'r')
lines=infile.readlines()
outfile=open('output.txt','w')  #open(sys.argv[2], 'w')
internal_account = random.randint(1000000000000000,9999999999999999)
print('internal_account', internal_account, end='\n\n')
formattedStr = ''
for i,line in enumerate(lines):
    # Rewrite the line only if the line 5 positions ahead exists and matches
    if line[0:8] == '01065008' and i < len(lines)-5 and lines[i+5].startswith('06') and lines[i+5][5:8] == 'USD':
        formattedStr='%s%s%s'%(line[0:10],internal_account,line[26:])
        outfile.write(formattedStr)
        print(formattedStr.strip())
    else:
        outfile.write(line)
        print(line.strip())
outfile.close()
Output
internal_account 2371299802657810
010650082023712998026578106454
02BRYAN ANGUS 56425555643
0300000000000000000HUTS7858863
04PROSPECTUS ENCLOSYUSS574U623
05AS OF 05/13/20 45452366753
06Q47USDTFT 87845566765
You were not far from finding a good solution. Using enumerate on the input lines lets us use the index to check future lines, so you can verify whether all your conditions are fulfilled. You need to catch IndexError so that no exception is raised when there are not enough lines left.
Other minor modifications I made to your code:
Use a with statement to handle file opening so you don't have to close the file yourself.
Use startswith wherever you can to make the code clearer.
Use a power expression (10**15) instead of a long literal when you can to make the code clearer; random.randint needs integer arguments, so a float such as 1e15 is not suitable.
import random
import sys
# sys.argv[0] is the script name, so the two file arguments are at 1 and 2
input_file, output_file = sys.argv[1:3]
internal_account = random.randint(10**15, 10**16 - 1)
with open(input_file, "r") as stream:
    input_lines = stream.readlines()
with open(output_file, "w") as stream:
    for index, line in enumerate(input_lines):
        try:
            update_account = (
                line.startswith("01065008")
                and input_lines[index + 5].startswith("06")
                and input_lines[index + 5][5:8] == "USD"
            )
        except IndexError:
            update_account = False
        if update_account:
            line = line[0:10] + str(internal_account) + line[26:]
        stream.write(line)
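The look-ahead check can also be sketched as a pure function over a list of lines, using an explicit bounds check instead of catching IndexError. The function name update_accounts is hypothetical, and the account number is fixed (taken from the sample output shown earlier) rather than random so the result is reproducible.

```python
def update_accounts(lines, internal_account):
    """Return a new list of lines with qualifying 01 records rewritten."""
    updated = []
    for index, line in enumerate(lines):
        ahead = index + 5
        qualifies = (
            line.startswith("01065008")
            and ahead < len(lines)  # explicit bounds check for the look-ahead
            and lines[ahead].startswith("06")
            and lines[ahead][5:8] == "USD"
        )
        if qualifies:
            # Characters 11..26 of the line are replaced by the 16-digit account.
            line = line[:10] + str(internal_account) + line[26:]
        updated.append(line)
    return updated


records = [
    "01065008200520P629658405456454\n",
    "02BRYAN ANGUS 56425555643\n",
    "0300000000000000000HUTS7858863\n",
    "04PROSPECTUS ENCLOSYUSS574U623\n",
    "05AS OF 05/13/20 45452366753\n",
    "06Q47USDTFT 87845566765\n",
]
result = update_accounts(records, 2371299802657810)
print(result[0], end="")
```

Keeping the transformation separate from file handling makes it straightforward to test with short lists, including ones that are too short for the look-ahead.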
I am trying to extract data from a .txt file in Python. My goal is to capture the last occurrence of a certain word and show the next line, so I reverse the text and read it from the end. In this case, I search for the word 'MEC' and show the next line, but I capture all occurrences of the word, not just the first one found.
Any idea what I need to do?
Thanks!
This is what my code looks like:
import re
from file_read_backwards import FileReadBackwards
with FileReadBackwards("camdex.txt", encoding="utf-8") as file:
    for l in file:
        lines = l
        while line:
            if re.match('MEC', line):
                x = (file.readline())
                x2 = (x.strip('\n'))
                print(x2)
                break
            line = file.readline()
The txt file contains this:
MEC
29/35
MEC
28,29/35
My code prints this output:
28,29/35
29/35
My objective is to print only this:
28,29/35
This will give you the result as well. Loop through lines, add the matching lines to an array. Then print the last element.
import re
with open("data/camdex.txt", encoding="utf-8") as file:
    result = []
    for line in file:
        if re.match('MEC', line):
            # The value is on the line after the match
            x = file.readline()
            result.append(x.strip('\n'))
print(result[-1])
Get rid of the extra imports and overhead. Read your file normally, remembering the last line that qualifies.
last = None
with open("camdex.txt", encoding="utf-8") as file:
    for line in file:
        if line.startswith("MEC"):
            # The value sits on the line after "MEC"; keep the most recent one.
            last = next(file, "").rstrip("\n")
print(last)
If the file is very large, then reading backwards makes sense -- seeking to the end and backing up will be faster than reading to the end.
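A minimal sketch of that backwards approach, using only the standard library: seek to the end, read fixed-size blocks while moving toward the start, and stop at the first match found from the end. The function name last_value_after and the block size are assumptions for illustration, and the sketch assumes "\n" line endings and UTF-8 text; the temporary file stands in for camdex.txt.

```python
import os
import tempfile


def last_value_after(path, marker=b"MEC", block_size=4096):
    """Return the line after the last line starting with marker, or None."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        pos = fh.tell()
        tail = b""
        while pos > 0:
            step = min(block_size, pos)
            pos -= step
            fh.seek(pos)
            # Prepend the new block so tail always covers pos..end of file.
            tail = fh.read(step) + tail
            lines = tail.split(b"\n")
            # Unless we reached the file start, lines[0] may be a partial line.
            start = 0 if pos == 0 else 1
            for i in range(len(lines) - 2, start - 1, -1):
                if lines[i].startswith(marker):
                    return lines[i + 1].decode("utf-8").rstrip("\r")
    return None


# Demo with a temporary file standing in for camdex.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "camdex.txt")
    with open(demo_path, "wb") as fh:
        fh.write(b"MEC\n29/35\nMEC\n28,29/35\n")
    found = last_value_after(demo_path)
    found_small_blocks = last_value_after(demo_path, block_size=4)

print(found)
```

The sketch re-splits the accumulated tail on every block, which is simple but not optimal; for files where the match is near the end it still reads only a small part of the file.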
I want to find a line in file.txt and replace it:
CURRENT_BUILD = 111111
with
CURRENT_BUILD = 221111
using Python.
You can iterate through the lines in the file, saving them to a list and then output the list back to the file, line by line, replacing the line if necessary:
with open('file.txt') as f:
    lines = f.readlines()
with open('file.txt', 'w+') as f:
    for line in lines:
        if line == 'CURRENT_BUILD = 111111\n':
            f.write('CURRENT_BUILD = 221111\n')
        else:
            f.write(line)
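If the old build number is not known in advance, one option is to match on the key instead of the whole line. The following is a sketch only: set_build is a hypothetical helper, and it operates on the file's text as a string so it can be tried without touching a real file.

```python
import re

# Matches a whole CURRENT_BUILD line; "." does not match "\n", so the
# line ending is left in place.
BUILD_LINE = re.compile(r"^CURRENT_BUILD\s*=.*$", flags=re.MULTILINE)


def set_build(text, new_value):
    """Replace the whole CURRENT_BUILD line in text, whatever its old value."""
    # A function replacement avoids any backslash handling in the new value.
    return BUILD_LINE.sub(lambda match: "CURRENT_BUILD = " + str(new_value), text)


sample = "NAME = demo\nCURRENT_BUILD = 111111\nOTHER = 1\n"
print(set_build(sample, 221111))
```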
CURRENT_BUILD = '111111'
print (CURRENT_BUILD.replace('111111', '221111'))
The syntax of replace() is:
str.replace(old, new [, count])
old - old substring you want to replace
new - new substring which would replace the old substring
count (optional) - the number of times you want to replace the old substring with the new substring
If count is not specified, replace() method replaces all occurrences of the old substring with the new substring.
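A short illustration of the count argument, using a made-up string that contains the old value three times:

```python
text = "111111-111111-111111"

all_replaced = text.replace("111111", "221111")    # no count: every match
first_only = text.replace("111111", "221111", 1)   # count=1: first match only

print(all_replaced)
print(first_only)
```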
I'm not sure this is what you meant, since the question is unclear and you haven't shown what the .txt file contains, but this is how replace works anyway.
EDIT
If it is in-file text replacement you want, then this would be your best bet:
import fileinput
with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(text_to_search, replacement_text), end='')
Credit to jfs, from the post "How to search and replace text in a file using Python?"
>Sequence 1.1.1 ATGCGCGCGATAAGGCGCTA
ATATTATAGCGCGCGCGCGGATATATATATATATATATATT
>Sequence 1.2.2 ATATGCGCGCGCGCGCGGCG
ACCCCGCGCGCGCGCGGCGCGATATATATATATATATATATT
>Sequence 2.1.1 ATTCGCGCGAGTATAGCGGCG
Now, I would like to remove the last digit from each line that starts with '>'. For example, in the first line I would like to remove '.1' (rightmost), and in the second '.2', and then write the rest of the file to a new file. Thanks.
import fileinput
import re
for line in fileinput.input(inplace=True, backup='.bak'):
    line = line.rstrip()
    if line.startswith('>'):
        line = re.sub(r'\.\d$', '', line)
    print(line)
Many details can be changed depending on the processing you want, which you have not clearly communicated, but this is the general idea.
import re
trimmedtext = re.sub(r'(\d+\.\d+)\.\d', r'\1', text)
Should do it. Somewhat simpler than searching for start characters (and it won't affect your DNA chains).
if line.startswith('>Sequence'):
    line = line[:-2]  # trim 2 characters from the end of the string
or if there could be more than one digit after the period:
if line.startswith('>Sequence'):
    dot_pos = line.rfind('.')  # find position of rightmost period
    line = line[:dot_pos]  # truncate up to but not including the dot
Edit for if the sequence occurs on the same line as >Sequence:
If we know that there will always be only 1 digit to remove we can cut out the period and the digit with:
line = line[:13] + line[15:]
This is using a feature of Python called slices. The indexes are zero-based and exclusive for the end of the range so line[0:13] will give us the first 13 characters of line. Except that if we want to start at the beginning the 0 is optional so line[:13] does the same thing. Similarly line[15:] gives us the substring starting at character 15 to the end of the string.
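To make the slice arithmetic concrete, here is the first header line from the question (assuming it begins with '>' as described) split at the same indexes:

```python
line = ">Sequence 1.1.1 ATGCGCGCGATAAGGCGCTA"

head = line[:13]   # characters 0..12: everything before the last ".1"
tail = line[15:]   # character 15 to the end: the space and the sequence
trimmed = head + tail

print(repr(head))
print(repr(tail))
print(trimmed)
```

The fixed indexes only work while the header has exactly this layout; a longer name or multi-digit numbers would shift the positions.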
Map line.rsplit('.', 1)[0] over each line of the file that starts with '>' (this assumes nothing follows the last digit on the header line).
Here's a short script. Run it like: script [filename to clean]. Lots of error handling omitted.
It operates using generators, so it should work fine on huge files as well.
import sys
def clean_line(line):
    if line.startswith(">"):
        return line.rstrip()[:-2]
    else:
        return line.rstrip()
def clean(input):
    for line in input:
        yield clean_line(line)
if __name__ == "__main__":
    filename = sys.argv[1]
    print("Cleaning %s; output to %s.." % (filename, filename + ".clean"))
    input = None
    output = None
    try:
        input = open(filename, "r")
        output = open(filename + ".clean", "w")
        for line in clean(input):
            # Text mode translates "\n" to the platform line separator
            output.write(line + "\n")
            print(": " + line)
    finally:
        # Close whichever files were opened, whether or not an error occurred
        if input is not None:
            input.close()
        if output is not None:
            output.close()
import re
with open('in') as input_file, open('out', 'w') as output_file:
    for line in input_file:
        line = re.sub(r'(\d+[.]\d+)[.]\d+', r'\1', line)
        output_file.write(line)
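To see what that substitution does, the same pattern can be applied to a few lines from the question held in a list. The helper name trim_last_component is hypothetical, and the header lines are assumed to begin with '>'.

```python
import re

HEADER_VERSION = re.compile(r"(\d+[.]\d+)[.]\d+")


def trim_last_component(line):
    """Drop the third numeric component, e.g. 1.2.2 becomes 1.2."""
    # \1 refers to the first captured group (the part that is kept).
    return HEADER_VERSION.sub(r"\1", line)


sample = [
    ">Sequence 1.1.1 ATGCGCGCGATAAGGCGCTA",
    "ATATTATAGCGCGCGCGCGGATATATATATATATATATATT",
    ">Sequence 1.2.2 ATATGCGCGCGCGCGCGGCG",
]
cleaned = [trim_last_component(line) for line in sample]
for line in cleaned:
    print(line)
```

Lines without a dotted number, such as the DNA lines, pass through unchanged because the pattern does not match them.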