Remove Space from Particular line In Textfile Python - python

I have textfiles that have the date stored on line 7 of each file, formatted as such:
Date: 1233PM 14 MAY 00
I would like to search through each file and get the new line 7 to be formatted as such:
Date: 1233PM 14 MAY 2000
So, basically, I just need to stick a '20' in front of the last two digits in line seven.
Probably not the most difficult problem, but I have been having difficulty as textfile.readlines() reads everything into the first (textfile[0]) position.

You can read the whole file, change the line you need, then save it again:
with open('file_name.txt') as f:
    arc = f.read().splitlines()  # splits on '\r', '\n' and '\r\n' alike
# Do what you want with the 7th line, i.e. arc[6]
with open('file_name.txt', 'w') as new_arc:
    for line in arc:
        new_arc.write(line + '\n')

Maybe this:
with open(filename, 'r') as f:
    lines = f.readlines()
with open(filename, 'w') as f:
    for idx, line in enumerate(lines):
        if idx == 6:  # line 7, because the index is zero-based
            vals = line.split()
            if len(vals[-1]) == 2:
                vals[-1] = '20' + vals[-1]
            line = ' '.join(vals) + '\n'  # split() dropped the newline
        f.write(line)

Try this:
# open file for reading and writing
f = open("file.txt", "r+")
lines = f.readlines()
# modify line 7: drop the line ending, then put "20" before the last two digits
date_line = lines[6].rstrip("\r\n")
lines[6] = date_line[:-2] + "20" + date_line[-2:] + "\n"
# return file pointer to the top so we can rewrite the file
f.seek(0)
f.truncate()
# write the file with new content
f.write(''.join(lines))
f.close()
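
The answers above find the year by slicing or by splitting on whitespace. A stricter variant could use a regular expression, sketched here under the assumptions that the date line ends in a two-digit year and that every year belongs to the 2000s, as in the question. The helper name expand_year and the in-memory list of lines are illustrative only and do not come from any of the answers.

```python
import re

# Matches a two-digit year at the end of the line, preceded by whitespace,
# so an already expanded four-digit year is left alone.
_TWO_DIGIT_YEAR = re.compile(r"(?<=\s)(\d{2})(\s*)$")


def expand_year(lines, index=6):
    """Return a copy of lines with '20' put before the 2-digit year on lines[index]."""
    result = list(lines)
    result[index] = _TWO_DIGIT_YEAR.sub(r"20\1\2", result[index])
    return result
```

Because the pattern only matches a two-digit group that is preceded by whitespace, running the helper a second time on the same lines should leave them unchanged.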

Related

How to get first letter of each line in python?

Here is how I open and print my txt file:
f = open('data.txt', 'r')
print(f.read())
The output shows:
['Cat\n','Dog\n','Cat\n','Dog\n'........]
But I would like to get this:
['C\n','D\n','C\n','D\n'........]
First you'll want to open the file in read mode (r flag in open), then you can iterate through the file object with a for loop to read each line one at a time. Lastly, you want to access the first element of each line at index 0 to get the first letter.
first_letters = []
with open('data.txt', 'r') as f:
    for line in f:
        first_letters.append(line[0])
print(first_letters)
If you want to keep the newline character in each string, you can change the append line above to:
first_letters.append(line[0] + '\n')
f = open("data.txt", "r")
for x in f:
    print(x[0])
f.close()
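
Both snippets above take line[0], which on a blank line is the newline character itself. A small sketch that skips blank lines and ignores leading spaces follows; the function name first_letters and the sample list are made up for illustration, and any iterable of lines (including an open file) can be passed in.

```python
def first_letters(lines):
    """Return the first non-whitespace character of each non-blank line."""
    letters = []
    for line in lines:
        stripped = line.strip()
        if stripped:  # skip blank lines instead of collecting '\n'
            letters.append(stripped[0])
    return letters


print(first_letters(["Cat\n", "Dog\n", "\n", "  Cat\n"]))
```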

Python function to multiply individual lines in text file

I am trying to write a function that can take every individual line in a txt file and multiply that line by 2 so that each integer in the text file is doubled. So far I was able to get the code to print. However, when I added the code (reading & reading_int) to convert the strings to integers the function is now not working. There are no errors in the code to tell me what I am doing wrong. I am not sure what is wrong with reading and reading_int that is making my function not work.
def mult_num3():
    data=[]
    w = open('file3.txt', 'r')
    with w as f:
        reading = f.read()
        reading_int = [int(x) for x in reading.split()]
        for line in f:
            currentline = line[:-1]
            data.append(currentline)
        for i in data:
            w.write(int(i)*2)
    w.close()
file3.txt:
1
2
3
4
5
6
7
8
9
10
Desired output:
2
4
6
8
10
12
14
16
18
20
Problems with original code:
def mult_num3():
    data=[]
    w = open('file3.txt', 'r')            # only opened for reading, not writing
    with w as f:
        reading = f.read()                # reads whole file
        reading_int = [int(x) for x in reading.split()]  # unused variable
        for line in f:                    # nothing left to read after f.read()
            currentline = line[:-1]       # not executed
            data.append(currentline)      # not executed
        for i in data:                    # data is empty, so...
            w.write(int(i)*2)             # not executed, can't write an int if it did
                                          # and file isn't writable.
    w.close()                             # not necessary, 'with' will close it
Note that int() ignores leading and trailing whitespace so no need for .split() if only one number per line, and a format string (f-string) can format each line as needed by converting and doubling the value and adding a newline.
with open('file3.txt', 'r') as f:
    data = [f'{int(line)*2}\n' for line in f]
with open('file3.txt', 'w') as f:
    f.writelines(data)
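
The reason the for loop in the question's code never ran can be reproduced without touching a real file. In this sketch io.StringIO stands in for the opened file3.txt (an assumption made only so that the snippet is self-contained); it shows that read() leaves the position at the end and that seek(0) rewinds it.

```python
import io

# io.StringIO stands in for the opened file so the sketch runs without file3.txt.
f = io.StringIO("1\n2\n3\n")

whole = f.read()          # consumes everything up to the end of the file
after_read = list(f)      # nothing left, so a loop over f would not run

f.seek(0)                 # rewind to the beginning
after_seek = list(f)      # the lines are available again

# int() tolerates the trailing newline on each line
doubled = [f"{int(line) * 2}\n" for line in after_seek]
```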
I added a try/except to handle data that is not an integer. I don't know your data, but maybe it helps you.
def mult_num3():
    with open('file3.txt', 'r') as infile, open('script_out.txt', 'w') as outfile:
        for line in infile:
            for value in line.split():
                try:
                    outfile.write(str(int(value) * 2) + " ")
                except ValueError:
                    # int() raises ValueError for anything that is not an integer
                    outfile.write("(" + value + ": is not an integer) ")
            outfile.write("\n")

take a string from a file in which it occupies multiple lines

I have a problem with a Python program. In this program I have to take strings from a file and save them to a list. The problem is that in this file some strings span several lines.
The file named 'ft1.txt' is structured like this:
'''
home
wo
rk
'''
sec
urity
'''
inform
atio
n
'''
Consequently, opening the file and calling f.read() gives:
" \n\nhome\nwo\nrk\n\nsec\nurity\n\ninform\natio\nn ".
I execute the following code:
with open('ft1.txt', 'r') as f:  # open the file
    list_strin = f.read().split('\n\n')  # save the strings in a list
I want the output [homework, security, information].
But the actual output is [home\nwo\nrk, sec\nurity, inform\natio\nn]
How can I remove the special character "\n" in individual strings and merge them correctly?
You have \n in string. Remove it :-)
list_strin = [x.replace('\n', '') for x in f.read().strip().split('\n\n')]
readline solution:
res = []
s = ''
with open('ft1.txt', 'r') as f:
    line = f.readline()
    while line:
        line = line.strip()
        if line == '':
            if s:
                res.append(s)
            s = ''
        else:
            s += line
        line = f.readline()
if s:  # keep the last string when the file does not end with a blank line
    res.append(s)
print(res)
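
The same idea can also be written with a regular expression that treats any run of blank lines as a separator. This is only a sketch: the function name join_wrapped is made up, and it assumes that the strings themselves contain no spaces, because all whitespace inside a block is removed.

```python
import re


def join_wrapped(text):
    """Split text on blank lines and glue each block's pieces together."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # block.split() breaks on newlines (and any other whitespace) inside a block
    return ["".join(block.split()) for block in blocks if block.strip()]


sample = " \n\nhome\nwo\nrk\n\nsec\nurity\n\ninform\natio\nn "
print(join_wrapped(sample))
```

For a real file, the text argument would be the result of f.read().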

Changing a text file and making a bigger text file in python

I have a tab-separated text file like this example:
infile:
chr1 + 1071396 1271396 LOC
chr12 + 1101483 1121483 MIR200B
I want to divide the difference between columns 3 and 4 of each row in infile into 100 equal parts, so that every input row produces 100 rows in a new file named newfile.
The final tab-separated file should have 6 columns: the first 5 columns as in infile (with columns 3 and 4 replaced by the start and end of each part), and a 6th column of the form (5th column)_part(number), where number runs from 1 to 100.
This is the expected output file:
expected output:
chr1 + 1071396 1073396 LOC LOC_part1
chr1 + 1073396 1075396 LOC LOC_part2
.
.
.
chr1 + 1269396 1271396 LOC LOC_part100
chr12 + 1101483 1101683 MIR200B MIR200B_part1
chr12 + 1101683 1101883 MIR200B MIR200B_part2
.
.
.
chr12 + 1121283 1121483 MIR200B MIR200B_part100
I wrote the following code to get the expected output but it does not return what I expect.
file = open('infile.txt', 'rb')
cont = []
for line in file:
    cont.append(line)
newfile = []
for i in cont:
    percent = (i[3]-i[2])/100
    for j in percent:
        newfile.append(i[0], i[1], i[2], i[2]+percent, i[4], i[4]_'part'percent[j])
with open('output.txt', 'w') as f:
    for i in newfile:
        for j in i:
            f.write(i + '\n')
Do you know how to fix the problem?
Try this:
cont = []
with open('infile.txt', 'r') as file:
    for line in file:
        if line.strip():  # ignore blank lines
            cont.append(line.split())  # split on whitespace, dropping empty fields
newfile = []
for i in cont:
    # integer step; assumes the span is divisible by 100, as in the example
    diff = (int(i[3]) - int(i[2])) // 100
    left = int(i[2])
    right = left + diff
    for j in range(100):
        newfile.append((i[0], i[1], str(left), str(right), i[4], i[4] + '_part' + str(j + 1)))
        left = right
        right = right + diff
with open('output.txt', 'w') as f:
    for row in newfile:
        f.write('\t'.join(row) + '\n')
In your code, for i in cont loops over the raw lines, so i[2] is a single character rather than a column.
To fix that, I split each line on whitespace and convert the coordinates to integers.
Here are some suggestions:
When you open the file, open it as a text file, not a binary file:
open('infile.txt', 'r')
When you read it line by line, strip the newline character at the end with strip(). Then split each line on tabs with split('\t'), so that you get a list of strings instead of one long string:
line.strip().split('\t')
Now you have:
file = open('infile.txt', 'r')
cont = []
for line in file:
    cont.append(line.strip().split('\t'))
Now cont is a list of lists, where each inner list contains the tab-separated fields of one line, e.g.
cont[1][0] == 'chr12'
You will probably be able to take it from here.
Others have answered your question with respect to your own code, I thought I would leave my attempt at solving your problem here.
import os
directory = "C:/Users/DELL/Desktop/"
filename = "infile.txt"
path = os.path.join(directory, filename)
out_path = os.path.join(directory, "outfile.txt")
with open(path, "r") as f_in, open(out_path, "w") as f_out:  # open input and output files
    for line in f_in:
        contents = line.rstrip().split("\t")  # split line into a list of fields
        start = int(contents[2])
        diff = (int(contents[3]) - start) / 100
        for i in range(100):
            # each part covers one hundredth of the original interval
            temp = f"{contents[0]}\t{contents[1]}\t{int(start + diff*i)}\t{int(start + diff*(i+1))}\t{contents[4]}\t{contents[4]}_part{i+1}"
            f_out.write(temp + "\n")
This code doesn't follow Python style conventions well (excessively long lines, for example), but it works. The line temp = ... uses f-strings to format the output string conveniently.
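
The interval arithmetic can be separated from the file handling so that it is easy to check against the expected output in the question. The following is a sketch only; the names split_interval and expand_row are hypothetical, and integer arithmetic is used so that the last part ends exactly at the original end coordinate even when the span is not divisible by 100.

```python
def split_interval(start, end, parts=100):
    """Yield (part_number, left, right) for equal sub-intervals of [start, end]."""
    span = end - start
    for i in range(parts):
        # Integer arithmetic keeps the boundaries exact; the last right edge is end.
        left = start + span * i // parts
        right = start + span * (i + 1) // parts
        yield i + 1, left, right


def expand_row(fields, parts=100):
    """Turn one 5-column row into `parts` 6-column rows (lists of strings)."""
    chrom, strand, start, end, name = fields
    return [
        [chrom, strand, str(left), str(right), name, f"{name}_part{n}"]
        for n, left, right in split_interval(int(start), int(end), parts)
    ]


rows = expand_row("chr1\t+\t1071396\t1271396\tLOC".split("\t"))
print("\t".join(rows[0]))
print("\t".join(rows[-1]))
```

Each returned row can be written to the output file with "\t".join(row) followed by a newline.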

How do I append characters on either side of certain words on every line in a file in Python?

Say I have a file my_file, and I want to search for a certain word x on every line of the file, and if the word exists, attach my variable y to the left and right side of the word. Then I want to replace the old line with the new, modified line in my_new_file. How do I do this? So far I have:
output = open(my_new_file, "w")
for line in open(my_file):
    if (" " + x + "") in line:
You can try this:
y = "someword"
x = "target_string"
with open('filename.txt') as src:
    lines = [i.strip('\n') for i in src]
final_lines = ["{}{}{}".format(y, i, y) if x in i else i for i in lines]
with open(my_new_file, "w") as f:
    for i in final_lines:
        f.write("{}\n".format(i))
with open('inputfile.txt', 'r') as infile:
    with open('outfile.txt', 'w') as outfile:
        for line in infile.readlines():
            outfile.write(line.replace('string', y + 'string' + y))
Try This:
with open("my_file", "r") as my_file:
    raw_data = my_file.read()
# READ YOUR FILE
new_data = raw_data.split("\n")
for line in new_data:
    if "sd" in line:
        my_new_line = "y" + line + "y"
        raw_data = raw_data.replace(line, my_new_line)
print(raw_data)
It's tough to replace a line in a file while reading it, for the same reason that it's tough to safely modify a list as you iterate over it.
It's much better to read through the file, collect a list of lines, then overwrite the original. If the file is particularly large (such that it would be infeasible to hold it all in memory at once), you can write to disk twice.
import tempfile
y = "***"
your_word = "Whatever you're filtering by"
with tempfile.TemporaryFile(mode="w+") as tmpf:
    with open(my_file, 'r') as f:
        for line in f:
            if your_word in line:
                line = f"{y}{line.strip()}{y}\n"
            tmpf.write(line)  # write to the temp file
    tmpf.seek(0)  # move back to the beginning of the tempfile
    with open(my_file, 'w') as f:
        for line in tmpf:  # reading from tempfile now
            f.write(line)  # write to the reopened file, not to the path string
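
Several of the snippets above put y around the whole line. If the goal is to put it around the word only, as the question describes, one possible sketch uses re.sub with word boundaries. The function name wrap_word is hypothetical, and the \b anchors assume that the word starts and ends with a letter, digit or underscore.

```python
import re


def wrap_word(line, word, marker):
    """Put marker on both sides of every whole-word occurrence of word in line."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    # A function is used as the replacement so backslashes in marker stay literal.
    return pattern.sub(lambda m: marker + m.group(0) + marker, line)


print(wrap_word("the cat sat on the cat mat", "cat", "**"))
print(wrap_word("concatenate", "cat", "**"))
```

Applied to each line while copying from my_file to my_new_file, this leaves lines without the word untouched and does not alter words that merely contain it.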
