change text in text file in particular manner - python

I have a text file that contains some lines like
moves={123:abc:567:mno}
Now I want to convert them to the form
moves={123:abc, 567:mno}
i.e. I want to replace ':' with ', ' wherever the ':' comes after a string and before a number, so that the line looks like a Python dictionary. I know how to change a particular line in the text file, like this:
with open("images/filename.txt", 'r+', encoding='Latin1') as fp:
    # read and store all lines into a list
    lines = fp.readlines()
    # move the file pointer to the beginning of the file
    fp.seek(0)
    # truncate the file
    fp.truncate()
    for line in lines:
        if line.startswith('moves='):
            ...  # do something with the line here
    fp.writelines(lines)
I cannot figure out how to replace the line so that it ends up in my desired format. Please don't delete or close the question; tell me in the comments how I should edit it and I will change it.
Thanks in advance.

If you have a line in your file (as a string) and it's exactly in that format then you can split the string on the colon separators then re-join with a comma in the appropriate place. For example:
s = 'moves={123:abc:567:mno}'
t = s.split(':')
print(':'.join(t[:2])+','+':'.join(t[2:]))
Output:
moves={123:abc,567:mno}

s = 'moves={123:abc:567:mno}'
t = s.split(':')
# join the pieces back together two at a time, separating the pairs with ', '
output = ''
for i in range(len(t) // 2):
    output += ':'.join(t[i*2:i*2 + 2]) + ', '
output = output.removesuffix(', ')  # str.removesuffix needs Python 3.9+
print(output)
Thanks to @Lancelot du Lac: by building on your answer I got the solution to my question.
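For lines with more than two pairs, a regular expression can express the rule from the question directly: replace only a colon that follows a letter and precedes a digit. This is a minimal sketch, assuming keys are always numeric and values always end in a letter; the function name separate_pairs is made up for illustration.

```python
import re

def separate_pairs(line):
    # Replace only the colons that sit between a letter and a digit,
    # i.e. the boundary between one value and the next key.
    return re.sub(r'(?<=[A-Za-z]):(?=\d)', ', ', line)

print(separate_pairs('moves={123:abc:567:mno}'))
```

If the values can end in something other than a letter, the lookbehind would need to be adjusted accordingly.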

Related

read keyword in txt file, and print add text + keyword

I read many keywords from a txt file into Python using f = open().
I want to add some text before each keyword.
For example:
(http://www.google.com/) + (abcdefg)
add text keywords imported
I have tried it, but I can't get the result I want.
f = open("C:/abc/abc.txt", 'r')
data = f.read()
print("http://www.google.com/" + data)
f.close()
I also tried it using a for loop, but I couldn't get it to work.
Please let me know the solution.
Many thanks.
Your original code has some flaws:
data = f.read() reads the whole file into a single string, so the text is added only once, in front of the first keyword. If you want to handle the lines one at a time, iterate over the file with a for loop;
data is a str variable, which may hold more than one word. Thus, you must split it into words, using data.split()
To solve your problem, read each line from the file, split the line into its words, then loop through the list of words and print the desired text followed by the word itself.
The corrected program is this:
f = open("C:/abc/abc.txt", 'r')
for data in f:
    words = data.split()
    for i in words:
        print("http://www.google.com/" + i)
f.close()
with open('text.txt','r') as f:
    for line in f:
        # strip the trailing newline so print() does not emit a blank line
        print("http://www.google.com/" + line.strip())
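The two approaches above can be combined into one small helper. This is only a sketch: the name prefix_keywords is hypothetical, and io.StringIO stands in for the opened file so the example runs without a file on disk.

```python
import io

def prefix_keywords(lines, prefix="http://www.google.com/"):
    # Split every line on whitespace so blank lines and trailing
    # newlines are ignored, then prepend the prefix to each keyword.
    return [prefix + word for line in lines for word in line.split()]

# io.StringIO stands in for the opened keyword file here.
sample = io.StringIO("abcdefg\nhijk lmn\n\n")
for url in prefix_keywords(sample):
    print(url)
```

Because a file object yields its lines when iterated, the same function should work when passed the result of open().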

extract the dimensions from the head lines of text file

Please see the attached image showing the format of the text file. I need to extract the dimensions of the data matrix given in the first line of the file, here 49 * 70 * 1 for the case shown in the image. Note that the length of the name "gd_fac" can vary. How can I extract these numbers as integers? I am using Python 3.6.
The specification is not very clear. I am assuming that the information you want will always be in the first line, and always in parentheses. Given that:
with open(filename) as infile:
    line = infile.readline()
string = line[line.find('(')+1:line.find(')')]
# convert each piece to int, since the question asks for integers
lst = [int(n) for n in string.split('x')]
This will create the list lst = [49, 70, 1].
What is happening here:
First I open the file (you will need to replace filename with the name of your file, as a string). The with ... as ... structure ensures that the file is closed after use. Then I read the first line. After that, I select only the part of that line that falls after the open paren ( and before the close paren ). Finally, I break the string into parts, with the character x as the separator. This creates a list that contains the values in the first line of the file which fall between the parentheses and are separated by x.
Since you have mentioned that the length of 'gd_fac' can vary, a good solution is to use a regular expression.
import re
with open("a.txt") as fh:
    for line in fh:
        if '(' in line and ')' in line:
            dimension = re.findall(r'.*\((.*)\)',line)[0]
            break
print(dimension)
Output:
49x70x1
What this does is look for "gd_fac"; if it is there, it removes all the unneeded characters
and leaves just what you want.
with open('test.txt', 'r') as infile:
    for line in infile:
        if "gd_fac" in line:
            line = line.replace("gd_fac", "")
            line = line.replace("x", "*")
            line = line.replace("(","")
            line = line.replace(")","")
            print(line)
            break
OUTPUT: "49*70*1"
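None of the regex and replace answers above converts the result to integers, which is what the question asks for. Here is a minimal sketch that does, assuming the header looks something like "gd_fac (49x70x1)"; the exact layout is only visible in the missing image, so that format and the function name parse_dimensions are assumptions.

```python
import re

def parse_dimensions(header_line):
    # Take whatever sits between the first pair of parentheses ...
    match = re.search(r'\(([^)]*)\)', header_line)
    if match is None:
        raise ValueError("no parenthesised dimensions in: " + header_line)
    # ... and pull out every run of digits, whatever the separator is.
    return tuple(int(n) for n in re.findall(r'\d+', match.group(1)))

print(parse_dimensions("gd_fac (49x70x1)"))
```

Since only the text inside the parentheses is inspected, the length of the name in front of it does not matter.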

Not able to frame text while adding a line to middle of file in python

My text.txt looks like this
abcd
xyzv
dead-hosts
-abcd.srini.com
-asdsfcd.srini.com
I want to insert a few lines after the "dead-hosts" line. I wrote a script to add the lines to the file. There is an extra space before each of the last lines, which is mandatory in my file, but after the new lines are added that space is removed, and I don't know how to keep it as it is.
Here is my script
Failvrlist = ['srini.com','srini1.com']
tmplst = []
with open('test.txt','r+') as fd:
    for line in fd:
        tmplst.append(line.strip())
    pos = tmplst.index('dead-hosts:')
    tmplst.insert(pos+1,"#extra comment ")
    for i in range(len(Failvrlist)):
        tmplst.insert(pos+2+i," - "+Failvrlist[i])
    tmplst.insert(pos+len(Failvrlist)+2,"\n")
    for i in xrange(len(tmplst)):
        fd.write("%s\n" %(tmplst[i]))
output is as below
abcd
xyzv
dead-hosts
#extra comment
- srini.com
- srini1.com
- abcd.srini.com
- asdsfcd.srini.com
If you look at the last two lines, the space got removed. Please advise.
Points:
In your code, pos = tmplst.index('dead-hosts:'), you are trying to find dead-hosts: with a colon. However, the input file you have given contains only "dead-hosts", with no colon. I am assuming the line is dead-hosts:
While reading the file into the list, use rstrip() instead of strip(). rstrip() removes only trailing whitespace, so the spaces at the start of each line are kept as they are.
Once you have read the file into the list, the code after that should be outside the with block that is used to open and read the file.
The flow of the code should be:
Open the file, read the lines into a list and close the file.
Modify the list by inserting values at a specific index.
Write the file again.
Code:
Failvrlist = ['srini.com','srini1.com']
tmplst = []
#Open file and read it
with open('result.txt','r+') as fd:
    for line in fd:
        tmplst.append(line.rstrip())
#Modify list
pos = tmplst.index('dead-hosts:')
tmplst.insert(pos+1,"#extra comment")
pos = tmplst.index('#extra comment')
a = 1
for i in Failvrlist:
    to_add = " -" + i
    tmplst.insert(pos+a,to_add)
    a+=1
#Write to file
with open('result.txt','w') as fd:
    for i in range(len(tmplst)):
        fd.write("%s\n" %(tmplst[i]))
Content of result.txt:
abcd
xyzv
dead-hosts:
#extra comment
 -srini.com
 -srini1.com
 -abcd.srini.com
 -asdsfcd.srini.com
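The read, modify, write flow described above can also be written with the insertion step as a small function on a list of lines. This is a sketch only: insert_after is a hypothetical helper, and the sample list stands in for the lines read from the file with rstrip().

```python
def insert_after(lines, marker, new_lines):
    # Find the marker while ignoring surrounding whitespace, but keep
    # every original line untouched so leading spaces survive.
    for index, line in enumerate(lines):
        if line.strip() == marker:
            return lines[:index + 1] + new_lines + lines[index + 1:]
    raise ValueError("marker not found: " + marker)

original = ["abcd", "xyzv", "dead-hosts:", " -abcd.srini.com"]
updated = insert_after(original, "dead-hosts:", ["#extra comment", " -srini.com"])
print("\n".join(updated))
```

Comparing line.strip() with the marker while storing the unstripped line is what keeps the leading space on the existing entries.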

Add a character to a string

I am a newbie with Python and I have a problem with a small script; I hope someone can give me a clue.
I have a file called "one.txt" which has the following 2 lines:
Hello
Goodbye
I want to add two characters ("/1") to the end of each line and write the result to another file called result.txt:
result.txt
Hello/1
Goodbye/1
I tried the following:
x=open("one.txt","r")
y=open("result.txt","w")
for line in x:
    line2= "/1" +line
    y.write(line2)
and I get:
/1Hello
/1Goodbye
If I change line2 to:
line2= line + "/1"
I get:
Hello
/1Goodbye
/1
which is also not correct.
Any clues?
You forgot to strip the newline after reading the line and to add it back in before writing.
Here's yet another version, using context managers for the files (so you don't forget to close them later); otherwise it's similar to the answer by @IgorPomaranskiy:
with open("one.txt") as x, open("result.txt", "w") as y:
    for line in x:
        y.write("{}\n".format(line.strip() + "/1"))
x = open("one.txt", "r")
y = open("result.txt", "w")
for line in x:
    y.write("{}/1\n".format(line.strip()))
x.close()
y.close()
When you read a line from a file, the string contains the newline character that indicated the end of the line. Your string isn't "Hello", it's "Hello\n". You need to remove that newline, create your output string, and add another newline when you write it back out.
for line in x:
    line = line.rstrip('\n')
    line2 = line + '/1\n'
    y.write(line2)
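To see the newline issue directly, it may help to print the repr() of what is read. The following is a small demonstration, with io.StringIO standing in for one.txt; the variable names wrong and right are just for illustration.

```python
import io

# io.StringIO stands in for one.txt in this sketch.
source = io.StringIO("Hello\nGoodbye\n")
lines = list(source)
print(repr(lines[0]))  # the newline is part of the string

# Appending directly puts "/1" after the newline, i.e. on the next line.
wrong = [line + "/1" for line in lines]
# Removing the newline first and adding it back afterwards fixes that.
right = [line.rstrip("\n") + "/1\n" for line in lines]
print(repr("".join(wrong)))
print(repr("".join(right)))
```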

.split() creating a blank line in python3

I am trying to convert a 'fastq' file into a tab-delimited file using python3.
Here is the input (lines 1-4 are one record that I need to print in tab-separated format). I am trying to read each record into a list object:
#SEQ_ID
GATTTGGGGTT
+
!''*((((***
#SEQ_ID
GATTTGGGGTT
+
!''*((((***
using this:
data = open('sample3.fq')
fq_record = data.read().replace('#', ',#').split(',')
for item in fq_record:
    print(item.replace('\n', '\t').split('\t'))
Output is:
['']
['#SEQ_ID', 'GATTTGGGGTT', '+', "!''*((((***", '']
['#SEQ_ID', 'GATTTGGGGTT', '+', "!''*((((***", '', '']
I am getting a blank line at the beginning of the output, and I do not understand why.
I am aware that this can be done in many other ways, but I need to figure out the reason as I am learning Python.
Thanks
When you replace # with ,#, you put a comma at the beginning of the string (since it starts with #). Then when you split on commas, there is nothing before the first comma, so this gives you an empty string in the split. What happens is basically like this:
>>> print(',x'.split(','))
['', 'x']
If you know your data always begins with #, you can just skip the empty record in your loop. Just do for item in fq_record[1:].
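A short sketch of both options, using a shortened stand-in for the file contents (the variable names are illustrative only):

```python
text = "#SEQ_ID\nGATTTGGGGTT\n".replace('#', ',#')
parts = text.split(',')
print(parts)  # the first element is the empty string before the leading comma

# Two ways to drop it: slice it off with parts[1:], or filter out empty pieces.
records = [part for part in parts if part]
print(records)
```

Filtering has the small advantage that it also works if the data does not start with the marker character.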
You can also go line-by-line without all the replacing:
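import io

# io.StringIO stands in for the opened file
fobj = io.StringIO("""#SEQ_ID
GATTTGGGGTT
+
!''*((((***
#SEQ_ID
GATTTGGGGTT
+
!''*((((***""")
data = []
entry = []
for raw_line in fobj:
    line = raw_line.strip()
    if line.startswith('#'):
        # a new record starts: store the previous one, if any
        if entry:
            data.append(entry)
        entry = []
    entry.append(line)
# store the last record
data.append(entry)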
data looks like this:
[['#SEQ_ID', 'GATTTGGGGTT', '+', "!''*((((***"],
 ['#SEQ_ID', 'GATTTGGGGTT', '+', "!''*((((***"]]
Thank you all for your answers. As a beginner, my main problem was the occurrence of a blank line upon .split(',') which I have now understood conceptually. So my first useful program in python is here:
# this script converts a .fastq file in to .fasta format
import sys
# Usage statement:
print('\nUsage: fq2fasta.py input-file output-file\n=========================================\n\n')
# define a function for fasta formating
def format_fasta(name, sequence):
    fasta_string = '>' + name + "\n" + sequence + '\n'
    return fasta_string
# open the file for reading
data = open(sys.argv[1])
# open the file for writing
fasta = open(sys.argv[2], 'wt')
# feed all fastq records in to a list
fq_records = data.read().replace('#', ',#').split(',')
# iterate through list objects
for item in fq_records[1:]: # this is to avoid the first line which is created as blank by .split() function
    line = item.replace('\n', '\t').split('\t')
    name = line[0]
    sequence = line[1]
    fasta.write(format_fasta(name, sequence))
fasta.close()
Other things suggested in the answers would be more clear to me as I learn more.
Thanks again.
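One caveat with splitting on a marker character is that the same character (or a comma) may also occur inside the quality line, which would cut a record in two. A possible alternative, sketched here under the assumption that every record is exactly four lines long, is to read the file four lines at a time. The name read_records is hypothetical, and unlike the script above this sketch drops the one-character marker from the name.

```python
import io
from itertools import islice

def read_records(handle):
    # Yield (name, sequence) from each group of four lines.
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return
        yield record[0][1:], record[1]  # drop the one-character marker

# io.StringIO stands in for the opened input file.
sample = io.StringIO("#SEQ_ID\nGATTTGGGGTT\n+\n!''*((((***\n" * 2)
for name, sequence in read_records(sample):
    print(">" + name + "\n" + sequence)
```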
