I am unable to handle the case where the user forgot to end the file with a newline.
I am trying to append a new row to a CSV file in Python using the code below:
with open(local_file_location, 'a') as file:
    writer = csv.writer(file)
    writer.writerow(row)
But if the file does not end with a newline, the first appended row lands on the same line as the existing last row, and I am not sure how to handle that.
Ex-
Input-
1,2,3,4 <no-newline>
add row - {a,b,c,d}
Output-
1,2,3,4,a,b,c,d
but in this case I want the output to be:
1,2,3,4
a,b,c,d
Note: it should simply append as usual when the user's file already ends with a newline, which the current program does perfectly.
Let me know what I can do.
You can check if a file ends with a newline.
with open(local_file_location, 'r') as file:
    data = file.read()
if data.endswith('\n'):
    pass  # file already ends with a newline: append as usual
else:
    pass  # write '\n' first, then append the row
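Reading the whole file just to look at its last character can be slow for large files. Here is a minimal sketch that inspects only the final byte instead. The helper name append_row and the demo file are made up for illustration, and lineterminator="\n" is my assumption (the csv module's default is "\r\n"):

```python
import csv
import os
import tempfile

def append_row(path, row):
    """Append `row` to the CSV at `path`, adding a line break first if the file lacks one."""
    needs_newline = False
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as f:
            f.seek(-1, os.SEEK_END)          # jump to the last byte
            needs_newline = f.read(1) not in (b"\n", b"\r")
    with open(path, "a", newline="") as f:
        if needs_newline:
            f.write("\n")                    # terminate the dangling last line
        csv.writer(f, lineterminator="\n").writerow(row)

# Demo on a throwaway file whose last line has no trailing newline.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", newline="") as f:
    f.write("1,2,3,4")
append_row(demo_path, ["a", "b", "c", "d"])
append_row(demo_path, ["e", "f", "g", "h"])
with open(demo_path, newline="") as f:
    content = f.read()
print(content)
```

The first call adds the missing line break before writing; the second call finds one already there and just appends.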
I am trying to create a .bed file after searching through DNA sequences for two regular expressions. Ideally, I'd like to generate a tab-separated file which contains the sequence description, the start location of the first regex and the end location of the second regex. I know that the regex section works, it's just creating the \t separated file I am struggling with.
I was hoping that I could open/create a file and simply print a new line for each iteration of the for loop that contains this information, like so:
with open("Mimp_hits.bed", "a+") as file_object:
    for line in file_object:
        print(f'{sequence.description}\t{h.start()}\t{h_rc.end()}')
    file_object.close()
But this doesn't seem to work (it creates an empty file). I have also tried file_object.write, but that creates an empty file too.
This is all of the code I have including searching for the regexes:
import re, sys
from Bio import SeqIO
from Bio.SeqRecord import SeqRecord
infile = sys.argv[1]
for sequence in SeqIO.parse(infile, "fasta"):
    hit = re.finditer(r"CAGTGGG..GCAA[TA]AA", str(sequence.seq))
    mimp_length = 400
    for h in hit:
        h_start = h.start()
        hit_rc = re.finditer(r"TT[TA]TTGC..CCCACTG", str(sequence.seq))
        for h_rc in hit_rc:
            h_rc_end = h_rc.end()
            length = h_rc_end - h_start
            if length > 0:
                if length < mimp_length:
                    with open("Mimp_hits.bed", "a+") as file_object:
                        for line in file_object:
                            print(sequence.description, h.start(), h_rc.end())
                        file_object.close()
This is the desired output:
Focub_II5_mimp_1__contig_1.16(656599:656809) 2 208
Focub_II5_mimp_2__contig_1.47(41315:41540) 2 223
Focub_II5_mimp_3__contig_1.65(13656:13882) 2 224
Focub_II5_mimp_4__contig_1.70(61591:61809) 2 216
This is example input:
>Focub_II5_mimp_1__contig_1.16(656599:656809)
TACAGTGGGATGCAAAAAGTATTCGCAGGTGTGTAGAGAGATTTGTTGCTCGGAAGCTAGTTAGGTGTAGCTTGTCAGGTTCTCAGTACCCTATATTACACCGAGATCAGCGGGATAATCTAGTCTCGAGTACATAAGCTAAGTTAAGCTACTAACTAGCGCAGCTGACACAACTTACACACCTGCAAATACTTTTTGCATCCCACTGTA
>Focub_II5_mimp_2__contig_1.47(41315:41540)
TACAGTGGGAGGCAATAAGTATGAATACCGGGCGTGTATTGTTTTCTGCCGCTAGCCCATTTTAACAGCTAGAGTGTGTATATTAACCTCACACATAGCTATCTCTTATACTAATTGGTTAGGGAAAACCTCTAACCAGGATTAGGAGTCAACATAGCTTGTTTTAGGCTAAGAGGTGTGTGTCAGTACACCAAAGGGTATTCATACTTATTGCCCCCCACTGTA
>Focub_II5_mimp_3__contig_1.65(13656:13882)
TACAGTGGGAGGCAATAAGTATGAATACCGGGCGTGTATTGTTTTTCTGCCGCTAGCCTATTTTAATAGTTAGAGTGTGCATATTAACCTCACACATAGCTATCTTATATACTAATCGGTTAGGGAAAACCTCTAACCAGGATTAGGAGTCAACATAGCTTCTTTTAGGCTAAGAGGTGTGTGTCAGTACACCAAAGGGTATTCATACTTATTGCCCCCCACTGTA
>Focub_II5_mimp_4__contig_1.70(61591:61809)
TACAGTGGGATGCAATAAGTTTGAATGCAGGCTGAAGTACCAGCTGTTGTAATCTAGCTCCTGTATACAACGCTTTAGCTTGATAAAGTAAGCGCTAAGCTGTATCAGGCAAAAGGCTATCCCGATTGGGGTATTGCTACGTAGGGAACTGGTCTTACCTTGGTTAGTCAGTGAATGTGTACTTGAGTTTGGATTCAAACTTATTGCATCCCACTGTA
Is anybody able to help?
Thank you :)
To write a line to a file, you would do something like this:
with open("file.txt", "a") as f:
    print("new line", file=f)
If you want the fields tab-separated, you can also pass sep="\t". This is why Python 3 made print a function: it lets you use the sep, end, file, and flush keyword arguments. :)
Opening a file for appending puts the file pointer at the end of the file. Writing therefore doesn't overwrite any data (it is added after the existing content), and iterating over the file (or otherwise reading from it) yields nothing, because you are already at the end of the file.
So instead of iterating over the lines of the file, just write the single line to it:
with open("Mimp_hits.bed", "a") as file_object:
    print(sequence.description, h.start(), h_rc.end(), sep="\t", file=file_object)
Also consider opening the file once, before the outer loop, since opening it once and writing many times is more efficient than reopening it for every hit. The with block closes the file automatically, so there is no need to call close() explicitly.
You are trying to open the file in "a+" mode, and loop over lines from it (which will not find anything because the file is positioned at the end when you do that). In any case, if this is an output file only, then you would open it in "a" mode to append to it.
Probably you just want to open the file once for appending, and inside the with statement, do your main loop, using file_object.write(...) when you want to actually append strings to the file. Note that there is no need for file_object.close() when using this with construct.
with open("Mimp_hits.bed", "a") as file_object:
    for sequence in SeqIO.parse(infile, "fasta"):
        # ... etc per original code ...
                    if length < mimp_length:
                        file_object.write("{}\t{}\t{}\n".format(
                            sequence.description, h.start(), h_rc.end()))
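For anyone who wants to try the "open once, write inside the loop" pattern without Biopython installed, here is a rough self-contained sketch. The function names find_pairs and write_bed, the toy record, and the io.StringIO stand-in for the output file are all my own assumptions for illustration; in the real script you would pass open("Mimp_hits.bed", "a") and the (description, sequence) pairs from SeqIO.parse instead.

```python
import io
import re

FORWARD = re.compile(r"CAGTGGG..GCAA[TA]AA")
REVERSE = re.compile(r"TT[TA]TTGC..CCCACTG")

def find_pairs(description, seq, max_length=400):
    """Yield (description, start, end) for each forward/reverse hit pair shorter than max_length."""
    for h in FORWARD.finditer(seq):
        for h_rc in REVERSE.finditer(seq):
            length = h_rc.end() - h.start()
            if 0 < length < max_length:
                yield description, h.start(), h_rc.end()

def write_bed(records, out):
    """Write one tab-separated line per hit to the already-open file object `out`."""
    for description, seq in records:
        for hit in find_pairs(description, seq):
            print(*hit, sep="\t", file=out)

# Toy record: forward motif at index 2, reverse motif ending at index 38.
records = [("seq1", "AA" + "CAGTGGGATGCAAAAA" + "CCCC" + "TTTTTGCATCCCACTG" + "GG")]
buffer = io.StringIO()   # stands in for the output file
write_bed(records, buffer)
bed_text = buffer.getvalue()
print(bed_text, end="")
```

The output file is opened (here: created) exactly once, and nothing ever tries to read from it.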
Edited because it seems as though I was too vague or didn't show enough research. My apologies (newbie here).
I am trying to read a CSV file and use each line as a value that a script iterates over while writing to an API. There's no header row in my CSV. Later I'll add a regex search and assign the data that follows the match to a variable for the script to iterate over, if that makes sense.
CSV Contents:
Type1, test.com
Type2, name.exe
Type3, sample.com
Basic premise of what I want to do in Python:
Read from CSV
Script runs with each line from the CSV as a variable (say Variable1).
The script iterates until it is out of values in the csv list, then terminates.
An example for the script syntax could be anything simple...
#!/usr/bin/python
import requests
import csv
reader = csv.reader(open('test.csv'))
for row in reader:
    echo line-item
until the script runs out of Variables to print, then terminates. Where I'm struggling is the syntax on how to take a line then assign it to a variable for the for loop.
I hope that makes sense!
You should take a look at the csv module.
Here's how you would use it:
import csv
file = csv.reader(open('file.csv'), delimiter=',')
for line in file:
    print(line)
This produces the following output:
['Type1', ' test.com']
['Type2', ' name.exe']
['Type3', ' sample.com']
It separates your lines into lists of strings at the occurrences of the delimiter you specify (a comma in this case).
If you want to read the file line by line (not as a CSV), you can just use:
with open('file.csv') as file:
    for line in file:
        print(line)
Using the with statement makes sure that the file is closed after we are done reading its contents.
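To get each field into its own variable for the loop body, you can unpack the row directly in the for statement. A small sketch, assuming every row has exactly two fields; process is a hypothetical placeholder for whatever API call the script makes, and io.StringIO stands in for the real file:

```python
import csv
import io

# Stand-in for open('test.csv'); same contents as the question's CSV.
sample = io.StringIO("Type1, test.com\nType2, name.exe\nType3, sample.com\n")

def process(kind, value):
    """Hypothetical placeholder for the real per-row work (e.g. an API request)."""
    return f"{kind} -> {value}"

results = []
# skipinitialspace=True drops the space that follows each comma.
for kind, value in csv.reader(sample, skipinitialspace=True):
    results.append(process(kind, value))

print(results)
```

The loop ends on its own once the reader runs out of rows, so no explicit termination check is needed.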
I have two programs in Python. One writes a customer's information to a CSV. The other accesses it. When the first has written it, I can open the CSV file (in Excel) and see that it has been written correctly. However for the other program to access the new data in the CSV file I have to manually open it and save it (in Excel) otherwise it doesn't work. Does anyone know why this may be?
Edit:
This writes to it (from first program):
f = open('details.csv', 'at', newline=''); csv_w = csv.writer(f)
csv_w.writerow(clientList)
f.close()
And this reads it (second program):
f = open('details.csv', 'rt', newline=''); csv_f = csv.reader(f)
for row in csv_f:
    name.append(row[0])
I get this error when trying to append row[0] to a list.
Traceback (most recent call last):
  File "C:\Users\Dan\Desktop\Garden Centre\work.py", line 8, in <module>
    name.append(row[0])
IndexError: list index out of range
I have seen a number of such problems stemming from different platform line endings. Under Python 2, you might try the "universal line endings" file reading mode:
with open('data.csv', 'rU') as f:
    for row in csv.reader(f):
        print row
Excel often uses the old Mac (\r) and Windows (\r\n) line-ending conventions, which can get in the csv module's way, especially on a Unix or Mac platform where Python expects the Unix line ending (\n). Python 3 is generally smarter about this (and about other file and string encoding issues), so it usually doesn't need a special mode.
I found an answer after hours of trying. In Excel, each cell in an 'empty' row contains '', padded out to the largest number of items in any row. Python doesn't write it like that to the CSV; instead, an empty row holds only a single item containing None. So as the reader iterated through the rows, there was no first item to add to the list on the empty rows.
I had to manually add an extra '' to the list that is written by the first program.
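An alternative to padding the written data is to make the reading side tolerate blank lines. csv.reader returns an empty list for a blank line, which is what makes row[0] raise IndexError. A minimal sketch, with invented sample data and io.StringIO standing in for details.csv:

```python
import csv
import io

# Stand-in for details.csv with a blank line between two records.
sample = io.StringIO("Alice,1 High Street\n\nBob,2 Low Road\n")

names = []
for row in csv.reader(sample):
    if not row:          # blank line -> empty list, nothing to index
        continue
    names.append(row[0])

print(names)
```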
I've noticed a really weird bug and didn't know if anyone else had seen this / knows how to stop it.
I'm writing to a CSV file using this:
def write_to_csv_file(self, object, string):
    with open('data_model_1.csv', 'a') as f:
        writer = csv.writer(f)
        writer.writerow([object, string])
and then write to the file:
self.write_to_csv_file(self.result['outputLabel'], string)
If I open the CSV file to look at the results, the next time I write to the file, it will start in column 3 of the last line (column 1 is object, column 2 is string).
If I run self.write_to_csv_file(self.result['outputLabel'], string) multiple times without manually opening the file (obviously I open the file in the Python script), everything is fine.
It's only when I open the file manually that I get the issue of starting in column 3.
Any thoughts on how to fix this?
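You're opening the file in append mode, so the data is appended to the end of the file. If the file doesn't end in a newline, the new row gets concatenated onto the last line. Try writing a newline to the file before appending new rows (note that this leaves a blank line if the file already ends in a newline, so check the last character first if that matters):
with open("data_model_1.csv", "a") as f:
    f.write("\n")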
I have a text file that contains key value pairs separated by a tab like this:
KEY\tVALUE
I have opened this file in append mode (a+) so I can both read and write. Now it may happen that a particular key has more than one value. In that case I want to go to that key's line and write the next value beside the original one, separated by some delimiter (a comma, for example).
Here is what I wish to do:
import io
ft = io.open("test.txt",'a+')
ft.seek(0)
for line in ft:
if (line.split('\t')[0] == "querykey"):
ft.write(unicode("nextvalue"));#Write the another key value beside the original one
Now there are two problems with it:
1. I will iterate through the file to see on which line the key is present (is there a faster way?)
2. I will write a string to the end of that line.
I would be grateful if I can get help with the second point.
The write function always writes at the end of the file. How should I write to the end of a specific line? I have searched and have not found very clear answers on how to do that.
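You can read the whole file, make your edit in memory, and write the edited content back to the file.
with open('test.txt') as f:
    lines = f.read().splitlines()  # lines without their trailing '\n'
for i, line in enumerate(lines):
    if line.split('\t')[0] == "querykey":
        lines[i] = line + ',newkey'  # store the change back into the list
with open('test.txt', 'w') as f:  # reopen for writing (truncates the file)
    f.write('\n'.join(lines) + '\n')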
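On the "is there a faster way" part: if you need to update several keys, one option is to load the file into a dict once, so each lookup no longer scans the lines, and write everything back at the end. This is only a sketch; load_pairs, save_pairs, the comma delimiter, and the demo file are my assumptions, and it relies on dicts preserving insertion order (Python 3.7+).

```python
import os
import tempfile

def load_pairs(path):
    """Read KEY\\tVALUE[,VALUE...] lines into a dict of key -> list of values."""
    pairs = {}
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue                      # ignore blank lines
            key, _, values = line.partition("\t")
            pairs[key] = values.split(",") if values else []
    return pairs

def save_pairs(path, pairs):
    """Write the dict back, one key per line, values joined by commas."""
    with open(path, "w") as f:
        for key, values in pairs.items():
            f.write(key + "\t" + ",".join(values) + "\n")

# Demo on a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(demo_path, "w") as f:
    f.write("otherkey\t1\nquerykey\tvalue\n")

pairs = load_pairs(demo_path)
pairs.setdefault("querykey", []).append("nextvalue")   # direct lookup, no scan
save_pairs(demo_path, pairs)

with open(demo_path) as f:
    updated = f.read()
print(updated, end="")
```

The whole file still has to be rewritten, because a plain text file cannot grow in the middle; the dict only speeds up finding the key.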