How to create a text file in python without csv.writer? - python

I need a comma-separated text file with a .txt extension.
"a,b,c"
I used csv.writer to create a CSV file and then changed the extension, but another program would not process the data. I tried both "wb" and "w" modes.
F = open(Fn, 'w')
w = csv.writer(F)
w.writerow(sym)
F.close()
opened with notepad ---These are the complete files.
Their file: created using their gui used three symbols
PDCO,ICUI,DVA
my file : created using python
PDCO,ICUI,DVA
Tested: opening their file worked; opening my file failed.
After simply opening my file in Notepad and saving it again, my file worked too.
Works= 'PDCO,ICUI,DVA'
Fails= 'PDCO,ICUI,DVA\r\r\n'
Edit: writing the txt file without csv.writer:
sym = ['MHS','MRK','AIG']
with open(r'C:\filename.txt', 'w') as F:  # also try 'w'
    for s in sym[:-1]:      # separate all but the last
        F.write(s + ',')    # symbols with commas
    F.write(sym[-1])        # end with the last symbol

To me, it looks like you don't know exactly what input format your third-party application expects. If a .csv isn't recognized, the expected format might be something else.
Did you try changing the delimiter from ';' to ','?
import csv
spamWriter = csv.writer(open('eggs.csv', 'wb'), delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
Take a look at the documentation for Python's csv module.

I think the problem is your file write mode, as described in the question "CSV file written with Python has blank lines between each row".
If you create your csv file like
csv.writer(open('myfile.csv', 'w'))
csv.writer ends its lines in '\r\n', and Python's text file handling (on Windows machines) then converts '\n' to '\r\n', resulting in lines ending in '\r\r\n'. Many programs will choke on this; Notepad recognizes it as a problem and strips the extra '\r' out.
If you use
csv.writer(open('myfile.csv', 'wb'))
it produces the expected '\r\n' line ending, which should work as desired.
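The advice above is for Python 2. As a minimal sketch for Python 3, where the csv documentation asks for text mode with newline='', the doubling can be reproduced on any platform by passing newline='\r\n', which applies the same translation that Windows text mode applies by default. The helper name write_row and the temporary file names are made up for illustration.

```python
import csv
import os
import tempfile


def write_row(path, row, newline):
    """Write one CSV row, then return the raw bytes that landed on disk."""
    # newline is passed straight to open(); '' disables newline translation.
    with open(path, "w", newline=newline) as f:
        csv.writer(f).writerow(row)
    with open(path, "rb") as f:
        return f.read()


tmpdir = tempfile.mkdtemp()
symbols = ["PDCO", "ICUI", "DVA"]

# newline="\r\n" translates every "\n" to "\r\n", like Windows text mode,
# so the "\r\n" that csv.writer emits becomes "\r\r\n".
doubled = write_row(os.path.join(tmpdir, "doubled.txt"), symbols, "\r\n")

# newline="" leaves the "\r\n" written by csv.writer untouched.
clean = write_row(os.path.join(tmpdir, "clean.txt"), symbols, "")

print(repr(doubled))
print(repr(clean))
```

Comparing the two printed byte strings makes the extra '\r' visible without needing a Windows machine.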
Edit: @senderle has a good point; try the following:
goodf = open('file_that_works.txt', 'rb')
print repr(goodf.read(100))
badf = open('file_that_fails.txt', 'rb')
print repr(badf.read(100))
Paste the results of that here, so we can see how the two compare byte-for-byte.

Try this:
with open('file_that_works.csv', 'rb') as testfile:  # file is automatically closed
    d = csv.Sniffer().sniff(testfile.read(1024))     # at the end of the with block

with open(Fn, 'wb') as F:  # also try 'w'
    w = csv.writer(F, dialect=d)
    w.writerow(sym)
To explain further: this looks at a sample of a working .csv file and deduces its format. Then it uses that format to write a new .csv file that, hopefully, will not have to be resaved in notepad.
Edit: if the program you're using doesn't accept multi-line input (?!) then don't use csv. Just do something like this:
syms = ['JAGHS','GJKDGJ','GJDFAJ']
with open('filename.txt', 'wb') as F:
    for s in syms[:-1]:     # separate all but the last
        F.write(s + ',')    # symbols with commas
    F.write(syms[-1])       # end with the last symbol
Or more tersely:
with open('filename.txt', 'wb') as F:
    F.write(','.join(syms))
Also, check different file extensions (i.e. .txt, .csv, etc) to make sure that's not the problem. If this program chokes on a newline, then anything is possible.
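The snippets above write str objects to a file opened with 'wb', which works in Python 2 only. A rough Python 3 equivalent, assuming the consuming program wants a single line with no trailing newline, might look like this; the temporary path is a stand-in for the real file name.

```python
import os
import tempfile

syms = ["JAGHS", "GJKDGJ", "GJDFAJ"]

# Hypothetical output location; replace with the path the other program reads.
path = os.path.join(tempfile.mkdtemp(), "filename.txt")

# Text mode with newline="" so nothing is translated or appended behind our back.
with open(path, "w", newline="") as f:
    f.write(",".join(syms))

# Read the file back in binary mode to confirm exactly which bytes were written.
with open(path, "rb") as f:
    raw = f.read()

print(repr(raw))
```

Reading the result back in binary mode is the same byte-for-byte check suggested earlier in the thread.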

So, I save as text file.
Now, create my own txt file with python.
What are the exact differences between their file and your file? Exact.

I suspect that @Hugh's comment is correct that it's an encoding issue.
When you do a Save As in notepad, what's selected in the Encoding dropdown? If you select different encodings do some or all of those fail to be opened by the 3rd party program?

Related

Can't open csv file in python without opening it in excel

I have a .csv file generated by a program. When I try to open it with the following code, the output makes no sense, even though the same code works fine on a csv file that was not generated by that program.
g = 'datos/1.81/IR20211103_2275.csv'
f = open(g, "r", newline = "")
f = f.readlines()
print(f)
The output of the code looks like this
['ÿþA\x00l\x00l\x00 \x00t\x00e\x00m\x00p\x00e\x00r\x00a\x00t\x00u\x00r\x00e\x00s\x00 \x00i\x00n\x00 \x00°\x00F\x00.\x00\r',
'\x00\n',
'\x00\r',
'\x00\n',
'\x00D\x00:\x00\\\x00O\x00n\x00e\x00D\x00r\x00i\x00v\x00e\x00\\\x00M\x00A\x00E\x00S\x00T\x00R\x00I\x00A\x00 \x00I\x00M\x00E\x00C\x00\\\x00T\x00e\x00s\x00i\x00s\x00\\\x00d\x00a\x00t\x00o\x00s\x00\\\x001\x00.\x008\x001\x00\\\x00I\x00R\x002\x000\x002\x001\x001\x001\x000\x003\x00_\x002\x002\x007\x005\x00.\x00i\x00s\x002\x00\r',
However, when I first open the file with excel and save it as a .csv (replacing the original with the .csv from excel), the output is as expected, like this:
['All temperatures in °F.\r\n',
'\r\n',
'D:\\OneDrive\\MAESTRIA IMEC\\Tesis\\datos\\1.81\\IR20211103_2275.is2\r\n',
'\r\n',
'",1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,"\r\n',
I have also tried csv.reader() and it doesn't work either.
Does anyone know what's going on and how can I solve it? How can I open my .csv without opening and saving it from excel first? The source program is SmartView from Fluke which reads thermal image file .is2 and converts it into a .csv file
Thank you very much
Your file is encoded as UTF-16 (little-endian byte order). You can specify the file encoding with the encoding argument of open(); the Python documentation for the codecs module lists the standard encodings and their names.
I'd also recommend not using .readlines(), as it keeps the trailing newline characters. You can read the whole file into a string (using .read()) and apply str.splitlines() to split it into a list of lines. Alternatively, you can consume the file line by line and call str.rstrip() to cut the trailing newline characters.
Final code:
filename = "datos/1.81/IR20211103_2275.csv"
with open(filename, encoding="utf16") as f:
    lines = f.read().splitlines()
    # OR (use one approach, not both: after f.read() the file is exhausted)
    # lines = [line.rstrip() for line in f]
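To make the diagnosis concrete, here is a small self-contained sketch that writes a UTF-16 sample (standing in for the SmartView export, whose exact contents are not known here), checks for the byte order mark that shows up as 'ÿþ' in the question's output, and reads the file back with the right encoding. The sample text and file name are invented for illustration.

```python
import os
import tempfile

# Hypothetical sample file standing in for the program-generated .csv.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", encoding="utf-16", newline="") as f:
    f.write("All temperatures in °F.\r\n\r\n1,2,3\r\n")

# A UTF-16 file normally starts with a byte order mark (BOM):
# b"\xff\xfe" for little-endian, b"\xfe\xff" for big-endian.
with open(path, "rb") as f:
    head = f.read(2)
has_utf16_bom = head in (b"\xff\xfe", b"\xfe\xff")

# Decoding with encoding="utf-16" consumes the BOM and picks the byte order.
with open(path, encoding="utf-16") as f:
    lines = f.read().splitlines()

print(has_utf16_bom)
print(lines)
```

Checking the first two bytes before choosing an encoding is only a heuristic, but it is a quick way to tell this situation apart from an ordinary 8-bit file.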
g = 'datos/1.81/IR20211103_2275.csv'
f = open(g, "r", newline = "",encoding="utf-16")
f = f.readlines()
print(f)
Try this; it may help.

Delete empty row in XML File

When creating an XML file, it always creates blank lines for me.
This code looks like this:
for row in tbody.find_elements_by_xpath('./tr'):
    itemsEmployee = row.find_elements_by_xpath('./td')
    fileWriter.writerow([itemsEmployee[1].text, itemsEmployee[5].text, itemsEmployee[2].text, itemsEmployee[3].text,
                         itemsEmployee[4].text, itemsEmployee[6].text, itemsEmployee[7].text, itemsEmployee[8].text])
First of all, I don't know why I get blank lines. But anyway.
I now want to delete the empty lines and save the XML. (In a new file)
My attempt was as follows:
def deleteEmptyRowsInXML():
    input = open('../data/employees_csv.csv', 'rb')
    output = open('../data/employees.csv', 'wb')
    writer = csv.writer(output)
    for row in csv.reader(input):
        if row:
            writer.writerow(row)
    input.close()
    os.remove('../data/employees_csv.csv')
    output.close()
I would also like a solution in the same file.
Get the error:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
in this line:
for row in csv.reader(input):
A csv writer expects its underlying file to be opened with newline=''. The rationale is that RFC 4180 specifies '\r\n' as the end of line for a csv file, regardless of the system on which it is generated. So the csv module explicitly writes the '\r\n' itself, and if you forget newline='' on Windows, text mode translates the '\n' again and you get an empty line after each row.
So it should be: output = open('../data/employees.csv', 'w', newline='')
The error message says that the file was probably not opened in text mode.
And in fact you opened it in binary mode: "rb" means "read the file in binary mode", and "wb" means "write the file in binary mode".
So change to this:
input = open('../data/employees_csv.csv', 'r', newline='')
output = open('../data/employees.csv', 'w', newline='')
It's possible that you'll hit other errors too. For the moment I can't say more, because we don't have a reproducible example, but changing the lines I pointed out may be enough.
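Putting both answers together, a Python 3 version of the clean-up step could look roughly like the sketch below. The function name drop_empty_rows and the sample data are invented for illustration; the input deliberately uses the '\r\r\n' line endings that a missing newline='' produces on Windows.

```python
import csv
import os
import tempfile


def drop_empty_rows(src_path, dst_path):
    """Copy a CSV file, skipping rows that the csv reader parses as empty."""
    # Both files are opened in text mode with newline="", as the csv module expects.
    with open(src_path, "r", newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if row:  # an empty list means a blank line
                writer.writerow(row)


# Hypothetical sample: every row is followed by a stray blank line.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "employees_raw.csv")
dst = os.path.join(tmpdir, "employees.csv")
with open(src, "wb") as f:
    f.write(b"Ann,Sales\r\r\nBob,IT\r\r\n")

drop_empty_rows(src, dst)

with open(dst, "rb") as f:
    cleaned = f.read()
print(repr(cleaned))
```

Opening the original writer with newline='' in the first place avoids the blank lines entirely, so this clean-up pass should only be needed for files that already exist.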

How to read a file and write it entirely to several text files using Python?

I want to read a text file and write its entire contents to two other text files. Later I will append other, different data to each of these two files.
The problem is that the loaded file is only written to the first file, and no data from that loaded file is written to the second file.
The code I am using:
fin = open("File_Read", 'r')
fout1 = open("File_Write1", 'w')
fout2 = open("File_Write2", 'w')
fout1.write(fin.read())
fout2.write(fin.read()) #Nothing is written here!
fin.close()
fout1.close()
fout2.close()
What is happening and what is the solution?
I prefer using open instead of with open.
Thanks.
The first fin.read() reads the whole file and leaves the file position at the end, so the next fin.read() continues from there and returns an empty string. To solve this, I would simply read once and reuse the result:
text_fin = fin.read()
fout1.write(text_fin)
fout2.write(text_fin)
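The behaviour is easy to confirm in isolation. The sketch below uses a throwaway file (its name and contents are made up) and also shows fin.seek(0), which rewinds the file so that it can be read again; that is an alternative to storing the text in a variable, not something the answer above proposed.

```python
import os
import tempfile

# Hypothetical input file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "File_Read")
with open(path, "w") as f:
    f.write("line 1\nline 2\n")

fin = open(path, "r")
first = fin.read()    # the whole file; the position is now at the end
second = fin.read()   # nothing left to read, so this is an empty string
fin.seek(0)           # move the position back to the start
third = fin.read()    # the whole file again
fin.close()

print(repr(first), repr(second), repr(third))
```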
fin = open("test.txt", 'r')
data = fin.read()
fin.close()
fout1 = open("test2.txt", 'w')
fout1.write(data)
fout1.close()
fout2 = open("test3.txt", 'w')
fout2.write(data)
fout2.close()
N.B. with open is the safest and best way, but at the very least you should close each file as soon as it is no longer needed.
You can try iterating through your original file line by line and appending each line to both files. You are running into the problem because the second fin.read() has nothing left to read, so it passes an empty string to write().
fin = open("File_Read",'r')
fout1 = open("File_Write1",'a') #append permissions for line-by-line writing
fout2 = open("File_Write2",'a') #append permissions for line-by-line writing
for lines in fin:
    fout1.write(lines)
    fout2.write(lines)
fin.close()
fout1.close()
fout2.close()
*** NOTE: Not the most efficient solution.

CSV Writer truncates characters in sequence in Excel 2013

I have an interesting situation with Python's csv module. I have a function that takes specific lines from a text file and writes them to csv file:
import os
import csv
def csv_save_use(textfile, csvfile):
    with open(textfile, "rb") as text:
        for line in text:
            line=line.strip()
            with open(csvfile, "ab") as f:
                if line.startswith("# Online_Resource"):
                    write = csv.writer(f, dialect='excel',
                                       delimiter='\t',
                                       lineterminator="\t",
                                       )
                    write.writerow([line.lstrip("# ")])
                if line.startswith("##"):
                    write = csv.writer(f, dialect='excel',
                                       delimiter='\t',
                                       lineterminator="\t",
                                       )
                    write.writerow([line.lstrip("# ")])
Here is a sample of some strings from the original text file:
# Online_Resource: https://www.ncdc.noaa.gov/
## Corg% percent organic carbon,,,%,,paleoceanography,,,N
What is really bizarre is the final csv file looks good, except the characters in the first column only (those with the # originally) partially "overwrite" each other when I try to manually delete some characters from the cell.
Oddly enough, too, there seems to be no formula to how the characters get jumbled each time I try to delete some after running the script. I tried encoding the csv file as unicode to no avail.
Thanks.
You've selected excel dialect but you overrode it with weird parameters:
You're using TAB as separator and line terminator, which creates a 1-line CSV file. Close enough to "truncated" to me
Also quotechar shouldn't be a space.
This conveyed a nice side-effect as you noted: the csv module actually splits the lines according to commas!
The code is inefficient and error-prone: you're opening the file in append mode inside the loop and creating a new csv writer each time. Both are better done once, outside the loop.
Also, comma split must be done by hand now. So even better: use csv module to read the file as well. My fix proposal for your routine:
import os
import csv
def csv_save_use(textfile, csvfile):
    with open(textfile, "rU") as text, open(csvfile, "wb") as f:
        write = csv.writer(f, dialect='excel',
                           delimiter='\t')
        reader = csv.reader(text, delimiter=",")
        for row in reader:
            if not row:
                continue  # skip possible empty rows
            if row[0].startswith("# Online_Resource"):
                write.writerow([row[0].lstrip("# ")])
            elif row[0].startswith("##"):
                write.writerow([row[0].lstrip("# ")]+row[1:])  # write row, stripping the first item from hashes
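Note that the file isn't displayed properly in Excel unless you remove delimiter='\t' (which reverts to the default comma).
Also note that for Python 3 you need to replace open(csvfile, "wb") as f with open(csvfile, "w", newline='') as f, and open(textfile, "rU") with open(textfile, "r", newline='') (the "U" mode was removed in Python 3.11).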
Here's how the output looks now (note that the empty cells are there because there are several commas in a row).
More problems:
line=line.strip(" ") removes leading and trailing spaces. It doesn't remove \r or \n ... try line=line.strip() which removes leading and trailing whitespace
you get all your line including commas in one cell because you haven't split it up somehow ... like using a csv.reader instance. See here:
https://docs.python.org/2/library/csv.html#csv.reader
A non-default argument to str.lstrip is treated as a set of characters to be removed, so '## ' has the same effect as '# '. To get rid of only the unwanted prefix, test if guff.startswith('## ') and then do guff = guff[3:].
It is not very clear at all what the sentence containing "bizarre" means. We need to see exactly what is in the output csv file. Create a small test file with 3 records (1) with '# Online_Resource' (2) with "## " (3) none of the above, run your code, and show the output, like this:
print repr(open('testout.csv', 'rb').read())
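The point about str.lstrip in the list above is easy to miss, so here is a small illustration. The sample line is invented: it has an extra '#' that is part of the data, to show where stripping a character set goes further than removing a fixed prefix.

```python
# Hypothetical line whose data happens to begin with a '#'.
line = "## #Corg% percent organic carbon"

# lstrip treats "# " as a set of characters and removes every leading
# '#' or ' ', including the '#' that belongs to the data.
stripped = line.lstrip("# ")

# Removing a fixed prefix by slicing leaves the rest of the line intact.
prefix = "## "
sliced = line[len(prefix):] if line.startswith(prefix) else line

# The order and repetition of characters in the argument do not matter.
same_effect = line.lstrip("## ") == line.lstrip("# ")

print(repr(stripped))
print(repr(sliced))
print(same_effect)
```

On Python 3.9 and later, str.removeprefix does the same job as the slicing line.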

Raw string for variables in python?

I have seen several similar posts on this but nothing has solved my problem.
I am reading a list of numbers with backslashes and writing them to a .csv. Obviously the backslashes are causing problems.
addr = "6253\342\200\2236387"
with open("output.csv", 'a') as w:
write = writer(w)
write.writerow([addr])
I found that using r"6253\342\200\2236387" gave me exactly what I want for the output, but since I am reading my input from a file I can't use a raw string. I tried .encode('string-escape'), but that gave me 6253\xe2\x80\x936387 as output, which is definitely not what I want. unicode-escape gave me an error. Any thoughts?
The r in front of a string is only for defining a string. If you're reading data from a file, it's already 'raw'. You shouldn't have to do anything special when reading in your data.
Note that if your data is not plain ascii, you may need to decode it or read it in binary. For example, if the data is utf-8, you can open the file like this before reading:
import codecs
f = codecs.open("test", "r", "utf-8")
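A quick way to convince yourself of this, sketched for Python 3 with a throwaway file whose name is made up: write the literal backslash text to disk, read it back, and compare it with the raw-string literal. Escape sequences are only interpreted in string literals in source code, never in data read from a file.

```python
import os
import tempfile

# Hypothetical input file containing literal backslashes.
path = os.path.join(tempfile.mkdtemp(), "addresses.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(r"6253\342\200\2236387" + "\n")

# Reading performs no escape processing; the backslashes arrive as-is.
with open(path, encoding="utf-8") as f:
    addr = f.readline().rstrip("\n")

same_as_raw_literal = addr == r"6253\342\200\2236387"
backslashes = addr.count("\\")

print(same_as_raw_literal, backslashes)
```

If the backslashes in the question were produced by a literal in the source code rather than by file input, that would explain why the raw-string prefix appeared to make a difference.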
Text file contains...
1234\4567\7890
41\5432\345\6789
Code:
with open('c:/tmp/numbers.csv', 'ab') as w:
    f = open(textfilepath)
    wr = csv.writer(w)
    for line in f:
        line = line.strip()
        wr.writerow([line])
    f.close()
This produced a csv with whole lines in a column. Maybe use 'ab' rather than 'a' as your file open type. I was getting extra blank records in my csv when using just 'a'.
I created this a while back. It helps you write to a csv file.
def write2csv(fileName,theData):
    theFile = open(fileName+'.csv', 'a')
    wr = csv.writer(theFile, delimiter = ',', quoting=csv.QUOTE_MINIMAL)
    wr.writerow(theData)
    theFile.close()  # close the file so the row is flushed to disk
