I spent all day trying to find a solution to this problem, and as I haven't found one yet I'm coming here for your help!
I'm writing a CSV file exporter in Python and I can't get it to output exactly what I want.
Let's say I have a list of strings containing several tutorial titles:
tutorials = ['tut1', 'tut2', 'tut3']
and I want the first line of the CSV file to look like this:
First Name, Surname, StudentID, tut1, tut2, tut3
Now, this is the code I wrote to do so:
tutStr = ' '
for i in tutorials:
    tutStr += i + ', '
tutStr = tutStr[0:-3]
with open('CSVexport.csv', 'wb') as csvFile:
    writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_NONE)
    writer.writerow(['First Name'+'Surname'+'ID number'+tutStr])
This gives me the error:
"writer.writerow(['First Name'+'Surname'+'ID number'+labString])
Error: need to escape, but no escapechar set"
Now, I know that the error has to do with csv.QUOTE_NONE and that I have to set an escapechar, but I really don't understand how I'm supposed to do so.
If I try without QUOTE_NONE, it automatically sets the quotechar to '"', which gives me this output:
"First Name, Surname, StudentID, tut1, tut2, tut3" --> don't want quotation marks!!!
Any idea?
If you are going to write a row with writerow, you need to pass it a list of fields, not a list containing one big comma-separated string (which ends up as a single cell). quoting=csv.QUOTE_MINIMAL then quotes only the fields that need quoting.
import csv

tutorials = ['tut1', 'tut2', 'tut3']
headers = ['First Name', 'Surname', 'ID number'] + tutorials
first_line = True
with open('CSVexport.csv', 'w') as csvFile:
    writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    if first_line:
        # Write the header once, one list item per column.
        writer.writerow(headers)
        first_line = False
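For illustration only, here is a minimal sketch of the same idea that also writes data rows. It assumes Python 3; the helper name build_csv and the sample student row are hypothetical, and it writes into an in-memory buffer so the result can be inspected without touching the disk.

```python
import csv
import io


def build_csv(tutorials, students):
    """Return CSV text: one header row, then one row per student."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    # Each list item becomes its own column; no manual comma joining.
    writer.writerow(['First Name', 'Surname', 'ID number'] + tutorials)
    for student in students:
        writer.writerow(student)
    return buffer.getvalue()


# Hypothetical sample data, not taken from the question.
text = build_csv(['tut1', 'tut2', 'tut3'],
                 [['Ada', 'Lovelace', '1001', 'A', 'B', 'A']])
print(text)
```

Because no field contains a comma, a quote character or a line break, QUOTE_MINIMAL leaves every field unquoted here.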
I have a delimited file in which some of the fields contain line termination characters. They can be LF or CR/LF.
The line terminators cause the records to split over multiple lines.
My objective is to read the file, remove the line termination characters, then write out a delimited file with quotes around the fields.
Sample input record:
444,2018-04-06,19:43:47,43762485,"Request processed"CR\LF
555,2018-04-30,19:17:56,43762485,"Added further note:LF
email customer a receipt" CR\LF
The first record is fine but the second has a LF (line feed) causing the record to fold.
import csv
with open(raw_data, 'r', newline='') as inp, open(csv_data, 'w') as out:
    csvreader = csv.reader(inp, delimiter=',', quotechar='"')
    for row in csvreader:
        print(str(row))
        out.write(str(row)[1:-1] + '\n')
My code nearly works but I don’t think it is correct.
The output I get is:
['444', '2020-04-06', '19:43:47', '344376882485', 'Request processed']
['555', '2020-04-30', '19:17:56', '344376882485', 'Added further note:\nemail customer a receipt']
I use the substring to remove the square brackets at the start and end of the line which I think is not the correct way.
Notice on the second record the new line character has been converted to \n. I would like to know how to get rid of that and also incorporate a csv writer to the code to place double quoting around the fields.
To remove the line terminators I tried replace, but it did not work.
(row.replace('\r', '').replace('\n', '') for row in csvreader)
I also tried to incorporate a csv writer but could not get it working with the list.
Any advice would be appreciated.
This snippet does what you want:
import csv

with open('raw_data.csv', 'r', newline='') as inp, open('csv_data.csv', 'w', newline='') as out:
    reader = csv.reader(inp, delimiter=',', quotechar='"')
    writer = csv.writer(out, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    for row in reader:
        fixed = [cell.replace('\n', '') for cell in row]
        writer.writerow(fixed)
Quoting all cells is handled by passing csv.QUOTE_ALL as the writer's "quoting" argument.
The line
fixed = [cell.replace('\n', '') for cell in row]
creates a new list of cells where embedded '\n' characters are replaced by the empty string.
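By default, csv.writer ends each row with '\r\n' on every platform. If you want to override this you can pass a lineterminator argument to the writer. Opening the output file with newline='' keeps Python from translating those line endings a second time.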
To me the original csv seems fine: it's normal to have embedded newlines ("soft line-breaks") inside quoted cells, and csv-aware applications such as spreadsheets will handle them correctly. However, they will look wrong in applications that don't understand csv formatting and so treat the embedded newlines as actual end-of-line characters.
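Since the question mentions both LF and CR/LF inside fields, here is a hedged variation on the snippet above that strips both characters. It is only a sketch: the function name flatten_newlines is made up, the sample text is adapted from the question, and it replaces each embedded line feed with a space (an assumption, so that the words on either side don't run together).

```python
import csv
import io


def flatten_newlines(raw_text):
    """Re-emit CSV text with line breaks removed from inside fields."""
    # newline='' leaves embedded line endings untouched for the csv reader.
    reader = csv.reader(io.StringIO(raw_text, newline=''))
    out = io.StringIO(newline='')
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator='\n')
    for row in reader:
        # Drop carriage returns and turn line feeds into a single space.
        writer.writerow([cell.replace('\r', '').replace('\n', ' ')
                         for cell in row])
    return out.getvalue()


sample = ('444,2018-04-06,19:43:47,43762485,"Request processed"\r\n'
          '555,2018-04-30,19:17:56,43762485,"Added further note:\n'
          'email customer a receipt"\r\n')
cleaned = flatten_newlines(sample)
print(cleaned)
```

When reading and writing real files, the same loop applies; only the two StringIO objects would be swapped for files opened with newline=''.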
I want to remove quoting altogether, but when I write to my csv file I'm getting an extra \ between the name and ip.
with open('test.csv', 'w') as csvfile:
    info = csv.writer(csvfile, quoting=csv.QUOTE_NONE, delimiter=',', escapechar='\\')
    for json_object in json_data:
        if len(json_object['names']) != 0:
            name = json_object['names'][0]
            ip = json_object['ip_address']
            combined = (name + ',' + ip)
            print combined
            info.writerow([combined])
this is what I'm seeing in my csv file:
ncs1.aol.net\,10.136.0.2
the formatting i'm trying to achieve is:
ncs1.aol.net,10.136.0.2
You don't need to build the row yourself; that's the point of using csv. Instead of creating combined, just use:
info.writerow([name, ip])
Currently you're writing a single item to the row, so the module escapes the , inside it and you get \,.
Alternatively, since combined already holds the finished line, you could skip the csv writer and write it to the file directly:
csvfile.write(combined + '\n')
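To make the difference concrete, here is a small sketch (Python 3, writing to an in-memory buffer) that runs the same writer settings on a pre-joined string and on separate fields. The helper name write_row is hypothetical; the writer arguments mirror the ones in the question.

```python
import csv
import io


def write_row(fields):
    """Write one row with QUOTE_NONE and a backslash escapechar."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONE, delimiter=',',
                        escapechar='\\', lineterminator='\n')
    writer.writerow(fields)
    return buffer.getvalue()


name, ip = 'ncs1.aol.net', '10.136.0.2'
joined = write_row([name + ',' + ip])  # one field that contains a comma
separate = write_row([name, ip])       # two fields, comma added by the writer

print(repr(joined))
print(repr(separate))
```

The first call has to escape the comma because it is part of the data; the second call never sees a comma in the data, so nothing needs escaping.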
I have a large CSV file (comma delimited). I would like to replace a few cells that contain the value "NIL" with an empty string "".
I tried this to find the keyword "NIL" and replace it with an empty string '', but it's giving me an empty csv file:
ifile = open('outfile', 'rb')
reader = csv.reader(ifile,delimiter='\t')
ofile = open('pp', 'wb')
writer = csv.writer(ofile, delimiter='\t')
findlist = ['NIL']
replacelist = [' ']
s = ifile.read()
for item, replacement in zip(findlist, replacelist):
    s = s.replace(item, replacement)
ofile.write(s)
Looking at your code, I think you should simply read the whole file:
with open("test.csv") as opened_file:
    data = opened_file.read()
Then use a regex to change every NIL to "" or " " and write the data back to the file.
import re
data = re.sub("NIL", " ", data)  # replaces every NIL in the data string with " "
NOTE: you can use any regex in place of NIL.
For more info, see the re module documentation.
EDIT 1: re.sub returns a new string, so you need to assign the result back to data.
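As one example of "any regex", a pattern with word boundaries avoids touching longer words that merely contain NIL. This is a sketch, not part of the original answer: the function name blank_nil and the sample text are made up, and it would still replace a standalone NIL that appears inside a longer cell such as "foo NIL bar".

```python
import re


def blank_nil(text, replacement=''):
    """Replace NIL only where it stands alone as a whole word."""
    # \b matches the boundary between a word character and a non-word one,
    # so 'NILIST' is left alone while ',NIL,' is matched.
    return re.sub(r'\bNIL\b', replacement, text)


sample = 'a,NIL,NILIST\nNIL,b,c\n'
result = blank_nil(sample)
print(result)
```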
A few tweaks and your example works. I edited your question to get rid of some indenting errors - assuming those were a cut/paste problem. The next problem is that you don't import csv ... but even though you create a reader and writer, you don't actually use them, so it could just be removed. So, opening in text instead of binary mode, we have
ifile = open('outfile') # 'outfile' is the input file...
ofile = open('pp', 'w')
findlist = ['NIL']
replacelist = [' ']
s = ifile.read()
for item, replacement in zip(findlist, replacelist):
    s = s.replace(item, replacement)
ofile.write(s)
We could add 'with' clauses and use a dict to make the replacements clearer:
replace_this = {'NIL': ' '}
with open('outfile') as ifile, open('pp', 'w') as ofile:
    s = ifile.read()
    for item, replacement in replace_this.items():
        s = s.replace(item, replacement)
    ofile.write(s)
The only real problem now is that it also changes things like "NILIST" to " IST". If this is a csv with all numbers except for "NIL", that's not a problem. But you could also use the csv module to change only the cells that are exactly "NIL".
import csv

with open('outfile') as ifile, open('pp', 'w') as ofile:
    reader = csv.reader(ifile)
    writer = csv.writer(ofile)
    for row in reader:
        # row is a list of columns. The following builds a new list
        # while checking and changing any column that is 'NIL'.
        writer.writerow([c if c.strip() != 'NIL' else ' '
                         for c in row])
I'm trying to write to a CSV file. The output should look like this:
meist,L, ,meist (30), meisten (95)
But it looks like this; the joined strings get wrapped in quotes:
meist,L, ,"meist (30), meisten (95)"
My code is as follows:
with open(dest_csv_file, 'w') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=['wortstamm', 'words'])
    writer.writeheader()
    for wortstamm in self.ausgabe:
        words = []
        for word in self.ausgabe[wortstamm]["w"]:
            words.append('' + word + ' (' + str(self.ausgabe[wortstamm]["w"][word]) + ')')
        words_string = ', '.join(words)
        writer.writerow({'wortstamm': wortstamm, 'words': words_string})
How do I get my code to write the desired output? What do I have to change? Thanks for your help!
If one of the fields you give to the csv writer contains a ,, it will be wrapped by quotes so that it will not be considered as multiple fields. If you have multiple fields, you should pass them in separately instead of joining them with a comma as in your code.
Since it seems that your fields come from a list, it may be easier to use csv.writer (which is given a list of fields for each row) instead of csv.DictWriter (which is given a dictionary where each field is specified by name).
Example:
writer = csv.writer(csvfile)
for wortstamm in self.ausgabe:
    words = ....
    writer.writerow([wortstamm] + words)
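A self-contained sketch of this approach might look as follows. The ausgabe dictionary here is hypothetical sample data shaped like the structure in the question (a plain variable rather than self.ausgabe), and the rows go to an in-memory buffer instead of a file.

```python
import csv
import io

# Hypothetical sample data: word stem -> {"w": {word: count}}.
ausgabe = {'meist': {'w': {'meist': 30, 'meisten': 95}}}

buffer = io.StringIO(newline='')
writer = csv.writer(buffer, lineterminator='\n')
for wortstamm, entry in ausgabe.items():
    # Build one field per word instead of joining them with ', '.
    words = ['{} ({})'.format(word, count)
             for word, count in entry['w'].items()]
    writer.writerow([wortstamm] + words)

output = buffer.getvalue()
print(output)
```

Note that the writer separates fields with a bare comma, so there is no space after each comma as there was in the desired output; rows may also end up with different numbers of columns.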
I use the code below to get the lat and long values from a text file whose header fields are separated by commas. Recently, though, I had a file where the header fields were separated by spaces instead, and when I ran the script on it, it gave me an error. Does anyone know how I can modify the script so that space-separated header fields can be parsed?
import csv

inFile = "file Path"
log = open(inFile, 'r')
csvReader = csv.reader(log)
header = csvReader.next()
latIndex = header.index("lat")
longIndex = header.index("long")
coordlist = []
for row in csvReader:
    lat = row[latIndex]
    long = row[longIndex]
    coordlist.append([lat,long])
print coordlist
https://docs.python.org/2/library/csv.html
csv.reader can take a delimiter as a parameter, so you could fix this simply by using csv.reader(log, delimiter=' ').
You haven't made clear whether you want both delimiters to be supported. But to get values separated by spaces, you should change this line:
csvReader = csv.reader(log)
to
csvReader = csv.reader(log, delimiter=' ')
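If both kinds of file need to be handled, one option is to make the delimiter a parameter. The following is a Python 3 sketch under that assumption (so it uses next(reader) rather than reader.next()); the function name read_coords and the sample data are hypothetical, and skipinitialspace=True is added so that extra spaces after a delimiter are ignored.

```python
import csv
import io


def read_coords(lines, delimiter=' '):
    """Return [lat, long] pairs from delimited text with a header row."""
    reader = csv.reader(lines, delimiter=delimiter, skipinitialspace=True)
    header = next(reader)
    lat_index = header.index('lat')
    long_index = header.index('long')
    return [[row[lat_index], row[long_index]] for row in reader]


# Hypothetical space-separated sample; a real call would pass an open file.
sample = io.StringIO('id lat long\n1 45.1 -93.2\n2 46.0 -94.5\n')
coords = read_coords(sample)
print(coords)
```

For a comma-separated file, the same function can be called with delimiter=','.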