I'm trying to write to a CSV file. The output should look like this:
meist,L, ,meist (30), meisten (95)
But it looks like this instead; the joined strings get wrapped in double quotes:
meist,L, ,"meist (30), meisten (95)"
My code is the following one:
with open(dest_csv_file, 'w') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=['wortstamm', 'words'])
    writer.writeheader()
    for wortstamm in self.ausgabe:
        words = []
        for word in self.ausgabe[wortstamm]["w"]:
            words.append('' + word + ' (' + str(self.ausgabe[wortstamm]["w"][word]) + ')')
        words_string = ', '.join(words)
        writer.writerow({'wortstamm': wortstamm, 'words': words_string})
How do I get my code to write the desired output? What do I have to change? Thanks for your help!
If one of the fields you give to the csv writer contains a comma, it will be wrapped in quotes so that it is not read back as multiple fields. If you have multiple fields, you should pass them in separately instead of joining them with a comma as in your code.
Since it seems that your fields come from a list, it may be easier to use csv.writer (which is given a list of fields for each row) instead of csv.DictWriter (which is given a dictionary where each field is specified by name).
Example:
writer = csv.writer(csvfile)
for wortstamm in self.ausgabe:
    words = ....
    writer.writerow([wortstamm] + words)
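A minimal sketch of the difference, assuming a small hypothetical list of word labels and using io.StringIO in place of a real file so the output can be inspected directly:

```python
import csv
import io

# Hypothetical sample data standing in for one entry of self.ausgabe.
wortstamm = "meist"
words = ["meist (30)", "meisten (95)"]

# Joined into one string: the field contains the delimiter, so it is quoted.
joined = io.StringIO()
csv.writer(joined).writerow([wortstamm, ", ".join(words)])

# Passed as separate fields: each label becomes its own column, no quotes.
separate = io.StringIO()
csv.writer(separate).writerow([wortstamm] + words)

print(joined.getvalue())    # meist,"meist (30), meisten (95)"
print(separate.getvalue())  # meist,meist (30),meisten (95)
```

Note that with separate fields there is no space after each comma; the writer emits only the delimiter between columns.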
I am currently fetching data from an API and I would like to store those data as csv.
However, some lines are always invalid which means I cannot split them via Excel's text-in-columns functionality.
I create the csv file as follows:
with open(directory_path + '/' + file_name + '-data.csv', 'a', newline='') as file:
    # Setup a writer
    csvwriter = csv.writer(file, delimiter='|')
    # Write headline row
    if not headline_exists:
        csvwriter.writerow(['Title', 'Text', 'Tip'])
    # Build the data row
    record = data['title'] + '|' + data['text'] + '|' + data['tip']
    csvwriter.writerow([record])
If you open the csv file in Excel you also immediately see that the row is invalid. While the valid one takes the default height and the whole width, the invalid one takes more height but less width.
Does anyone know a reason for that problem?
The rows are not invalid, but what you do is.
So first of all: you use pipes as delimiters. That's fine in some scenarios, but given that you want to load the file into Excel right away, it seems wiser to me to export the data in the "excel" dialect:
csvwriter = csv.writer(file, dialect='excel')
Second, look at the following lines:
record = data['title'] + '|' + data['text'] + '|' + data['tip']
csvwriter.writerow([record])
This way you basically tell the csv writer that you want a single column with pipes in it. If you use a csv writer, you must not concatenate the delimiters yourself; that defeats the point of using a writer. So this is how it should be done instead:
record = [data['title'], data['text'], data['tip']]
csvwriter.writerow(record)
Hope it helps.
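As a rough illustration of why passing the fields separately matters, here is a sketch with a hypothetical record whose text contains both a pipe and a line break. The data dict and its values are assumptions made up for the example:

```python
import csv
import io

# Hypothetical API record; the text contains both a pipe and a line break.
data = {"title": "Intro", "text": "first line\nsecond | line", "tip": "none"}

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|")
writer.writerow(["Title", "Text", "Tip"])
# One list element per column; the writer quotes the awkward field itself.
writer.writerow([data["title"], data["text"], data["tip"]])

# Reading the result back yields exactly three columns per row.
buffer.seek(0)
rows = list(csv.reader(buffer, delimiter="|"))
print(rows)
```

Because the writer knows where each field starts and ends, it can quote the one that needs it, and a reader using the same delimiter recovers the original values.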
I have finally found out that I had to strip the text and the tip, because they sometimes contain whitespace that would break the format.
Additionally, I also followed the recommendation to use the excel dialect since I think this will make it easier to process the data later on.
I have the following code:
f = open("test.tsv", 'wt')
try:
    writer = csv.writer(f, delimiter='\t')
    for x in range(0, len(ordered)):
        writer.writerow((ordered[x][0],"\t", ordered[x][1]))
finally:
    f.close()
I need the TSV file to have ordered[x][0] separated from ordered[x][1] by two tabs.
The "\t" adds space, but it's not a tab, and the parentheses are shown in the output.
Thank You!
You could replace the "\t" by "" to obtain what you want:
writer.writerow((ordered[x][0],"", ordered[x][1]))
Indeed, the empty string in the middle will then be surrounded by a tab on both sides, effectively putting two tabs between ordered[x][0] and ordered[x][1].
However, a more natural way to write the same thing would be:
with open("test.tsv", "w") as fh:
    for e in ordered:
        fh.write("\t\t".join(map(str, e[:2])) + "\n")
where I:
used a with statement instead of the try ... finally construct
removed the t mode in the open function (text mode is the default)
iterated over the elements in ordered using for ... in instead of using an index
used join instead of a csv writer: csv writers are suited to cases where the delimiter is a single character
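The two approaches above can be checked against each other with a short sketch. The contents of ordered are hypothetical, and lineterminator="\n" is set on the csv writer only so that its output is directly comparable with the join version (the default is "\r\n"):

```python
import csv
import io

# Hypothetical stand-in for the ordered list from the question.
ordered = [("apple", 3), ("pear", 5)]

# csv writer with an empty middle field; lineterminator is set to "\n"
# here only so the result is comparable with the join version.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
for e in ordered:
    writer.writerow((e[0], "", e[1]))
via_csv = buffer.getvalue()

# Plain join with a two-tab separator.
via_join = "".join("\t\t".join(map(str, e[:2])) + "\n" for e in ordered)

print(repr(via_csv))  # 'apple\t\t3\npear\t\t5\n'
```

One difference to keep in mind: the csv writer would quote a field that itself contains a tab, a line break, or a quote character, whereas join writes the values as they are.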
I want to create a csv from an existing csv, by splitting its rows.
Input csv:
A,R,T,11,12,13,14,15,21,22,23,24,25
Output csv:
A,R,T,11,12,13,14,15
A,R,T,21,22,23,24,25
So far my code looks like:
def update_csv(name):
    #load csv file
    file_ = open(name, 'rb')
    #init first values
    current_a = ""
    current_r = ""
    current_first_time = ""
    file_content = csv.reader(file_)
    #LOOP
    for row in file_content:
        current_a = row[0]
        current_r = row[1]
        current_first_time = row[2]
        i = 2
        #Write row to new csv
        with open("updated_"+name, 'wb') as f:
            writer = csv.writer(f)
            writer.writerow((current_a,
                             current_r,
                             current_first_time,
                             ",".join((row[x] for x in range(i+1,i+5)))
                             ))
        #do only one row, for debug purposes
        return
But the row contains double quotes that I can't get rid of:
A002,R051,02-00-00,"05-21-11,00:00:00,REGULAR,003169391"
I've tried to use writer = csv.writer(f,quoting=csv.QUOTE_NONE) and got a _csv.Error: need to escape, but no escapechar set.
What is the correct approach to delete those quotes?
I think you could simplify the logic to split each row into two using something along these lines:
def update_csv(name):
    with open(name, 'rb') as file_:
        with open("updated_"+name, 'wb') as f:
            writer = csv.writer(f)
            # read one row from input csv
            for row in csv.reader(file_):
                # write 2 rows to new csv
                writer.writerow(row[:8])
                writer.writerow(row[:3] + row[8:])
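For reference, a Python 3 flavoured sketch of the same slicing, run against the sample row from the question. The function name split_rows and the use of io.StringIO instead of real files are assumptions made for the example; with real files in Python 3 you would open them in text mode with newline='' rather than 'rb'/'wb'.

```python
import csv
import io

def split_rows(source, target):
    """Write each input row as two rows that share the first three columns."""
    writer = csv.writer(target)
    for row in csv.reader(source):
        writer.writerow(row[:8])             # A,R,T plus the first five values
        writer.writerow(row[:3] + row[8:])   # A,R,T plus the remaining values

source = io.StringIO("A,R,T,11,12,13,14,15,21,22,23,24,25\n")
target = io.StringIO()
split_rows(source, target)
print(target.getvalue())
```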
writer.writerow expects an iterable, so that it can write each item in the iterable as one field, separated by the appropriate delimiter, into the file. So:
writer.writerow([1, 2, 3])
would write "1,2,3\n" to the file.
Your call provides it with an iterable, one of whose items is a string that already contains the delimiter. It therefore needs some way to either escape the delimiter or a way to quote out that item. For example,
writer.writerow([1, '2,3'])
Doesn't just give "1,2,3\n", but e.g. '1,"2,3"\n' - the string counts as one item in the output.
Therefore if you want to not have quotes in the output, you need to provide an escape character (e.g. '/') to mark the delimiters that shouldn't be counted as such (giving something like "1,2/,3\n").
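A small sketch of the three cases, using io.StringIO in place of a file. The '/' escape character is just the example value mentioned above, not a recommendation:

```python
import csv
import io

row = [1, "2,3"]

# Default quoting (QUOTE_MINIMAL): the field containing a comma is quoted.
quoted = io.StringIO()
csv.writer(quoted).writerow(row)

# QUOTE_NONE with an escape character: the embedded comma is escaped instead.
escaped = io.StringIO()
csv.writer(escaped, quoting=csv.QUOTE_NONE, escapechar="/").writerow(row)

# QUOTE_NONE without an escape character: the writer cannot represent the field.
try:
    csv.writer(io.StringIO(), quoting=csv.QUOTE_NONE).writerow(row)
    error = None
except csv.Error as exc:
    error = str(exc)

print(repr(quoted.getvalue()))   # '1,"2,3"\r\n'
print(repr(escaped.getvalue()))  # '1,2/,3\r\n'
print(error)
```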
However, I think what you actually want to do is include all of those elements as separate items. Don't ",".join(...) them yourself, try:
writer.writerow((current_a, current_r,
                 current_first_time, *row[i+1:i+5]))
to provide the relevant items from row as separate items in the tuple.
I spent all day trying to find a solution to this problem, and as I haven't found one yet I'm coming here for your help!
I'm writing a CSV file exporter in Python and I can't get it to output exactly what I want.
let's say I have a list of strings containing several tutorial titles:
tutorials = ['tut1', 'tut2', 'tut3']
and I want to print in the csv file's first line like this:
First Name, Surname, StudentID, tut1, tut2, tut3
Now, this is the code I wrote to do so:
tutStr= ' '
for i in tutorials:
    tutStr += i + ', '
tutStr = tutStr[0:-3]
with open('CSVexport.csv', 'wb') as csvFile:
    writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_NONE)
    writer.writerow(['First Name'+'Surname'+'ID number'+tutStr])
This gives me the error:
"writer.writerow(['First Name'+'Surname'+'ID number'+labString])
Error: need to escape, but no escapechar set"
Now, I know that the error has to do with csv.QUOTE_NONE and that I have to set an escapechar, but I really don't understand how I'm supposed to do so.
If I try without QUOTE_NONE, it'll automatically set the quotechar to ' " ', which gives me this output:
"First Name, Surname, StudentID, tut1, tut2, tut3" --> don't want quotation marks!!!
Any idea?
If you are going to write something with writerow you need to pass a list there, not a list containing one big string separated by commas (it will appear as a string in one cell). quoting=csv.QUOTE_MINIMAL would help to quote strings that need to be quoted.
tutorials = ['tut1', 'tut2', 'tut3']
headers = ['First Name', 'Surname', 'ID number'] + tutorials
first_line = True
with open('CSVexport.csv', 'w') as csvFile:
    writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    if first_line:
        writer.writerow(headers)
        first_line = False
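A sketch of the resulting file content, with one made-up student row added for illustration (the students list is hypothetical and not part of the question):

```python
import csv
import io

tutorials = ["tut1", "tut2", "tut3"]
headers = ["First Name", "Surname", "ID number"] + tutorials

# Hypothetical student record, one value per header column.
students = [["Ada", "Lovelace", "1001", "7", "8", "9"]]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(headers)      # each header is its own field, so nothing is quoted
for student in students:
    writer.writerow(student)

print(buffer.getvalue())
```

Since no single field contains a comma, the default quoting leaves every value unquoted and no escapechar is needed.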
Objective: To extract the text from the anchor tag inside all lines in models and put it in a csv.
I'm trying this code:
with open('Sprint_data.csv', 'ab') as csvfile:
    spamwriter = csv.writer(csvfile)
    models = soup.find_all('li' , {"class" : "phoneListing"})
    for model in models:
        model_name = unicode(u' '.join(model.a.stripped_strings)).encode('utf8').strip()
        spamwriter.writerow(unicode(u' '.join(model.a.stripped_strings)).encode('utf8').strip())
It's working fine except each cell in the csv contains only one character.
Like this:
| S | A | M | S | U | N | G |
Instead of:
|SAMSUNG|
Of course I'm missing something. But what?
.writerow() requires a sequence ('', (), []) and places each element in its own column of the row, in order. If your desired string is not an item in a sequence, writerow() will iterate over each letter in your string, and each letter will be written to your CSV in a separate cell.
After you import csv, suppose this is your list:
myList = ['Diamond', 'Sierra', 'Crystal', 'Bridget', 'Chastity', 'Jasmyn', 'Misty', 'Angel', 'Dakota', 'Asia', 'Desiree', 'Monique', 'Tatiana']
listFile = open('Names.csv', 'wb')
writer = csv.writer(listFile)
for item in myList:
    writer.writerow(item)
The above script will produce the following CSV:
Names.csv
D,i,a,m,o,n,d
S,i,e,r,r,a
C,r,y,s,t,a,l
B,r,i,d,g,e,t
C,h,a,s,t,i,t,y
J,a,s,m,y,n
M,i,s,t,y
A,n,g,e,l
D,a,k,o,t,a
A,s,i,a
D,e,s,i,r,e,e
M,o,n,i,q,u,e
T,a,t,i,a,n,a
If you want each name in its own cell, the solution is simply to place your string (item) in a sequence. Here I use square brackets []:
listFile2 = open('Names2.csv', 'wb')
writer2 = csv.writer(listFile2)
for item in myList:
    writer2.writerow([item])
The script with .writerow([item]) produces the desired results:
Names2.csv
Diamond
Sierra
Crystal
Bridget
Chastity
Jasmyn
Misty
Angel
Dakota
Asia
Desiree
Monique
Tatiana
writerow accepts a sequence. You're giving it a single string, so it's treating that as a sequence, and strings act like sequences of characters.
What else do you want in this row? Nothing? If so, make it a list of one item:
spamwriter.writerow([u' '.join(model.a.stripped_strings).encode('utf8').strip()])
(By the way, the unicode() call is completely unnecessary since you're already joining with a unicode delimiter.)
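The behaviour is easy to reproduce in isolation. This is a Python 3 sketch (where the csv module works with str directly, so no encode call is involved), with a hard-coded hypothetical model name instead of the scraped value:

```python
import csv
import io

model_name = "SAMSUNG"  # hypothetical value extracted from the page

as_string = io.StringIO()
csv.writer(as_string).writerow(model_name)    # iterates over the characters

as_list = io.StringIO()
csv.writer(as_list).writerow([model_name])    # one field holding the whole name

print(repr(as_string.getvalue()))  # 'S,A,M,S,U,N,G\r\n'
print(repr(as_list.getvalue()))    # 'SAMSUNG\r\n'
```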
This is usually the solution I use:
import csv
with open("output.csv", 'w', newline= '') as output:
    wr = csv.writer(output, dialect='excel')
    for element in list_of_things:
        wr.writerow([element])
output.close()
This should provide you with an output of all your list elements in a single column rather than a single row.
The key point here is to iterate over the list and wrap each element as '[element]' so that the csv writer does not split it into characters.
Hope this is of use!
Just surround it with square brackets (i.e. []) to make it a list:
writer.writerow([str(one_column_value)])
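If there are many values to write to a single column, one possible variant is to wrap each value on the fly and hand them all to writerows. The values list here is hypothetical:

```python
import csv
import io

# Hypothetical values for a single-column file.
values = ["Diamond", "Sierra", 42]

buffer = io.StringIO()
writer = csv.writer(buffer)
# Wrap every value in its own one-element list, then write all rows at once.
writer.writerows([value] for value in values)

print(repr(buffer.getvalue()))  # 'Diamond\r\nSierra\r\n42\r\n'
```

The writer converts non-string values such as 42 with str(), so an explicit str() call is optional in this case.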