I'm trying to create a csv file that contains the contents of a list of strings in Python, using the script below. However when I check my output file, it turns out that every character is delimited by a comma. How can I instruct csv.writer to delimit every individual string within the list rather than every character?
import csv
RESULTS = ['apple','cherry','orange','pineapple','strawberry']
result_file = open("output.csv",'wb')
wr = csv.writer(result_file, dialect='excel')
for item in RESULTS:
wr.writerow(item)
I checked PEP 305 and couldn't find anything specific.
The csv.writer writerow method takes an iterable as an argument. Your result set has to be a list (rows) of lists (columns).
csvwriter.writerow(row)
Write the row parameter to the writer’s file object, formatted according to the current dialect.
Do either:
import csv
RESULTS = [
['apple','cherry','orange','pineapple','strawberry']
]
with open('output.csv','w') as result_file:
wr = csv.writer(result_file, dialect='excel')
wr.writerows(RESULTS)
or:
import csv
RESULT = ['apple','cherry','orange','pineapple','strawberry']
with open('output.csv','w') as result_file:
wr = csv.writer(result_file, dialect='excel')
wr.writerow(RESULT)
Very simple to fix, you just need to turn the parameter to writerow into a list.
for item in RESULTS:
wr.writerow([item])
I know I'm a little late, but something I found that works (and doesn't require using csv) is to write a for loop that writes to your file for every element in your list.
# Define Data
RESULTS = ['apple','cherry','orange','pineapple','strawberry']
# Open File
resultFyle = open("output.csv",'w')
# Write data to file
for r in RESULTS:
resultFyle.write(r + "\n")
resultFyle.close()
I don't know if this solution is any better than the ones already offered, but it more closely reflects your original logic so I thought I'd share.
A sample - write multiple rows with boolean column (using example above by GaretJax and Eran?).
import csv
RESULT = [['IsBerry','FruitName'],
[False,'apple'],
[True, 'cherry'],
[False,'orange'],
[False,'pineapple'],
[True, 'strawberry']]
with open("../datasets/dashdb.csv", 'wb') as resultFile:
wr = csv.writer(resultFile, dialect='excel')
wr.writerows(RESULT)
Result:
df_data_4 = pd.read_csv('../datasets/dashdb.csv')
df_data_4.head()
Output:
IsBerry FruitName
0 False apple
1 True cherry
2 False orange
3 False pineapple
4 True strawberry
Related
I am learning how to import information into .csv files with Python. I have the following code:
import csv
outputFile = open('output.csv','w',newline = '')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(['spam','eggs','bacon','ham'])
outputWriter.writerow(['Hello World!','eggs','bacon','ham'])
outputWriter.writerow([1,2,3.141592,4])
outputFile.close()
My csv file looks like this:
Why does it output in 3 individual rows instead of overwriting the 1st row each time? How would I get it to overwrite if I wanted to?
Thank you for your insight from a beginner.
You can use seek() to go back to the beginning of the file before writing each row.
import csv
outputFile = open('output.csv','w',newline = '')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(['spam','eggs','bacon','ham'])
outputFile.seek(0)
outputWriter.writerow(['Hello World!','eggs','bacon','ham'])
outputFile.seek(0)
outputWriter.writerow([1,2,3.141592,4])
outputFile.truncate() # clear out any remnants of previous lines.
outputFile.close()
I believe that behavior is from the csv library, if you want to write in one line, you could try this:
with open("output.csv","w") as writer:
writer.write(",".join(['spam','eggs','bacon','ham']))
writer.write(",".join(['Hello World!','eggs','bacon','ham']))
writer.write(",".join(['1','2','3.141592','4']))
Additionally, if you want to use CSV library you could try to do this:
import csv
# Create 1 list from the other 3
list1 = ['spam','eggs','bacon','ham']
list2 = ['Hello World!','eggs','bacon','ham']
list3 = [1,2,3.141592,4]
completeList = list1 + list2 + list3
# Write that list into one line
outputFile = open('output.csv','w',newline = '')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(completeList)
Edit:
This is the result I got from both approaches.
If you are not reaching this result. It is probable that your list elements contain some line break "\n" inside them. I am not sure where you are reading your lists as inputs. If you have any input it would be worthy if you post it as well.
I am trying to print the output of a webscrape project into a CSV file.
So for example I have this list of supplier names under a list called SUPP_NAME: (just an example, the actual list has 50 items inside it)
['"FULIAN\\u0020\\u0028M\\u0029\\u0020SENDIRIAN\\u0020BERHAD"', '"RISO\\u0020SEKKEN\\u0020SDN.\\u0020BHD."', '"NATURE\\u0020PROFUSION\\u0020SDN.\\u0020BHD."']
and a list of numbers indicated years, under a list called SUPP_YEARS:
['"9"', '"4"', '"1"', '"1"']
My plan is to put them into a CSV, and then read them back in as a pandas dataframe, then perform decoding to get a bunch of values.
Code so far:
import csv
with open('output3.csv' , 'w') as f:
writer = csv.writer(f)
headers = "Supplier_name,Years\n"
f.write(headers)
supp_names = re.findall(r'("supplierName"):("\w+.+")', results[17].text)
supp_years = re.findall(r'("supplierYear"):("\d+")', results[17].text)
SUPP_NAME = []
for title, name in supp_names:
print (name)
SUPP_NAME.append(name)
#f.write(name + "\n")
SUPP_YEAR = []
for year,number in supp_years:
print (number)
SUPP_YEAR.append(number)
#f.write(number + "\n")
writer.writerow([SUPP_NAME, SUPP_YEAR])
However, what I get is that under the Supplier_name and Years columns, one cell under each of these 2 columns is filled with a LONG list of items still contained in the list, instead of the items separated one by one.
What am I doing wrong? Thanks in advance for answering.
The two re.findall() calls are giving you lists of items (hopefully both the same length). The idea is to then then extract an element from each and write this to your output file. Python has a useful function called zip() to do this. You give it both of your lists and the loop with give you an item from each on each iteration:
import csv
with open('output3.csv', 'w' newline='') as f_output:
writer = csv.writer(f_output)
writer.writerow(["Supplier_name" , "Years"])
supp_names = re.findall(r'("supplierName"):("\w+.+")', results[17].text)
supp_years = re.findall(r'("supplierYear"):("\d+")', results[17].text)
for name, year in zip(supp_names, supp_years):
writer.writerow([name, year])
The csv.writer() object is designed to take a list of items and write them to your file with the desired (i.e. comma) delimiter automatically added between them.
I assume you are using Python 3.x? If not you should change the following:
with open('output3.csv', 'wb') as f_output:
I'm looking for a way using python to copy the first column from a csv into an empty file. I'm trying to learn python so any help would be great!
So if this is test.csv
A 32
D 21
C 2
B 20
I want this output
A
D
C
B
I've tried the following commands in python but the output file is empty
f= open("test.csv",'r')
import csv
reader = csv.reader(f,delimiter="\t")
names=""
for each_line in reader:
names=each_line[0]
First, you want to open your files. A good practice is to use the with statement (that, technically speaking, introduces a context manager) so that when your code exits from the with block all the files are automatically closed
with open('test.csv') as inpfile, open('out.csv', 'w') as outfile:
next you want a loop on the lines of the input file (note the indentation, we are inside the with block), line splitting is automatic when you read a text file with lines separated by newlines…
for line in inpfile:
each line is a string, but you think of it as two fields separated by white space — this situation is so common that strings have a method to deal with this situation (note again the increasing indent, we are in the for loop block)
fields = line.split()
by default .split() splits on white space, but you can use, e.g., split(',') to split on commas, etc — that said, fields is a list of strings, for your first record it is equal to ['A', '32'] and you want to output just the first field in this list… for this purpose a file object has the .write() method, that writes a string, just a string, to the file, and fields[0] IS a string, but we have to add a newline character to it because, in this respect, .write() is different from print().
outfile.write(fields[0]+'\n')
That's all, but if you omit my comments it's 4 lines of code
with open('test.csv') as inpfile, open('out.csv', 'w') as outfile:
for line in inpfile:
fields = line.split()
outfile.write(fields[0]+'\n')
When you are done with learning (some) Python, ask for an explanation of this...
with open('test.csv') as ifl, open('out.csv', 'w') as ofl:
ofl.write('\n'.join(line.split()[0] for line in ifl))
Addendum
The csv module in such a simple case adds the additional conveniences of
auto-splitting each line into a list of strings
taking care of the details of output (newlines, etc)
and when learning Python it's more fruitful to see how these steps can be done using the bare language, or at least that it is my opinion…
The situation is different when your data file is complex, has headers, has quoted strings possibly containing quoted delimiters etc etc, in those cases the use of csv is recommended, as it takes into account all the gory details. For complex data analisys requirements you will need other packages, not included in the standard library, e.g., numpy and pandas, but that is another story.
This answer reads the CSV file, understanding a column to be demarked by a space character. You have to add the header=None otherwise the first row will be taken to be the header / names of columns.
ss is a slice - the 0th column, taking all rows as denoted by :
The last line writes the slice to a new filename.
import pandas as pd
df = pd.read_csv('test.csv', sep=' ', header=None)
ss = df.ix[:, 0]
ss.to_csv('new_path.csv', sep=' ', index=False)
import csv
reader = csv.reader(open("test.csv","rb"), delimiter='\t')
writer = csv.writer(open("output.csv","wb"))
for e in reader:
writer.writerow(e[0])
The best you can do is create a empty list and append the column and then write that new list into another csv for example:
import csv
def writetocsv(l):
#convert the set to the list
b = list(l)
print (b)
with open("newfile.csv",'w',newline='',) as f:
w = csv.writer(f, delimiter=',')
for value in b:
w.writerow([value])
adcb_list = []
f= open("test.csv",'r')
reader = csv.reader(f,delimiter="\t")
for each_line in reader:
adcb_list.append(each_line)
writetocsv(adcb_list)
hope this works for you :-)
I want to create a csv from an existing csv, by splitting its rows.
Input csv:
A,R,T,11,12,13,14,15,21,22,23,24,25
Output csv:
A,R,T,11,12,13,14,15
A,R,T,21,22,23,24,25
So far my code looks like:
def update_csv(name):
#load csv file
file_ = open(name, 'rb')
#init first values
current_a = ""
current_r = ""
current_first_time = ""
file_content = csv.reader(file_)
#LOOP
for row in file_content:
current_a = row[0]
current_r = row[1]
current_first_time = row[2]
i = 2
#Write row to new csv
with open("updated_"+name, 'wb') as f:
writer = csv.writer(f)
writer.writerow((current_a,
current_r,
current_first_time,
",".join((row[x] for x in range(i+1,i+5)))
))
#do only one row, for debug purposes
return
But the row contains double quotes that I can't get rid of:
A002,R051,02-00-00,"05-21-11,00:00:00,REGULAR,003169391"
I've tried to use writer = csv.writer(f,quoting=csv.QUOTE_NONE) and got a _csv.Error: need to escape, but no escapechar set.
What is the correct approach to delete those quotes?
I think you could simplify the logic to split each row into two using something along these lines:
def update_csv(name):
with open(name, 'rb') as file_:
with open("updated_"+name, 'wb') as f:
writer = csv.writer(f)
# read one row from input csv
for row in csv.reader(file_):
# write 2 rows to new csv
writer.writerow(row[:8])
writer.writerow(row[:3] + row[8:])
writer.writerow is expecting an iterable such that it can write each item within the iterable as one item, separate by the appropriate delimiter, into the file. So:
writer.writerow([1, 2, 3])
would write "1,2,3\n" to the file.
Your call provides it with an iterable, one of whose items is a string that already contains the delimiter. It therefore needs some way to either escape the delimiter or a way to quote out that item. For example,
write.writerow([1, '2,3'])
Doesn't just give "1,2,3\n", but e.g. '1,"2,3"\n' - the string counts as one item in the output.
Therefore if you want to not have quotes in the output, you need to provide an escape character (e.g. '/') to mark the delimiters that shouldn't be counted as such (giving something like "1,2/,3\n").
However, I think what you actually want to do is include all of those elements as separate items. Don't ",".join(...) them yourself, try:
writer.writerow((current_a, current_r,
current_first_time, *row[i+2:i+5]))
to provide the relevant items from row as separate items in the tuple.
I would like to use the Python CSV module to open a CSV file for appending. Then, from a list of CSV files, I would like to read each csv file and write it to the appended CSV file. My script works great - except that I cannot find a way to remove the headers from all but the first CSV file being read. I am certain that my else block of code is not executing properly. Perhaps my syntax for my if else code is the problem? Any thoughts would be appreciated.
writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
for files in lstFiles:
readFile = open(input_file,'rU')
reader = csv.reader(readFile,dialect='excel')
for i in range(0,len(lstFiles)):
if i == 0:
oldHeader = readFile.readline()
newHeader = writeFile.write(oldHeader)
for row in reader:
writer.writerow(row)
else:
reader.next()
for row in reader:
row = readFile.readlines()
writer.writerow(row)
readFile.close()
writeFile.close()
You're effectively iterating over lstFiles twice. For each file in your list, you're running your inner for loop up from 0. You want something like:
writeFile = open(append_file,'a+b')
writer = csv.writer(writeFile,dialect='excel')
headers_needed = True
for input_file in lstFiles:
readFile = open(input_file,'rU')
reader = csv.reader(readFile,dialect='excel')
oldHeader = reader.next()
if headers_needed:
newHeader = writer.writerow(oldHeader)
headers_needed = False
for row in reader:
writer.writerow(row)
readFile.close()
writeFile.close()
You could also use enumerate over the lstFiles to iterate over tuples containing the iteration count and the filename, but I think the boolean shows the logic more clearly.
You probably do not want to mix iterating over the csv reader and directly calling readline on the underlying file.
I think you're iterating too many times (over various things: both your list of files and the files themselves). You've definitely got some consistency problems; it's a little hard to be sure since we can't see your variable initializations. This is what I think you want:
with open(append_file,'a+b') as writeFile:
need_headers = True
for input_file in lstFiles:
with open(input_file,'rU') as readFile:
headers = readFile.readline()
if need_headers:
# Write the headers only if we need them
writeFile.write(headers)
need_headers = False
# Now write the rest of the input file.
for line in readFile:
writeFile.write(line)
I took out all the csv-specific stuff since there's no reason to use it for this operation. I also cleaned the code up considerably to make it easier to follow, using the files as context managers and a well-named boolean instead of the "magic" i == 0 check. The result is a much nicer block of code that (hopefully) won't have you jumping through hoops to understand what's going on.