I'm writing data from a PDF to a CSV. The CSV needs to have one column, with each word on a separate row.
The code below writes each word on a separate row, but also puts each letter in a separate cell.
with open('annualreport.csv', 'w', encoding='utf-8') as f:
    write = csv.writer(f)
    for i in keywords:
        write.writerow(i)
I have also attempted the following, which writes all the words to one row, with each word in a separate column:
with open('annualreport.csv', 'w', encoding='utf-8') as f:
    write = csv.writer(f)
    write.writerow(keywords)
As far as I know, writerow expects a sequence (any iterable) of fields. A string is itself an iterable of characters, so each letter is written into its own cell.
Wrapping each word in a one-element list should fix the problem:
with open('annualreport.csv', 'w', encoding='utf-8') as f:
    write = csv.writer(f)
    for i in keywords:
        write.writerow( [ i ] ) # <-- before: write.writerow(i)
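To see why the brackets matter, here is a small sketch that writes to an in-memory buffer instead of a file. The io.StringIO buffers and the sample keywords list are assumptions made for illustration; they are not part of the question.

```python
import csv
import io

keywords = ["revenue", "profit"]  # hypothetical sample data

# Passing a bare string: writerow iterates over it character by character.
buf_chars = io.StringIO()
csv.writer(buf_chars).writerow(keywords[0])

# Wrapping each string in a list: the whole word becomes a single field.
buf_words = io.StringIO()
writer = csv.writer(buf_words)
for word in keywords:
    writer.writerow([word])

print(repr(buf_chars.getvalue()))  # 'r,e,v,e,n,u,e\r\n'
print(repr(buf_words.getvalue()))  # 'revenue\r\nprofit\r\n'
```

The same loop can also be written as writer.writerows([w] for w in keywords).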
import csv
# data to be written row-wise to the CSV file
data = [['test'], ['try'], ['goal']]
# open the CSV file for writing; newline='' lets the csv module control line endings
with open('output.csv', 'w', newline='') as file:
    # write all rows into the file at once
    write = csv.writer(file)
    write.writerows(data)
I'd like to create a CSV from a TXT file. I have a text file with lines (300 lines+) separated by backslashes. I'd like each line to be a separate row, and each backslash to be a separate new column.
The text file looks like:
example 1\example 2\example 3\example 4
test 1\test 2\test 3\test 4
I'd like the CSV to look like:
Example 1 | Example 2 | Example 3 | Example 4
Test 1    | Test 2    | Test 3    | Test 4
So far I have:
import csv
with open('Report.txt') as report:
    report_txt = report.read()
with open('Report.csv','w',newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(report_txt)
I know I need to use \ as a delimiter, but I'm not sure how. Thanks for any help!
Define your delimiter like this (the \ must be escaped inside a Python string literal), reading from the text file:
reader = csv.reader(open("Report.txt"), delimiter="\\")
Code:
import csv
with open('Report.txt', newline='') as report:
    reader = csv.reader(report, delimiter="\\")
    # write the output while the input file is still open
    with open('Report_output.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for line in reader:
            writer.writerow(line)
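A minimal sketch of the same idea that can be run without any files on disk. The sample string stands in for Report.txt and the io.StringIO objects are assumptions for illustration only.

```python
import csv
import io

# Hypothetical stand-in for the contents of Report.txt.
sample = (
    "example 1\\example 2\\example 3\\example 4\n"
    "test 1\\test 2\\test 3\\test 4\n"
)

# Read with a backslash as the field delimiter.
reader = csv.reader(io.StringIO(sample), delimiter="\\")
rows = list(reader)

# Write the parsed rows back out with the default comma delimiter.
out = io.StringIO()
csv.writer(out).writerows(rows)

print(rows[0])          # ['example 1', 'example 2', 'example 3', 'example 4']
print(out.getvalue())
```

Each input line becomes one row, and each backslash starts a new column.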
First you have to split each line on the delimiter. You can do this with str.split or with a regular expression.
import csv
with open('file.txt', 'r') as in_file:
    stripped = [line.strip() for line in in_file]
# lists (not generators), so all data is read before the file is closed
lines = [line.split("\\") for line in stripped if line]
Then write the rows to the CSV:
with open('report.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    writer.writerows(lines)
Tweak your code accordingly. The concept is pretty much the same. Note the double backslash is to account for the escape character.
If you are just trying to convert that text into CSV, you can replace every "\" character with ";" and you'll have a semicolon-delimited CSV file, provided that no field itself contains a ";" or a quote character.
Otherwise, if you want to do something with the parsed data before re-exporting it to CSV, you can read the file line by line, use the split() method with "\\", then rejoin and write each line, like this:
with open('in.txt') as input_file:
    with open('out.csv','a') as output_file:
        txt_line = input_file.readline()
        while txt_line:
            cells = txt_line.split("\\")
            # Do something with each cell...
            csv_line = ";".join(cells)
            output_file.write(csv_line)
            txt_line = input_file.readline()
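One caveat with joining cells by hand: a cell that already contains ";" silently turns into two columns. The sketch below compares the manual join with csv.writer using delimiter=";", which quotes such cells. The sample cells are made up for illustration.

```python
import csv
import io

# Hypothetical input line; the first cell contains a semicolon.
cells = "name; with semicolon\\plain\\last".split("\\")

# Manual join: the embedded ';' is indistinguishable from a delimiter.
naive_line = ";".join(cells)

# csv.writer quotes any field that contains the delimiter.
buf = io.StringIO()
csv.writer(buf, delimiter=";").writerow(cells)
quoted_line = buf.getvalue()

print(naive_line)         # name; with semicolon;plain;last
print(repr(quoted_line))  # '"name; with semicolon";plain;last\r\n'

# Reading the quoted line back recovers the original three cells.
round_trip = next(csv.reader(io.StringIO(quoted_line), delimiter=";"))
```

If the cells are known never to contain the delimiter or quote characters, the manual join is fine.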
I am trying to convert a txt file into a csv file in Python. The txt file currently consists of several strings separated by spaces. I would like to write each string into its own cell in the csv file.
The txt file has the following structure:
UserID Desktop Display (Version) (Server/Port handle), Date
UserID Desktop Display (Version) (Server/Port handle), Date
etc.
My approach is the following:
with open('licfile.txt', "r+") as in_file:
    stripped = (line.strip() for line in in_file)
    lines = (line.split(" ") for line in stripped if line)
with open('licfile.csv', 'w') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(('user', 'desktop', 'display', 'version', 'server', 'handle', 'date'))
    writer.writerows(lines)
Unfortunately this is not working as expected. I get the following error: ValueError: I/O operation on closed file. Additionally, only the intended header row appears in the csv file, and it is shown in a single cell.
Any tips on how to proceed? Many thanks in advance.
How about:
with open('licfile.txt', 'r') as in_file, open('licfile.csv', 'w') as out_file:
    for line in in_file:
        if line.strip():
            out_file.write(line.strip().replace(' ', ',') + '\n')
And for the German Excel enthusiasts...
...
...
...
... .replace(' ', ';') + '\n')
:)
You can also use the built in csv module to accomplish this easily:
import csv
with open('licfile.txt', 'r') as in_file, open('licfile.csv', 'w') as out_file:
    reader = csv.reader(in_file, delimiter=" ")
    writer = csv.writer(out_file, lineterminator='\n')
    writer.writerows(reader)
I used the lineterminator='\n' argument here because the default is \r\n. When the output file is opened without newline='', that can end up as \r\r\n on Windows, which shows up as an extra blank line after each row.
There are also a few arguments you could use if say quoting is needed or a different delimiter is desired: https://docs.python.org/3/library/csv.html#csv-fmt-params
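As a rough illustration of those formatting parameters, the sketch below writes the same row twice: once with the defaults apart from lineterminator, and once with a semicolon delimiter and quoting=csv.QUOTE_ALL. The row values are invented, and the output goes to in-memory buffers rather than licfile.csv.

```python
import csv
import io

row = ["user1", "desk 7", "12,5"]  # hypothetical values

# Defaults: comma delimiter, quote only when needed.
default_buf = io.StringIO()
csv.writer(default_buf, lineterminator="\n").writerow(row)

# Semicolon delimiter with every field quoted.
custom_buf = io.StringIO()
csv.writer(
    custom_buf,
    delimiter=";",
    quoting=csv.QUOTE_ALL,
    lineterminator="\n",
).writerow(row)

print(default_buf.getvalue())  # user1,desk 7,"12,5"
print(custom_buf.getvalue())   # "user1";"desk 7";"12,5"
```

With the default quoting, only the field that contains a comma is quoted.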
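You are using comprehensions with round brackets, which create generator expressions (not tuples). Generators are lazy: they only read from in_file when writerows consumes them, and by then the file has already been closed. Use square brackets instead, which build the lists immediately. See the example below: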
stripped = [line.strip() for line in in_file]
lines = [line.split(" ") for line in stripped if line]
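The difference between the round-bracket and square-bracket forms can be reproduced in a few lines. This is only a sketch: the temporary file and its contents are assumptions standing in for licfile.txt.

```python
import os
import tempfile

# Hypothetical stand-in for licfile.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a b c\n\nd e f\n")
    path = tmp.name

# Round brackets: nothing is read from the file yet.
with open(path) as in_file:
    lazy_lines = (line.split(" ") for line in in_file)

# Square brackets: the file is read completely inside the with block.
with open(path) as in_file:
    eager_lines = [line.strip().split(" ") for line in in_file if line.strip()]

# Consuming the generator now touches the already closed file.
try:
    list(lazy_lines)
    error = None
except ValueError as exc:
    error = str(exc)

os.remove(path)
print(error)        # the "I/O operation on closed file" message
print(eager_lines)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```

Keeping the generators but moving the writing code inside the same with block would also avoid the error.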
licfile_df = pd.read_csv('licfile.txt',sep=",", header=None)
I have a text (CSV-like) file whose columns are separated by spaces instead of commas. I would like to insert the commas so that I can use the file in LaTeX to make a table. I ran an MWE taken from another question, but it did not work. Could someone guide me on how to change it?
I have tried one Python script that produces a blank file, another that produces a blank document, and another that removes the spaces.
import fileinput
input_file = 'C:/Users/Light_Wisdom/Documents/Python Notes/test.txt'
output= open('out.txt','w+')
with open('out.txt', 'w+') as output:
    for each_line in fileinput.input(input_file):
        output.write("\n".join(x.strip() for x in each_line.split(',')))
The text file contains more numbers, but it looks like this:
0 2.58612
0.00616025 2.20018
0.0123205 1.56186
0.0184807 0.371172
0.024641 0.327379
0.0308012 0.368863
0.0369615 0.322228
0.0431217 0.171899
Outcome
0.049282, -0.0635003
0.0554422, -0.110747
0.0616025, 0.0701394
0.0677627, 0.202381
0.073923, 0.241264
0.0800832, 0.193697
Renewed Attempt:
with open("CSV.txt","r") as file:
    new = list(map(lambda x: ''.join(x.split()[0:1]+[","]+x.split()[0:2]),file.readlines()))
with open("New_CSV.txt","w+") as output:
    for i in new:
        output.writelines(i)
        output.writelines("\n")
This can be done using .split and .join: split each line into a list, then join the list with commas. Because split() with no arguments splits on any run of whitespace, this also handles several consecutive spaces in the file:
with open(input_file, "r") as f1, open("out.txt", 'w') as f2:
    for line in f1:
        f2.write(",".join(line.split()) + "\n")
You can also use csv to handle the writing automatically:
import csv
with open(input_file, "r") as f1, open("out.txt", 'w', newline='') as f2:
    writer = csv.writer(f2)
    for line in f1:
        writer.writerow(line.split())
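A self-contained sketch of the csv variant, using an in-memory sample instead of input_file. The sample text (with irregular spacing, a tab, and a blank line) is made up, and skipping blank lines is an added assumption about what is wanted.

```python
import csv
import io

# Hypothetical sample with irregular whitespace and a blank line.
sample = "0 2.58612\n0.00616025   2.20018\n\n0.0123205\t1.56186\n"

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
for line in io.StringIO(sample):
    fields = line.split()  # splits on any run of spaces or tabs
    if fields:             # skip blank lines instead of writing empty rows
        writer.writerow(fields)

print(out.getvalue())
# 0,2.58612
# 0.00616025,2.20018
# 0.0123205,1.56186
```

If a space after each comma is wanted, as in the outcome shown in the question, ", ".join(fields) with a plain write would do that instead.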
I am writing a script that writes a list of tab-separated strings, as shown below, to a csv file, but I am not getting the output I expect.
out_l = ['host\tuptime\tnfsserver\tnfs status\n', 'node1\t2\tnfs_host\tok\n', 'node2\t100\tnfs_host\tna\n', 'node3\t59\tnfs_host\tok\n']
code:
out_f = open('test.csv', 'w')
w = csv.writer(out_f)
for l in out_l:
    w.writerow(l)
out_f.close()
The output csv file reads as below.
h,o,s,t, ,s,s,h, , , , , ,s,u,d,o,_,h,o,s,t, , , , , , , ,n,f,s,"
"1,9,2,.,1,6,8,.,1,2,2,.,2,0,1, ,o,k, ,n,f,s,h,o,s,t, ,o,k,"
"1,9,2,.,1,6,8,.,1,2,2,.,2,0,2, ,f,a,i,l,e,d, ,n,a, ,n,a,"
"1,9,2,.,1,6,8,.,1,2,2,.,2,0,3, ,o,k, ,n,f,s,h,o,s,t, ,s,h,o,w,m,o,u,n,t, ,f,a,i,l,e,d,"
"
I have also tried csv.writer options such as delimiter and dialect=excel, but with no luck.
Can someone help me format the output?
With the formatting you have in out_l, you can just write it to a file:
out_l = ['host\tuptime\tnfsserver\tnfs status\n', 'node1\t2\tnfs_host\tok\n', 'node2\t100\tnfs_host\tna\n', 'node3\t59\tnfs_host\tok\n']
with open('test.csv', 'w') as out_f:
    for l in out_l:
        out_f.write(l)
To properly use csv, out_l should just be lists of the columns and let the csv module do the formatting with tabs and newlines:
import csv
out_l = [['host','uptime','nfsserver','nfs status'],
         ['node1','2','nfs_host','ok'],
         ['node2','100','nfs_host','na'],
         ['node3','59','nfs_host','ok']]
#with open('test.csv', 'wb') as out_f: # Python 2
with open('test.csv', 'w', newline='') as out_f: # Python 3
    w = csv.writer(out_f, delimiter='\t') # override for tab delimiter
    w.writerows(out_l) # writerows (plural) doesn't need for loop
Note that with will automatically close the file.
See the csv documentation for the correct way to open a file for use with csv.reader or csv.writer.
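As a sketch of what opening the file with newline='' does in Python 3, the snippet below writes two rows and then reads the raw bytes back. The temporary file and the two sample rows are assumptions for illustration.

```python
import csv
import os
import tempfile

rows = [["host", "uptime"], ["node1", "2"]]  # hypothetical sample rows

# Create a temporary file path to write to.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

# newline='' stops Python from translating the line endings csv.writer emits.
with open(path, "w", newline="") as out_f:
    csv.writer(out_f, delimiter="\t").writerows(rows)

# Read the raw bytes to see exactly what ended up on disk.
with open(path, "rb") as raw:
    data = raw.read()
os.remove(path)

print(data)  # b'host\tuptime\r\nnode1\t2\r\n'
```

Each row ends with exactly one \r\n, the csv module's default line terminator, regardless of platform.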
The writerow method of a csv.writer object takes an iterable and writes the values it produces as the fields of one row, separated by the specified delimiter:
out_f = open('test.csv', 'w', newline='')
w = csv.writer(out_f, delimiter='\t') # set tab as delimiter
for l in out_l: # l is string (iterable of chars!)
    w.writerow(l.rstrip('\n').split('\t')) # drop the trailing newline, then split to get the correct tokens
out_f.close()
As the strings in your list already contain the necessary tabs, you could just write them directly to the file, no csv tools needed. If you have built/joined the strings in out_l manually, you can omit that step and just pass the original data structure to writerow.
The delimiter parameter
The delimiter parameter controls the delimiter in the output. It has nothing to do with the input out_l.
Why your output is garbled
csv.writer.writerow iterates over its input. In your case you are giving it a string ('host\tuptime\tnfsserver\tnfs status\n', etc.), so the function iterates over the string, giving you a sequence of single characters.
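The effect is easy to reproduce with one of the strings from out_l and an in-memory buffer (the io.StringIO buffer is an assumption for illustration):

```python
import csv
import io

line = "node1\t2\tnfs_host\tok\n"

# Iterating over a string yields its characters one by one.
as_chars = list(line)[:6]

# Stripping the newline and splitting on tabs yields the intended fields.
as_fields = line.rstrip("\n").split("\t")

# Passing the raw string to writerow writes one cell per character.
buf = io.StringIO()
csv.writer(buf).writerow(line)
garbled = buf.getvalue()

print(as_chars)   # ['n', 'o', 'd', 'e', '1', '\t']
print(as_fields)  # ['node1', '2', 'nfs_host', 'ok']
print(repr(garbled))
```

The trailing newline character becomes a quoted cell of its own, which is where the stray quote marks at the line ends of the garbled output come from.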
How to produce the correct output
Give it a list of fields instead of the full string by using str.split(). In your case the string ends with \n, so use str.strip() as well:
import csv
out_l = ['host\tuptime\tnfsserver\tnfs status\n',
         'node1\t2\tnfs_host\tok\n',
         'node2\t100\tnfs_host\tna\n',
         'node3\t59\tnfs_host\tok\n']
out_f = open('test.csv', 'w')
w = csv.writer(out_f)
for l in out_l:
    w.writerow(l.strip().split('\t'))
out_f.close()
This should be what you want:
host,uptime,nfsserver,nfs status
node1,2,nfs_host,ok
node2,100,nfs_host,na
node3,59,nfs_host,ok
Reference: https://docs.python.org/3/library/csv.html
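Very simple:
import csv
with open("test.csv", 'w', newline='') as csv_file:
    writer = csv.writer(csv_file, delimiter='\t')
    for item in out_l:
        # split each string into fields; writerow([item]) would write it as one quoted cell
        writer.writerow(item.rstrip('\n').split('\t'))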