How to format txt file in Python - python

I am trying to convert a txt file into a csv file in Python. The txt file currently consists of several strings separated by spaces. I would like to write each string into its own cell in the csv file.
The txt file has the following structure:
UserID Desktop Display (Version) (Server/Port handle), Date
UserID Desktop Display (Version) (Server/Port handle), Date
etc.
My approach is as follows:
with open('licfile.txt', "r+") as in_file:
    stripped = (line.strip() for line in in_file)
    lines = (line.split(" ") for line in stripped if line)
with open('licfile.csv', 'w') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(('user', 'desktop', 'display', 'version', 'server', 'handle', 'date'))
    writer.writerows(lines)
Unfortunately this is not working as expected. I get the following error: ValueError: I/O operation on closed file. Additionally, the intended column headers all end up in a single cell of the csv file.
Any tips on how to proceed? Many thanks in advance.

How about:
with open('licfile.txt', 'r') as in_file, open('licfile.csv', 'w') as out_file:
    for line in in_file:
        if line.strip():
            out_file.write(line.strip().replace(' ', ',') + '\n')
And for the German Excel enthusiasts...
...
...
...
... .replace(' ', ';') + '\n')
:)
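One caveat with the plain replace approach: the sample lines in the question contain a comma of their own (`handle), Date`), so turning every space into a comma produces an extra empty field. Below is a minimal sketch that compares it with `csv.writer`, which quotes such fields. The sample line is copied from the question; `io.StringIO` stands in for the real files and is only an assumption made to keep the example self-contained.

```python
import csv
import io

# Sample line taken from the question.
sample = "UserID Desktop Display (Version) (Server/Port handle), Date"

# Naive approach: every space becomes a comma.
naive = sample.replace(" ", ",")

# csv.writer approach: split into fields and let the writer quote when needed.
buffer = io.StringIO()  # in-memory stand-in for the output file
csv.writer(buffer, lineterminator="\n").writerow(sample.split(" "))
quoted = buffer.getvalue()

# Parse both results back to count the columns a CSV consumer would see.
naive_fields = next(csv.reader([naive]))
quoted_fields = next(csv.reader([quoted]))

print(len(naive_fields), naive_fields)
print(len(quoted_fields), quoted_fields)
```

The naive version parses back into eight fields (one of them empty), while the `csv.writer` version keeps the seven fields of the original line.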

You can also use the built in csv module to accomplish this easily:
import csv
with open('licfile.txt', 'r') as in_file, open('licfile.csv', 'w') as out_file:
    reader = csv.reader(in_file, delimiter=" ")
    writer = csv.writer(out_file, lineterminator='\n')
    writer.writerows(reader)
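I passed lineterminator='\n' because the writer's default is '\r\n'. On Windows, a file opened in text mode without newline='' turns that into '\r\r\n', which shows up as an extra blank line after each row. Opening the output file with newline='' is the alternative recommended in the csv documentation.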
There are also a few arguments you could use if say quoting is needed or a different delimiter is desired: https://docs.python.org/3/library/csv.html#csv-fmt-params
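As a rough illustration of those formatting parameters, here is a small sketch that writes the same rows with the default settings, with a semicolon delimiter, and with every field quoted. The sample rows and the helper name `to_csv_text` are made up for this example; `delimiter`, `quoting` and `lineterminator` are the documented `csv` parameters.

```python
import csv
import io

# Hypothetical sample rows; the value "3,5" contains the default delimiter.
rows = [["user1", "desk1", "3,5"], ["user2", "desk2", "7"]]

def to_csv_text(rows, **fmtparams):
    """Write rows to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n", **fmtparams)
    writer.writerows(rows)
    return buffer.getvalue()

default_text = to_csv_text(rows)                           # quotes only "3,5"
semicolon_text = to_csv_text(rows, delimiter=";")          # no quoting needed
quote_all_text = to_csv_text(rows, quoting=csv.QUOTE_ALL)  # quotes every field

print(default_text)
print(semicolon_text)
print(quote_all_text)
```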

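You are using a comprehension with round brackets, which creates a generator expression, not a list. A generator is evaluated lazily, so the lines are only read when writerows() consumes it, and by then the input file has already been closed. Use square brackets instead, which build a list immediately. See the example below:
stripped = [line.strip() for line in in_file]
lines = [line.split(" ") for line in stripped if line]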
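The difference can be reproduced in isolation. The sketch below uses a throwaway temporary file with made-up content (an assumption for demonstration only) and shows that the round-bracket version fails once the `with` block has closed the file, while the square-bracket version has already read everything.

```python
import os
import tempfile

# Create a small throwaway input file (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a b c\n\nd e f\n")
    path = tmp.name

# Generator expression: nothing has been read yet when the with-block ends.
with open(path) as in_file:
    lazy_lines = (line.strip().split(" ") for line in in_file if line.strip())

try:
    list(lazy_lines)  # the file is closed by now, so reading fails
    lazy_error = None
except ValueError as exc:
    lazy_error = str(exc)

# List comprehension: the file is read completely inside the with-block.
with open(path) as in_file:
    eager_lines = [line.strip().split(" ") for line in in_file if line.strip()]

os.remove(path)
print(lazy_error)
print(eager_lines)
```

Keeping the writing code inside the same `with` block as the generators is the other way to avoid the error, and it avoids holding the whole file in memory.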

import pandas as pd
licfile_df = pd.read_csv('licfile.txt', sep=" ", header=None)  # the fields are separated by spaces, not commas

Related

Python remove ' from string when using CSV writer

I managed to convert the txt file to .csv with Python.
However, now I don't know how to remove the quotes enclosing all strings in my CSV file.
I tried the following code:
import csv
with open('UPRN.txt', 'r') as in_file:
    stripped = (line.strip() for line in in_file)
    lines = (line.split(",") for line in stripped if line)
    with open('UPRN.csv', 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('Name', 'UPRN','ADMIN_AREA','TOWN','STREET','NAME_NUMBER'))
        writer.writerows(lines)
        for lines in writer:
            lines = [x.replace("'","") if x == '*' else x for x in row]
            writer.writerow(lines)
but I am getting an error:
TypeError: '_csv.writer' object is not iterable
The easiest way could be:
Remove quotes from String in Python
but the CSV writer has no attributes like write, replace, etc.
'_csv.writer' object has no attribute 'write'
Moreover, I am not sure if a wildcard is needed here:
Python wildcard search in string
Is there any quick way of removing the quotes when the CSV module is imported?
I think you should rather iterate over your lines list and strip the quotes before the rows reach the writer:
import csv
with open('UPRN.txt', 'r') as in_file:
    # Remove the single quotes while reading; the list is built before the file closes.
    lines = [line.strip().replace("'","") for line in in_file]
with open('UPRN.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(('Name', 'UPRN','ADMIN_AREA','TOWN','STREET','NAME_NUMBER'))
    for line in lines:
        writer.writerow(line.split(","))
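A variation on the same idea, as a hedged sketch: strip only the quotes that enclose each field, so an apostrophe inside a value survives. The input text, the shortened header and the helper name `rows_without_quotes` are assumptions made for illustration, since the real UPRN.txt is not shown.

```python
import csv
import io

# Hypothetical input in the shape the question implies: values wrapped in single quotes.
raw_text = "'Alpha','100','North'\n\n'Beta','200','South'\n"

def rows_without_quotes(lines):
    """Yield one list of fields per non-empty line, with enclosing single quotes removed."""
    for line in lines:
        line = line.strip()
        if line:
            yield [field.strip().strip("'") for field in line.split(",")]

out = io.StringIO()  # in-memory stand-in for UPRN.csv
writer = csv.writer(out, lineterminator="\n")
writer.writerow(("Name", "UPRN", "ADMIN_AREA"))
writer.writerows(rows_without_quotes(io.StringIO(raw_text)))
result = out.getvalue()
print(result)
```

Unlike `replace("'", "")`, `strip("'")` touches only the ends of each field, which matters for values such as O'Neil.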

Converting a Text .txt document to CSV .csv Using a Delimiter

I'd like to create a CSV from a TXT file. I have a text file with lines (300 lines+) separated by backslashes. I'd like each line to be a separate row, and each backslash to be a separate new column.
The text file looks like:
example 1\example 2\example 3\example 4
test 1\test 2\test 3\test 4
I'd like the CSV to look like:
Example 1    Example 2    Example 3    Example 4
Test 1       Test 2       Test 3       Test 4
So far I have:
import csv
with open('Report.txt') as report:
    report_txt = report.read()
with open('Report.csv','w',newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(report_txt)
I know I need to use \ as a delimiter, but I'm not sure how. Thanks for any help!
Define your delimiter like this (escape the \):
reader = csv.reader(open("Report.txt"), delimiter="\\")
Code:
import csv
with open('Report.txt') as report:
    reader = csv.reader(report, delimiter="\\")
    # Write while the input file is still open: the reader is lazy.
    with open('Report_output.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for line in reader:
            writer.writerow(line)
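The same reader and writer pairing can be tried without touching the disk. This is a small sketch using the two sample lines from the question; `io.StringIO` is an assumption standing in for Report.txt and the output file.

```python
import csv
import io

# The two sample lines from the question (backslashes doubled in the literal).
report_text = "example 1\\example 2\\example 3\\example 4\ntest 1\\test 2\\test 3\\test 4\n"

reader = csv.reader(io.StringIO(report_text), delimiter="\\")
out = io.StringIO()  # in-memory stand-in for the output csv
writer = csv.writer(out, lineterminator="\n")
writer.writerows(reader)  # each backslash-separated value becomes one column

csv_text = out.getvalue()
print(csv_text)
```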
First you have to split each line on the delimiter. You can do this with the split method or a regex.
import csv
with open('file.txt', 'r') as in_file:
    # Lists, not generators: the data must be read before the file closes.
    stripped = [line.strip() for line in in_file]
    lines = [line.split("\\") for line in stripped if line]
Then write the result to the csv.
with open('report.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    writer.writerows(lines)
Tweak your code accordingly; the concept is the same. Note that the backslash is doubled because a single backslash is the escape character in Python string literals.
If you are just trying to convert that text into CSV, you can replace every "\" character with ";" and you'll have a semicolon-delimited file, as long as no field itself contains a ";" or a quote character.
Otherwise, if you want to do something with the parsed data before re-exporting to CSV, you can read the file line by line, use the split() method with "\\", then rejoin and write line by line, like this:
with open('in.txt') as input_file:
    with open('out.csv','a') as output_file:
        txt_line = input_file.readline()
        while txt_line:
            cells = txt_line.split("\\")
            # Do something with each cell...
            csv_line = ";".join(cells)
            output_file.write(csv_line)
            txt_line = input_file.readline()
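As one possible instance of the "do something with each cell" step, the sketch below capitalizes each cell, which matches the capitalized values in the desired output of the question. It swaps the manual `";".join` for `csv.writer` with `delimiter=";"`; the shortened sample data and the in-memory buffers are assumptions for illustration.

```python
import csv
import io

# Shortened sample in the question's format; StringIO stands in for in.txt and out.csv.
source = io.StringIO("example 1\\example 2\ntest 1\\test 2\n")
target = io.StringIO()
writer = csv.writer(target, delimiter=";", lineterminator="\n")

for line in source:
    cells = line.rstrip("\n").split("\\")
    # "Do something with each cell": here, trim and capitalize the first letter.
    cells = [cell.strip().capitalize() for cell in cells]
    writer.writerow(cells)

converted = target.getvalue()
print(converted)
```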

Inserting a comma in between columns in text file

The problem is that I have a text (CSV) file that is missing its commas, and I would like to insert them so I can use the file in LaTeX to make a table. I have an MWE from another question, which I ran, but it did not work. Could someone guide me on how to change it?
I have tried one Python script that produces a blank file, another that produces a blank document, and another that removes the spaces.
import fileinput
input_file = 'C:/Users/Light_Wisdom/Documents/Python Notes/test.txt'
output= open('out.txt','w+')
with open('out.txt', 'w+') as output:
    for each_line in fileinput.input(input_file):
        output.write("\n".join(x.strip() for x in each_line.split(',')))
The text file contains more numbers, but it looks like this:
0 2.58612
0.00616025 2.20018
0.0123205 1.56186
0.0184807 0.371172
0.024641 0.327379
0.0308012 0.368863
0.0369615 0.322228
0.0431217 0.171899
Outcome
0.049282, -0.0635003
0.0554422, -0.110747
0.0616025, 0.0701394
0.0677627, 0.202381
0.073923, 0.241264
0.0800832, 0.193697
Renewed attempt:
with open("CSV.txt","r") as file:
    new = list(map(lambda x: ''.join(x.split()[0:1]+[","]+x.split()[0:2]),file.readlines()))
with open("New_CSV.txt","w+") as output:
    for i in new:
        output.writelines(i)
        output.writelines("\n")
This can be done using .split and .join: split each line into a list, then join the list with commas. Calling split() without arguments also handles several consecutive spaces in the file:
f1 = open(input_file, "r")
with open("out.txt", 'w') as f2:
    for line in f1:
        f2.write(",".join(line.split()) + "\n")
f1.close()
You can also use csv to handle the writing automatically:
import csv
f1 = open(input_file, "r")
with open("out.txt", 'w') as f2:
    writer = csv.writer(f2)
    for line in f1:
        writer.writerow(line.split())
f1.close()
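To see how the argument-free `split()` copes with irregular spacing, here is a short sketch. The first rows come from the question, with extra spaces, a tab and a blank line added on purpose (an assumption for demonstration); `io.StringIO` replaces the real files.

```python
import csv
import io

# Rows from the question, with uneven whitespace added on purpose.
data = "0   2.58612\n0.00616025 2.20018\n\n0.0123205\t1.56186\n"

out = io.StringIO()  # in-memory stand-in for out.txt
writer = csv.writer(out, lineterminator="\n")
for line in io.StringIO(data):
    fields = line.split()  # no argument: split on any run of whitespace
    if fields:             # skip blank lines
        writer.writerow(fields)

comma_separated = out.getvalue()
print(comma_separated)
```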

Python - save csv file with tab separated words in separate cell

I have this input file:
one\tone
two\ttwo
three\tthree
With a tab between each word.
I am trying to save it in a csv file where each word ends up in its own cell. This is my code:
import csv
input = open('input.txt').read()
lines = input.split('\n')
with open('output.csv', 'w') as f:
    writer = csv.writer(f)
    for line in lines:
        writer.writerow([line])
However, both words end up in the same cell:
How do I change the code so that each word ends up in its own cell?
Try this:
import csv
input = open('input.txt').read()
lines = input.split('\n')
with open('output.csv', 'w') as f:
    writer = csv.writer(f)
    for line in lines:
        writer.writerow(line.split('\t'))
The writerow method of the csv writer takes a list of column values.
Currently, you are providing your whole string as the value of the first column:
writer.writerow([line])
Instead, split the string on \t to create a list of the individual words, and pass that list to the writer:
writer.writerow(line.split("\t"))
You need to split the input lines into a list, so that csv.writer() will put them into separate columns. Try:
with open('output.csv', 'w') as f:
    writer = csv.writer(f)
    for line in lines:
        writer.writerow(line.split('\t'))
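An alternative worth considering is to let `csv.reader` do the splitting by passing `delimiter="\t"`. It also sidesteps the empty last element that `input.split('\n')` produces when the file ends with a newline. A minimal sketch with the question's three lines, using `io.StringIO` as a stand-in for the real files:

```python
import csv
import io

# The three tab-separated lines from the question.
tab_text = "one\tone\ntwo\ttwo\nthree\tthree\n"

reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
out = io.StringIO()  # in-memory stand-in for output.csv
csv.writer(out, lineterminator="\n").writerows(reader)

cells_text = out.getvalue()
print(cells_text)
```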

Overwrite the first and last column in csv file using python

I am new to data processing with the csv module. I have an input file and I am using this code:
import csv
path1 = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\charity.a.data"
csv_file_path = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\output.csv.bak"
with open(path1, 'r') as in_file:
    in_file.__next__()
    stripped = (line.strip() for line in in_file)
    lines = (line.split(":$%:") for line in stripped if line)
    with open(csv_file_path, 'w') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('id', 'donor_id','last_name','first_name','year','city','state','postal_code','gift_amount'))
        writer.writerows(lines)
Is it possible to remove the colon (:) in the first and last column of the CSV file? That is the output I am after.
Please help me.
If you just want to eliminate the ':' at the start of the first column and the end of the last column, this should work. Keep in mind that your dataset should be tab (or something other than comma) separated before you read it because, as I commented on your question, there are commas ',' in your data.
path1 = '/path/input.csv'
path2 = '/path/output.csv'
with open(path1, 'r') as infile, open(path2, 'w') as outfile:
    outfile.write(next(infile))  # copy the header line unchanged
    for row in infile:
        # drop the trailing newline, then the first and last character
        outfile.write(row.rstrip('\n')[1:-1] + '\n')
Update
After you posted your code, I made a small change so that the whole process starts from the initial file. The idea is the same: exclude the first and the last character of each line. So instead of line.strip() you should have line.strip()[1:-1].
import csv
path1 = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\charity.a.data"
csv_file_path = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\output.csv.bak"
with open(path1, 'r') as in_file:
    next(in_file)  # skip the first line
    # strip whitespace, then drop the leading and trailing ':'
    stripped = (line.strip()[1:-1] for line in in_file)
    lines = (line.split(":$%:") for line in stripped if line)
    # write while in_file is still open, because the generators are lazy
    with open(csv_file_path, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('id', 'donor_id','last_name','first_name','year','city','state','postal_code','gift_amount'))
        writer.writerows(lines)
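Since the actual input file is not shown, the sketch below works on made-up lines that follow the described pattern: fields separated by `:$%:` with a stray colon at both ends. The sample data, the shortened header and the helper name `parse_line` are all assumptions. It also shows that `csv.writer` quotes a value containing a comma, which addresses the concern about commas in the data.

```python
import csv
import io

# Hypothetical lines: fields separated by ":$%:" with a stray ":" at both ends.
raw_lines = [
    "header line to skip\n",
    ":1:$%:D100:$%:Smith, Jr.:$%:25.00:\n",
    "\n",
    ":2:$%:D200:$%:Jones:$%:10.50:\n",
]

def parse_line(line, sep=":$%:"):
    """Trim whitespace, drop one leading and one trailing colon, split on sep."""
    line = line.strip()
    if line.startswith(":"):
        line = line[1:]
    if line.endswith(":"):
        line = line[:-1]
    return line.split(sep)

lines = iter(raw_lines)
next(lines)  # skip the first line, as the original code does
rows = [parse_line(line) for line in lines if line.strip()]

out = io.StringIO()  # in-memory stand-in for the output csv
writer = csv.writer(out, lineterminator="\n")
writer.writerow(("id", "donor_id", "last_name", "gift_amount"))
writer.writerows(rows)

output_text = out.getvalue()
print(output_text)
```

Checking for the colon before removing it makes the helper safe for lines that do not carry the stray character, at the cost of a few extra lines compared with a fixed slice.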
