I'm trying to read sentences from a CSV file, convert them to lowercase, and save them in another CSV file.
import csv
import pprint
with open('dataset_elec_4000.csv') as f:
    with open('output.csv', 'w') as ff:
        data = f.read()
        data = data.lower
        writer = csv.writer(ff)
        writer.writerow(data)
but I got the error "_csv.Error: sequence expected". What should I do?
*I'm a beginner. Please be nice to me:)
You need to read over your input CSV row-by-row, and for each row, transform it, then write it out:
import csv
with open('output.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    with open('dataset_elec_4000.csv', newline='') as f_in:
        reader = csv.reader(f_in)
        # comment these two lines if no input header
        header = next(reader)
        writer.writerow(header)
        for row in reader:
            # row is sequence/list of cells, so...
            # select the cell with your sentence, I'm presuming it's the first cell (row[0])
            data = row[0]
            data = data.lower()
            # need to put data back into a "row"
            out_row = [data]
            writer.writerow(out_row)
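If the sentences can appear in any column, a variation of the loop above lowercases every cell instead of only row[0]. This is just a minimal sketch: the helper name lowercase_csv, the has_header flag, and the sample data are made up for illustration and are not part of the original answer.

```python
import csv
import io


def lowercase_csv(f_in, f_out, has_header=True):
    """Copy CSV rows from f_in to f_out, lowercasing every cell."""
    reader = csv.reader(f_in)
    writer = csv.writer(f_out)
    if has_header:
        header = next(reader, None)  # None if the input is empty
        if header is not None:
            writer.writerow(header)  # header is copied unchanged
    for row in reader:
        writer.writerow([cell.lower() for cell in row])


# In-memory demonstration; with real files, open both with newline=''.
source = io.StringIO("Review,Label\r\nGreat PHONE,1\r\nBad Battery,0\r\n")
result = io.StringIO()
lowercase_csv(source, result)
print(result.getvalue())
```

With real files you would pass the two objects returned by open(..., newline='') in place of the StringIO buffers.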
Python includes a module called csv for handling CSV files. Its reader() function is used to read data from a CSV file. First, the file is opened with open() in 'r' mode (read mode), which returns a file object. That file object is then passed to csv.reader(), which returns a reader object that iterates over the lines of the CSV file and yields each row as a list of strings.
import csv
# opening the CSV file
with open('Giants.csv', mode='r') as file:
    # reading the CSV file
    csvFile = csv.reader(file)
    # displaying the contents of the CSV file
    for lines in csvFile:
        print(lines)
Related
I am trying to use the Python CSV reader to read a CSV file that I extract from a .tar.gz file using Python's tarfile library.
I have this:
tarFile = tarfile.open(name=tarFileName, mode="r")
for file in tarFile.getmembers():
    tarredCSV = tarFile.extractfile(file)
    reader = csv.reader(tarredCSV)
    next(reader) # skip header
    for row in reader:
        if row[3] not in CSVRows.values():
            CSVRows[row[3]] = row
All the files in the tar file are CSVs.
I am getting an exception on the first file, at the next(reader) line:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
How do I open said file (without extracting the file then opening it)?
tarfile.extractfile returns an io.BufferedReader object, which is a bytes stream, whereas csv.reader expects a text stream. You can use io.TextIOWrapper to convert the bytes stream to a text stream:
import io
...
reader = csv.reader(io.TextIOWrapper(tarredCSV, encoding='utf-8'))
You need to provide a file-like object to csv.reader.
Probably the best solution, because it does not have to consume a complete file at once, is this approach (thanks to blhsing and damon for suggesting it):
import csv
import io
import tarfile
tarFile = tarfile.open(name=tarFileName, mode="r")
for file in tarFile.getmembers():
    csv_file = io.TextIOWrapper(tarFile.extractfile(file), encoding="utf-8")
    reader = csv.reader(csv_file)
    next(reader) # skip header
    for row in reader:
        print(row)
Alternatively, a possible solution taken from "Python3 working with csv files in tar files" would be:
import csv
import io
import tarfile
tarFile = tarfile.open(name=tarFileName, mode="r")
for file in tarFile.getmembers():
    csv_file = io.StringIO(tarFile.extractfile(file).read().decode('utf-8'))
    reader = csv.reader(csv_file)
    next(reader) # skip header
    for row in reader:
        print(row)
Here an io.StringIO object is used to make csv.reader happy. However, this might not scale well for larger files contained in the tar, as each file is read into memory in one single step.
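To try the io.TextIOWrapper approach without a real archive, here is a self-contained sketch that builds a small tar file in memory and reads it back. The generator name iter_tar_csv_rows, the member name sample.csv, and the sample rows are assumptions made for the demonstration. It also skips members that are not regular files, since extractfile() returns None for those.

```python
import csv
import io
import tarfile


def iter_tar_csv_rows(tar):
    """Yield (member name, row) for every regular file in an open TarFile."""
    for member in tar.getmembers():
        if not member.isfile():
            continue  # extractfile() returns None for directories
        raw = tar.extractfile(member)
        with io.TextIOWrapper(raw, encoding="utf-8", newline="") as text:
            reader = csv.reader(text)
            next(reader, None)  # skip header
            for row in reader:
                yield member.name, row


# Build a tiny archive in memory so the sketch is self-contained.
payload = "id,name\r\n1,alpha\r\n2,beta\r\n".encode("utf-8")
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    info = tarfile.TarInfo(name="sample.csv")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    rows = list(iter_tar_csv_rows(tar))
print(rows)
```

Because the rows are produced lazily, only one line of one member needs to be decoded at a time, which is the point of preferring TextIOWrapper over read().decode().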
This is my code:
import pandas as pd
import csv
with open('reviews.csv') as myFile:
    reader = csv.reader(myFile)
with open('bow.csv','a',newline="") as file:
    handler= csv.writer(file)
    for rowdata in reader:
        handler.writerow({rowdata,'asd'})
The error is ValueError: I/O operation on closed file.
csv.reader() can only read from an open file. When you exit the first with block, myFile is automatically closed, so reader can't read from it any more.
You need to keep the input file open while you read from it.
import pandas as pd
import csv
with open('reviews.csv') as myFile:
    reader = csv.reader(myFile)
    with open('bow.csv','a',newline="") as file:
        handler= csv.writer(file)
        for rowdata in reader:
            # rowdata is a list, so it can't go inside a set literal;
            # append the extra cell to the row instead
            handler.writerow(rowdata + ['asd'])
You can also open multiple files in a single with statement, so you don't need to nest them.
with open('reviews.csv') as myFile, open('bow.csv','a',newline="") as file:
    reader = csv.reader(myFile)
    handler= csv.writer(file)
    for rowdata in reader:
        handler.writerow(rowdata + ['asd'])
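Another option, if the input is small enough to fit in memory, is to read all rows into a list while the file is still open; the list stays usable after the file is closed. The sketch below uses in-memory buffers and made-up sample rows in place of reviews.csv and bow.csv.

```python
import csv
import io

# Stand-in for open('reviews.csv', newline=''); the rows are sample data.
source = io.StringIO("good,5\r\nbad,1\r\n")

with source:
    # csv.reader is lazy, so materialise the rows before the file closes.
    rows = list(csv.reader(source))

# The input is closed at this point, but the list of rows is still usable.
out = io.StringIO()  # stand-in for open('bow.csv', 'a', newline='')
writer = csv.writer(out)
for row in rows:
    writer.writerow(row + ["asd"])

print(out.getvalue())
```

For large inputs, keeping both files open and streaming row by row, as shown above, avoids holding everything in memory.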
I have a complete dataset of tweets which I collected through Tweepy and saved as a JSON file. Now I want to convert that data to a CSV file containing only the fields I need: text, username, created at, and 4-5 more columns.
How can I do this? Can anyone please provide Python code for it? Another problem is that when I convert the data to CSV, my tweet text gets split wherever a comma occurs.
Please help. I am new to this field.
Thanks in advance.
You would need to read your file in and convert each non-empty line from json format. You could then use itemgetter() to extract the required keys from the resulting dictionary and write the results to your output.csv file:
from operator import itemgetter
import csv
import json
header = ['text', 'username', 'created_at']
required_cols = itemgetter(*header)
with open('python1.json') as f_input, open('output.csv', 'wb') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(header)
    for row in f_input:
        if row.strip():
            csv_output.writerow(required_cols(json.loads(row)))
If you are using Python 3.x, use the following line:
with open('python1.json') as f_input, open('output.csv', 'w', newline='') as f_output:
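On the comma problem from the question: the csv module quotes any cell that contains the delimiter, so tweet text with commas stays in a single column as long as the file is written through csv.writer or csv.DictWriter rather than by joining strings by hand. Below is a minimal Python 3 sketch using csv.DictWriter with extrasaction='ignore' as an alternative to itemgetter(); the two tweets and their keys are invented sample data, not real Tweepy output.

```python
import csv
import io
import json

fields = ["text", "username", "created_at"]

# Two hypothetical tweets, one JSON object per line, with a blank line between.
json_lines = io.StringIO(
    '{"text": "Hello, world", "username": "alice", '
    '"created_at": "2020-01-01", "lang": "en"}\n'
    "\n"
    '{"text": "No comma here", "username": "bob", "created_at": "2020-01-02"}\n'
)

output = io.StringIO()  # stand-in for open('output.csv', 'w', newline='')
# extrasaction='ignore' drops keys that are not listed in fieldnames.
writer = csv.DictWriter(output, fieldnames=fields, extrasaction="ignore")
writer.writeheader()
for line in json_lines:
    if line.strip():
        writer.writerow(json.loads(line))

print(output.getvalue())

# Reading the result back shows the comma stayed inside one cell.
rows = list(csv.reader(io.StringIO(output.getvalue())))
print(rows[1][0])
```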
I have a CSV file and I want to transfer the raw data, without the headers, to a new CSV file that keeps the same rows and columns as the original.
IRIS_data = "IRIS_data.csv"
with open(IRIS_data, 'wb') as data:
    wr = csv.writer(data, quoting=csv.QUOTE_ALL)
    with open(IRIS) as f:
        next(f)
        for line in f:
            wr.writerow(line)
The code above is my most recent attempt. When I try to run it I get the following error:
a bytes-like object is required, not 'str'
It's because you opened the output file with open(IRIS_data, 'wb'), which opens it in binary mode, and the input file with just open(IRIS), which opens it in text mode.
In Python 3, you should open both files in text mode and specify the newline='' option (see the examples in the csv module's documentation).
To fix it, change them as follows:
with open(IRIS_data, 'w', newline='') as data:
and
with open(IRIS, newline='') as f:
However, there are other issues with your code. Here's how to use those statements to get what I think you want:
import csv
IRIS = "IRIS.csv"
IRIS_data = "IRIS_data.csv"
with open(IRIS, 'r', newline='') as f, open(IRIS_data, 'w', newline='') as data:
    next(f) # Skip over header in input file.
    writer = csv.writer(data, quoting=csv.QUOTE_ALL)
    writer.writerows(line.split() for line in f)
Contents of IRIS_data.csv file after running the script with your sample input data:
"6.4","2.8","5.6","2.2","2"
"5","2.3","3.3","1","1"
"4.9","2.5","4.5","1.7","2"
"4.9","3.1","1.5","0.1","0"
"5.7","3.8","1.7","0.3","0"
"4.4","3.2","1.3","0.2","0"
"5.4","3.4","1.5","0.4","0"
"6.9","3.1","5.1","2.3","2"
"6.7","3.1","4.4","1.4","1"
"5.1","3.7","1.5","0.4","0"
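The line.split() call above splits on whitespace. If the input were comma-separated instead, a csv.reader on the input side would do the parsing. The following is only a sketch under that assumption: the function name copy_without_header, the column names, and the sample values are hypothetical.

```python
import csv
import io


def copy_without_header(f_in, f_out):
    """Copy every row except the first from f_in to f_out, quoting all cells."""
    reader = csv.reader(f_in)
    next(reader, None)  # skip the header row, if there is one
    writer = csv.writer(f_out, quoting=csv.QUOTE_ALL)
    writer.writerows(reader)


# In-memory stand-ins for the two files opened with newline=''.
src = io.StringIO("sepal_length,sepal_width,species\r\n6.4,2.8,2\r\n5,2.3,1\r\n")
dst = io.StringIO()
copy_without_header(src, dst)
print(dst.getvalue())
```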
You have to encode the line you are writing, like this:
wr.writerow(line.encode("utf8"))
Also open your file using open(..., 'wb'), which opens it in binary mode, so you can be certain the file really is in binary mode. It is better to know the encoding explicitly than to assume it. Enforcing the encoding for both reading and writing will save you a lot of trouble.
I am generating a number of csv files dynamically, using the following code:
import csv
fieldnames = ['foo1', 'foo2', 'foo3', 'foo4']
with open(csvfilepath, 'wb') as csvfile:
    csvwrite = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    csvwrite.writeheader()
    for row in data:
        csvwrite.writerow(row)
To save space, I want to compress them.
Using the gzip module is quite easy:
with gzip.open("foo.gz", "w") as csvfile:
    csvwrite = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    csvwrite.writeheader()
    for row in data:
        csvwrite.writerow(row)
But I want the file in 'zip' format.
I tried the zipfile module, but I am unable to write files directly into the zip archive.
Instead, I have to write the CSV file to disk, compress it into a zip file using the following code, and then delete the CSV file.
with ZipFile(zipfilepath, 'w') as zipfile:
    zipfile.write(csvfilepath, csvfilename, ZIP_DEFLATED)
How can I write a CSV file directly to a compressed zip, similar to gzip?
Use the cStringIO.StringIO object to imitate a file:
with ZipFile(your_zip_file, 'w', ZIP_DEFLATED) as zip_file:
    string_buffer = StringIO()
    writer = csv.writer(string_buffer)
    # Write data using the writer object.
    zip_file.writestr(filename + '.csv', string_buffer.getvalue())
Thanks, kroolik.
I got it working with a small modification:
with ZipFile(your_zip_file, 'w', ZIP_DEFLATED) as zip_file:
    string_buffer = StringIO()
    csvwrite = csv.DictWriter(string_buffer, delimiter=',', fieldnames=fieldnames)
    csvwrite.writeheader()
    for row in cdrdata:
        csvwrite.writerow(row)
    zip_file.writestr(filename + '.csv', string_buffer.getvalue())
Using StringIO to hold every byte in memory could consume a lot of memory.
According to the zipfile module documentation, after creating a ZipFile object, each individual file inside the archive has to be opened. Like this:
with ZipFile('spam.zip') as myzip:
    with myzip.open('eggs.txt') as myfile:
        print(myfile.read())
The same pattern can be used for writing as well...
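For writing, here is a minimal sketch of that idea, assuming Python 3.6 or later, where ZipFile.open() accepts mode 'w' and returns a binary stream. Wrapping that stream in io.TextIOWrapper gives the csv module the text file it expects, so rows are compressed as they are written instead of being collected in a StringIO first. The field names, sample rows, and member name data.csv are placeholders, and an in-memory BytesIO stands in for a real zip path.

```python
import csv
import io
import zipfile

fieldnames = ["foo1", "foo2"]
data = [{"foo1": "a", "foo2": "b"}, {"foo1": "c", "foo2": "d"}]  # sample rows

archive = io.BytesIO()  # stands in for a path such as 'spam.zip'
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    # ZipFile.open(..., 'w') yields a writable binary stream (Python 3.6+).
    with zf.open("data.csv", "w") as raw:
        # TextIOWrapper encodes text to bytes; closing it flushes and closes raw.
        with io.TextIOWrapper(raw, encoding="utf-8", newline="") as text:
            writer = csv.DictWriter(text, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(data)

# Read the member back to check what was stored.
with zipfile.ZipFile(archive) as zf:
    stored = zf.read("data.csv").decode("utf-8")
print(stored)
```

Only one member can be open for writing at a time, so the CSV files would have to be written into the archive one after another.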