csv python questions - python

I am opening a CSV file like this:
import csv
reader = csv.reader(open("book1.csv", "rb"))
for row in reader:
    print row
How can I replace the value in column 3 with its log and then save the result into a new CSV?

Like this?
>>> import csv
>>> from math import log
>>> lines = "1,2,3\n4,5,6\n7,8,9".splitlines()
>>> reader = csv.reader(lines)
>>> for row in reader:
...     row[2] = log(float(row[2]))
...     print ','.join(map(str, row))
...
1,2,1.09861228867
4,5,1.79175946923
7,8,2.19722457734

These links might help:
http://docs.python.org/library/csv.html#csv.writer
http://docs.python.org/tutorial/datastructures.html?highlight=array
Each row returned by the reader is a list. Lists in Python are 0-based (so to access the third entry in a row, you would use row[2]).
That should help you on your way.

You should use the with statement (a context manager) for files: it is cleaner, needs less code, and removes the need for explicit file.close() calls.
For example:
import csv
import math
with open('book1.csv', 'rb') as f1, open('book2.csv', 'wb') as f2:
    reader = csv.reader(f1)
    writer = csv.writer(f2)
    for row in reader:
        # Replace the third column with its natural log
        row[2] = str(math.log(float(row[2])))
        writer.writerow(row)
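The answers above are written for Python 2 ('rb'/'wb' modes, print statement). A minimal Python 3 sketch of the same idea follows, assuming every row has a positive numeric value in the third column; log_column is a made-up helper name, and io.StringIO stands in for the real files:

```python
import csv
import io
import math


def log_column(src, dst, column=2):
    """Copy CSV rows from src to dst, replacing one column with its natural log."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    for row in reader:
        row[column] = str(math.log(float(row[column])))
        writer.writerow(row)


# In-memory demo. With real files, open both with newline="" in Python 3,
# e.g. open("book1.csv", newline="") and open("book2.csv", "w", newline="").
source = io.StringIO("1,2,3\n4,5,6\n7,8,9\n")
result = io.StringIO()
log_column(source, result)
print(result.getvalue())
```

A header row would make float() raise ValueError, so skip it with next(reader) first if the file has one.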

Related

Copy specific rows from csv to csv in Python 2.7

So far I have been trying to copy specific rows, including headers, from the original CSV file to a new one. However, when I run my code it copies a total mess and creates a huge document.
This is one of the options I have tried so far, which seems to be the closest to the solution:
import csv
with open('D:/test.csv', 'r') as f, open('D:/out.csv', 'w') as f_out:
    reader = csv.DictReader(f)
    writer = csv.writer(f_out)
    for row in reader:
        if row["ICLEVEL"] == "1":
            writer.writerow(row)
The thing is that I have to copy only those rows where the value of "ICLEVEL" (a header name) is equal to "1".
Note: test.csv is a very large file and I cannot hardcode all the header names in the writer.
Any demonstration of a Pythonic way of doing this is greatly appreciated. Thanks.
writer.writerow expects a sequence (a tuple or list). You can use DictWriter, which expects a dict.
import csv
with open('D:/test.csv', 'r') as f, open('D:/out.csv', 'w') as f_out:
    reader = csv.DictReader(f)
    writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames)
    writer.writeheader()  # For writing header
    for row in reader:
        if row['ICLEVEL'] == '1':
            writer.writerow(row)
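For reference, here is a self-contained Python 3 sketch of the DictReader/DictWriter approach that can be run without the original file. The helper name filter_rows and the UNITID column are made up for the demo; only ICLEVEL comes from the question.

```python
import csv
import io


def filter_rows(src, dst, column, wanted):
    """Write the header plus every row whose `column` equals `wanted`."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        # DictReader yields strings, so compare with "1", not the integer 1
        if row[column] == wanted:
            writer.writerow(row)
            kept += 1
    return kept


sample = io.StringIO("UNITID,ICLEVEL\n100,1\n200,2\n300,1\n")
out = io.StringIO()
kept = filter_rows(sample, out, "ICLEVEL", "1")
print(kept)
print(out.getvalue())
```

Because the field names are taken from reader.fieldnames, nothing about the header has to be hardcoded.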
Your row is a dictionary. The CSV writer cannot write dictionaries. Select the values from the dictionary and write just them:
writer.writerow(reader.fieldnames)
for row in reader:
    if row["ICLEVEL"] == "1":
        values = [row[field] for field in reader.fieldnames]
        writer.writerow(values)
I would actually use Pandas, not a CSV reader:
import pandas as pd
df = pd.read_csv("D:/test.csv")
newdf = df[df["ICLEVEL"] == 1]
newdf.to_csv("D:/out.csv", index=False)
The code is much more compact.

How to read one single line of csv data in Python?

There are a lot of examples of reading CSV data using Python, like this one:
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
I only want to read one line of data and enter it into various variables. How do I do that? I've looked everywhere for a working example.
My code only retrieves the value for i, and none of the other values:
reader = csv.reader(csvfile, delimiter=',', quotechar='"')
for row in reader:
    i = int(row[0])
    a1 = int(row[1])
    b1 = int(row[2])
    c1 = int(row[2])
    x1 = int(row[2])
    y1 = int(row[2])
    z1 = int(row[2])
To read only the first row of the CSV file, use next() on the reader object.
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    # now do something here
    # if the first row is the header, call next() once more to get the next row:
    # row2 = next(reader)
or:
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # do something here with `row`
        break
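To tie this back to the question (one line of data spread over several variables), the row returned by next() can be unpacked directly. A small sketch, assuming the line holds exactly seven integer fields; the sample data is made up and the variable names are the ones from the question:

```python
import csv
import io

# Stand-in for open('some.csv', newline='')
sample = io.StringIO("7,1,2,3,4,5,6\n8,9,9,9,9,9,9\n")

reader = csv.reader(sample)
first = next(reader, None)  # None instead of StopIteration if the file is empty

if first is not None:
    # Unpacking raises ValueError if the row does not have exactly 7 fields
    i, a1, b1, c1, x1, y1, z1 = (int(value) for value in first)
    print(i, a1, b1, c1, x1, y1, z1)
```

Each variable gets its own column this way, which avoids repeating the same index by accident.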
You could get just the first row like this:
with open('some.csv', newline='') as f:
    csv_reader = csv.reader(f)
    csv_headings = next(csv_reader)
    first_line = next(csv_reader)
You can use the Pandas library to read just the first few lines of a huge dataset.
import pandas as pd
data = pd.read_csv("names.csv", nrows=1)
You can specify the number of lines to read with the nrows parameter.
Just for reference, a for loop can be used after getting the first row to get the rest of the file:
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    for row in reader:
        print(row)  # prints rows 2 and onward
From the Python documentation:
And while the module doesn't directly support parsing strings, it can easily be done:
import csv
for row in csv.reader(['one,two,three']):
    print row
Just drop your string data into a singleton list.
A simple way to get any row in a CSV file:
import csv
csvFileArray = []
with open('some.csv', newline='') as csvfile:
    for row in csv.reader(csvfile, delimiter=','):
        csvFileArray.append(row)
print(csvFileArray[0])
To print a range of lines, in this case the rows at zero-based indices 4 through 6 (the fifth through seventh lines of the file):
import csv
with open('california_housing_test.csv') as csv_file:
    data = csv.reader(csv_file)
    for row in list(data)[4:7]:
        print(row)
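Note that list(data) reads the whole file before slicing. For a large file, itertools.islice can stop as soon as the last wanted row has been read. A minimal sketch, assuming one record per physical line (no quoted fields containing newlines); rows_between is a made-up helper name and it takes 1-based line numbers:

```python
import csv
import io
from itertools import islice


def rows_between(src, first_line, last_line):
    """Return the rows for 1-based line numbers first_line..last_line inclusive."""
    reader = csv.reader(src)
    # islice uses zero-based start and an exclusive stop
    return list(islice(reader, first_line - 1, last_line))


# Ten sample lines: row1,x ... row10,x
sample = io.StringIO("".join("row%d,x\n" % n for n in range(1, 11)))
selected = rows_between(sample, 4, 7)
print(selected)
```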
I think the simplest way is the best way, and in this case (as in most others) that means not using external libraries (pandas) or modules (csv). So, here is the simple answer.
# No need to give any mode; keep it simple.
with open('some.csv') as f:
    # Store the first line in a variable to be used later.
    my_line = f.readline()
    # Do what you like with my_line now.

Add a column from a csv to another csv

I have 2 files named input.csv (composed of one column, count) and output.csv (composed of one column, id).
I want to paste my count column into output.csv, just after the id column.
Here is my snippet:
with open("/home/julien/input.csv", "r") as csvinput:
    with open("/home/julien/excel/output.csv", "a") as csvoutput:
        writer = csv.writer(csvoutput, delimiter=";")
        for row in csv.reader(csvinput, delimiter=";"):
            if row[0] != "":
                result = row[0]
            else:
                result = ""
            row.append(result)
            writer.writerow(row)
But it doesn't work.
I've been searching for the problem for many hours but I've got no solution. Would you have any tricks to solve my problem?
Thanks! Julien
You need to work with three files, two for reading and one for writing.
This should work.
import csv
in_1_name = "/home/julien/input.csv"
in_2_name = "/home/julien/excel/output.csv"
out_name = "/home/julien/excel/merged.csv"
with open(in_1_name) as in_1, open(in_2_name) as in_2, open(out_name, 'w') as out:
    reader1 = csv.reader(in_1, delimiter=";")
    reader2 = csv.reader(in_2, delimiter=";")
    writer = csv.writer(out, delimiter=";")
    for row1, row2 in zip(reader1, reader2):
        if row1[0] and row2[0]:
            writer.writerow([row1[0], row2[0]])
You write the row for each column:
    row.append(result)
    writer.writerow(row)
Dedent the last line to write only once:
    row.append(result)
writer.writerow(row)
Open both files for input.
Open a new file for output.
In a loop, read a line from each and format an output line, which is then written to the output file.
Close all the files.
Programmatically copy your output file on top of the input file "output.csv".
Done.
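The steps above can be sketched as follows in Python 3. This is only an illustration under a few assumptions: both files use ";" as delimiter, rows are paired by position, and the result should replace the target file. append_column is a made-up helper name, and the demo uses a temporary folder instead of the paths from the question.

```python
import csv
import os
import tempfile


def append_column(target_path, extra_path, delimiter=";"):
    """Append the first column of extra_path to each row of target_path."""
    directory = os.path.dirname(os.path.abspath(target_path))
    with open(target_path, newline="") as target, \
         open(extra_path, newline="") as extra, \
         tempfile.NamedTemporaryFile("w", newline="", dir=directory,
                                     delete=False) as tmp:
        writer = csv.writer(tmp, delimiter=delimiter)
        pairs = zip(csv.reader(target, delimiter=delimiter),
                    csv.reader(extra, delimiter=delimiter))
        for target_row, extra_row in pairs:
            # extra_row[:1] is the first field, or nothing for an empty row
            writer.writerow(target_row + extra_row[:1])
    # All files are closed here; move the temporary file over the target
    os.replace(tmp.name, target_path)


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "output.csv")
    extra = os.path.join(folder, "input.csv")
    with open(target, "w", newline="") as f:
        f.write("id1\nid2\n")
    with open(extra, "w", newline="") as f:
        f.write("10\n20\n")
    append_column(target, extra)
    with open(target, newline="") as f:
        merged_lines = f.read().splitlines()
print(merged_lines)
```

zip stops at the shorter file, so trailing rows in the longer one are dropped; itertools.zip_longest would be the alternative if they should be kept.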
Given two tables, merging them by the first column of each is very easy. With my library pyexcel, you do the merge just like merging tables:
>>> from pyexcel import Reader,Writer
>>> f1=Reader("input.csv", delimiter=';')
>>> f2=Reader("output.csv", delimiter=';')
>>> columns = [f1.column_at(0), f2.column_at(0)]
>>> f3=Writer("merged.csv", delimiter=';')
>>> f3.write_columns(columns)
>>> f3.close()

Python to insert quotes to column in CSV

I have no knowledge of Python.
What I want to be able to do is create a script that edits a CSV file so that every field in column 3 is wrapped in quotes. I haven't been able to find much help. Is this quick and easy to do? Thanks.
column1,column2,column3
1111111,2222222,333333
This is a fairly crude solution, very specific to your request (assuming your source file is called "csvfile.csv" and is in C:\Temp).
import csv
newrow = []
csvFileRead = open('c:/temp/csvfile.csv', 'rb')
csvFileNew = open('c:/temp/csvfilenew.csv', 'wb')
# Open the CSV
csvReader = csv.reader(csvFileRead, delimiter = ',')
# Append the rows to variable newrow
for row in csvReader:
    newrow.append(row)
# Add quotes around the third list item
for row in newrow:
    row[2] = "'"+str(row[2])+"'"
csvFileRead.close()
# Create a new CSV file
csvWriter = csv.writer(csvFileNew, delimiter = ',')
# Append the csv with rows from newrow variable
for row in newrow:
    csvWriter.writerow(row)
csvFileNew.close()
There are MUCH more elegant ways of doing what you want, but I've tried to break it down into basic chunks to show how each bit works.
I would start by looking at the csv module.
import csv
filename = 'file.csv'
rows = []
# Open for reading ('rb'); opening with 'wb' would truncate the file
with open(filename, 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        row[2] = "'%s'" % row[2]
        rows.append(row)
And then write the rows back to the CSV file.
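Putting the read and write halves together, a Python 3 sketch might look like the one below. It assumes single quotes are wanted (as in the answers above), that the first line is a header to be left alone, and that every data row has at least three fields. quote_third_column is a made-up helper name, and io.StringIO stands in for the real files.

```python
import csv
import io


def quote_third_column(src, dst, has_header=True):
    """Copy CSV rows from src to dst, wrapping the third field in single quotes."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    if has_header:
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)  # header is copied unchanged
    for row in reader:
        row[2] = "'%s'" % row[2]
        writer.writerow(row)


source = io.StringIO("column1,column2,column3\n1111111,2222222,333333\n")
result = io.StringIO()
quote_third_column(source, result)
print(result.getvalue())
```

With real files, open the input with newline="" and write to a different output file (also opened with newline=""), then replace the original once the copy succeeded.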

How can I get a specific field of a csv file?

I need a way to get a specific item (field) of a CSV. Say I have a CSV with 100 rows and 2 columns (comma separated). The first column holds emails, the second passwords. For example, I want to get the password of the email in row 38, so I need only the item from the 2nd column of row 38...
Say I have a csv file:
aaaaa#aaa.com,bbbbb
ccccc#ccc.com,ddddd
How can I get only 'ddddd' for example?
I'm new to the language and tried some stuff with the csv module, but I don't get it...
import csv
mycsv = csv.reader(open(myfilepath))
for row in mycsv:
    text = row[1]
Following the comments on the SO question here, a better, more robust version would be:
import csv
with open(myfilepath, 'rb') as f:
    mycsv = csv.reader(f)
    for row in mycsv:
        text = row[1]
............
Update: If what the OP actually wants is the last string in the last row of the CSV file, there are several approaches that do not necessarily need csv. For example,
fulltxt = open(myfilepath, 'rb').read()
laststring = fulltxt.split(',')[-1]
This is not good for very big files because you load the complete text into memory, but it could be OK for small files. Note that laststring could include a newline character, so strip it before use.
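One more caveat with splitting on commas by hand: it only works while no field contains a comma itself. A small Python 3 comparison, using a made-up line with a quoted field, shows where str.split and csv.reader disagree:

```python
import csv
import io

# Made-up sample line; the second field contains a comma inside quotes
line = 'Smith,"Portland, OR",42\n'

naive = line.rstrip("\n").split(",")
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # ['Smith', '"Portland', ' OR"', '42']
print(parsed)  # ['Smith', 'Portland, OR', '42']
```

For data like the email/password example, where fields have no commas or quotes, both give the same result.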
And finally, if what the OP wants is the second string in line n (for n=2):
Update 2: This is now the same code as the one in the answer from J.F.Sebastian (the credit goes to him):
import csv
line_number = 2  # zero-based index: 2 selects the third line
with open(myfilepath, 'rb') as f:
    mycsv = csv.reader(f)
    mycsv = list(mycsv)
    text = mycsv[line_number][1]
............
#!/usr/bin/env python
"""Print a field specified by row, column numbers from given csv file.
USAGE:
    %prog csv_filename row_number column_number
"""
import csv
import sys
filename = sys.argv[1]
# Command-line numbers are 1-based; convert them to 0-based indices
row_number, column_number = [int(arg, 10)-1 for arg in sys.argv[2:]]
with open(filename, 'rb') as f:
    rows = list(csv.reader(f))
print rows[row_number][column_number]
Example
$ python print-csv-field.py input.csv 2 2
ddddd
Note: list(csv.reader(f)) loads the whole file into memory. To avoid that you could use itertools:
import itertools
# ...
with open(filename, 'rb') as f:
    row = next(itertools.islice(csv.reader(f), row_number, row_number+1))
print row[column_number]
import csv
def read_cell(x, y):
    with open('file.csv', 'r') as f:
        reader = csv.reader(f)
        y_count = 0
        for n in reader:
            if y_count == y:
                cell = n[x]
                return cell
            y_count += 1
print(read_cell(4, 8))
This example prints cell 4, 8 in Python 3.
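A variant of the same idea that takes 1-based row and column numbers (as in the question: "2nd column, row 38") and returns None instead of raising when the cell does not exist. This is only a sketch; get_field is a made-up name and the sample data is the two-line file from the question.

```python
import csv
import io


def get_field(src, row_number, column_number):
    """Return the field at 1-based (row_number, column_number), or None if absent."""
    for current, row in enumerate(csv.reader(src), start=1):
        if current == row_number:
            if 1 <= column_number <= len(row):
                return row[column_number - 1]
            return None  # row exists but has no such column
    return None  # file has fewer rows than row_number


sample = "aaaaa#aaa.com,bbbbb\nccccc#ccc.com,ddddd\n"
password = get_field(io.StringIO(sample), 2, 2)
print(password)
```

The loop stops at the requested row, so the rest of the file is never read.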
There is an interesting point to note about the csv.reader() object: it is not a list and is not subscriptable.
This works:
for r in csv.reader(file_obj):  # file not closed
    print r
This does not:
r = csv.reader(file_obj)
print r[0]
So you first have to convert it to a list to make the above code work.
r = list(csv.reader(file_obj))
print r[0]
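The same behaviour can be checked in Python 3 with a short script. Besides not being subscriptable, the reader is a one-pass iterator, so a second pass over it yields nothing. The sample data here is made up:

```python
import csv
import io

reader = csv.reader(io.StringIO("a,b\nc,d\n"))

try:
    reader[0]
except TypeError:
    subscriptable = False  # the reader object does not support indexing
else:
    subscriptable = True

rows = list(reader)      # consumes the reader
leftover = list(reader)  # nothing left on a second pass

print(subscriptable, rows, leftover)
```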
Finally I got it!
import csv
def select_index(index):
    # The with statement closes the file when the function is done
    with open('oscar_age_female.csv', 'r') as csv_file:
        csv_reader = csv.DictReader(csv_file)
        for line in csv_reader:
            l = line['Index']
            if l == index:
                print(line[' "Name"'])
select_index('11')
"Bette Davis"
The following may be what you are looking for:
import pandas as pd
df = pd.read_csv("table.csv")
print(df["Password"][row_number])
# where row_number is 38, maybe
import csv
inf = csv.reader(open('yourfile.csv','r'))
for row in inf:
    print row[1]
