Breaking a CSV file into two files with Python

This is my first project using python and I'm not that great at programming. I have a csv file with two tables in it.
table 1 title
row1
row2
...
blank row
blank row
table 2 title
row1
row2
...
Here is my code
import csv
csv_file = open('usagebased.csv')
csv_reader = csv.reader(csv_file, delimiter=',')
next(csv_reader)
So I want to split the file into two CSV files. What is the best way to do it? Can I split the file based on the title of table 2 or on the blank rows?
Thanks!

csv.reader accepts any iterable that yields one string per line, not only a file object. Knowing that, you can split your CSV file in two at the blank rows and feed each part to its own csv.reader. Each part has to be split into lines first, because iterating over a plain string yields single characters.
import csv
with open('usagebased.csv') as f:
    two_tables = f.read().split("\n\n\n")
# Feed first csv.reader
first_csv = csv.reader(two_tables[0].splitlines(), delimiter=',')
# Feed second csv.reader
second_csv = csv.reader(two_tables[1].splitlines(), delimiter=',')

Thanks to both of you I succeeded. Thank you very much.
with open('usagebased.csv') as f:
    tables = f.read().split("\n\n\n")
# The with blocks close the files, so the output is flushed to disk
with open('test1.csv', 'w') as file1, open('test2.csv', 'w') as file2:
    file1.write(tables[0])
    file2.write(tables[1])
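The question also asks whether the file can be split at the title of the second table. Here is a minimal sketch of that idea, assuming the title line is known in advance. The function name split_at_title and the sample lines are made up for illustration.

```python
def split_at_title(lines, second_title):
    """Split lines into two lists at the first line equal to second_title.

    Blank lines directly before the title are dropped from the first table.
    """
    lines = [line.rstrip("\r\n") for line in lines]
    try:
        cut = lines.index(second_title)
    except ValueError:
        raise ValueError("title %r not found" % second_title)
    first = lines[:cut]
    while first and not first[-1].strip():
        first.pop()  # remove the blank separator rows
    return first, lines[cut:]


sample = ["table 1 title", "a,1", "b,2", "", "", "table 2 title", "c,3"]
first, second = split_at_title(sample, "table 2 title")
print(first)
print(second)
```

An open file can be passed as lines, and each returned list can be written out with '\n'.join(...) or handed to csv.reader.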

In case we don't know the number of blank lines between the tables, this can help.
I tried to write the code as simply as possible.
Since you don't do anything with the CSV fields inside the file, you don't need the csv library.
FzListe
7MA1, 7OS1
7MA1, 7ZJB
7MA2, 7MA3, 7OS1
76G1, 7MA1, 7OS1
7MA1, 7OS1
71E5, 71E6, 7MA1, FSS1
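Here is the code:
name = 0  # output files are named 0, 1, 2, ...
after_blank = False
with open('test.txt', 'rt') as f:
    for s in f:
        if s == '\n':
            # remember the gap; any number of blank lines counts once
            after_blank = True
            continue
        if after_blank:
            name += 1  # the first line after a gap starts the next file
            after_blank = False
        with open(str(name), 'at') as ff:
            ff.write(s)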
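The same idea, treating any run of blank lines as one separator, can also be sketched with itertools.groupby. This is only an illustration: the function name split_on_blank_lines and the sample text are assumptions, not part of the answers above.

```python
from itertools import groupby


def split_on_blank_lines(lines):
    """Return a list of tables; each table is a list of non-blank lines."""
    tables = []
    # groupby puts consecutive lines with the same key together, so every
    # run of blank lines, however long, acts as a single separator.
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            tables.append([line.rstrip("\r\n") for line in group])
    return tables


text = "t1\na,b\n\n\n\nt2\nc,d\n"
tables = split_on_blank_lines(text.splitlines(keepends=True))
print(tables)
```

Because the function only needs an iterable of lines, an open file object can be passed directly instead of the sample text.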

Related

How to replace one value with another in a csv file?

I have a CSV file with information and want to replace the information in a specific location with a new value.
For example if my CSV file looks like this:
example1,example2,0
example3,example4,0
exampple5,example6,0
Note that each row is labelled for example:
test = row[0]
test1 = row[1]
test2 = row[2]
If I want to replace
test[0]
with a new value how would I go about doing it?
The simplest way, without installing any additional package, is to use the built-in csv module to read the whole file into a list of rows and replace the desired element.
Here is code that does just that:
import csv
with open('test.csv', 'r', newline='') as in_file, open('test_out.csv', 'w', newline='') as out_file:
    data = [row for row in csv.reader(in_file)]
    data[0][0] = 'new value'
    writer = csv.writer(out_file)
    writer.writerows(data)
There are a handful of ways to do this, but personally I'm a big fan of pandas. With pandas, you can read a CSV file with df = pd.read_csv('path_to_file.csv', header=None) (header=None because the sample file has no header row). Make whatever changes you want; to set row 1, column 1, use df.iloc[0, 0] = new_val. When you are done, save to the same file with df.to_csv('path_to_file.csv', header=False, index=False).
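If the goal is to replace a value wherever it occurs rather than at a fixed position, a small helper over the rows read by csv.reader is enough. The sketch below works on in-memory text so that it runs on its own; the name replace_value is made up for illustration.

```python
import csv
import io


def replace_value(rows, old, new):
    """Return a copy of rows with every field equal to old replaced by new."""
    return [[new if field == old else field for field in row] for row in rows]


source = io.StringIO("example1,example2,0\nexample3,example4,0\n")
rows = list(csv.reader(source))
updated = replace_value(rows, "example1", "new value")

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(updated)
print(out.getvalue())
```

With real files, the two io.StringIO objects would be replaced by files opened with newline=''.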

Copying one column of a CSV file and adding it to another file using python

I have two files, the first one is called book1.csv, and looks like this:
header1,header2,header3,header4,header5
1,2,3,4,5
1,2,3,4,5
1,2,3,4,5
The second file is called book2.csv, and looks like this:
header1,header2,header3,header4,header5
1,2,3,4
1,2,3,4
1,2,3,4
My goal is to copy the column that contains the 5's in book1.csv to the corresponding column in book2.csv.
The problem with my code seems to be that it does not append correctly, nor does it select just the index that I want to copy. It also gives an error saying that I have selected an incorrect index position. The output is as follows:
header1,header2,header3,header4,header5
1,2,3,4
1,2,3,4
1,2,3,41,2,3,4,5
Here is my code:
import csv
with open('C:/Users/SAM/Desktop/book2.csv','a') as csvout:
    write=csv.writer(csvout, delimiter=',')
    with open('C:/Users/SAM/Desktop/book1.csv','rb') as csvfile1:
        read=csv.reader(csvfile1, delimiter=',')
        header=next(read)
        for row in read:
            row[5]=write.writerow(row)
What should I do to get this to append properly?
Thanks for any help!
What about something like this? I read in both books and, for every row in book2, append the last element of the matching book1 row, storing the result in a list. Then I write the contents of that list to a new .csv file.
import csv
with open('book1.csv', 'r', newline='') as book1:
    with open('book2.csv', 'r', newline='') as book2:
        reader1 = csv.reader(book1, delimiter=',')
        reader2 = csv.reader(book2, delimiter=',')
        both = []
        fields = next(reader1)  # read header row
        next(reader2)  # read and ignore header row
        for row1, row2 in zip(reader1, reader2):
            row2.append(row1[-1])
            both.append(row2)
with open('output.csv', 'w', newline='') as output:
    writer = csv.writer(output, delimiter=',')
    writer.writerow(fields)  # write a header row
    writer.writerows(both)
Although some of the code above will work, it does not scale well to large files, where a vectorised approach helps. Working with numpy or pandas makes some of these tasks easier, so it is worth learning a bit of it.
You can download pandas from the Pandas Website
# Load pandas
import pandas as pd
# Load each file into a pandas DataFrame (built on numpy arrays)
data1 = pd.read_csv('csv1.csv', sep=',')
data2 = pd.read_csv('csv2.csv', sep=',')
# Now add 'header5' from data1 to data2
data2['header5'] = data1['header5']
# Save it back to CSV without the extra index column
data2.to_csv('output.csv', index=False)
Regarding the "error that I have selected an incorrect index position," I suspect this is because you're using row[5] in your code. Indexing in Python starts from 0, so if you have A = [1, 2, 3, 4, 5] then to get the 5 you would do print(A[4]).
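A short illustration of the indexing point, using a made-up five-element row like the ones in book1.csv:

```python
row = ["1", "2", "3", "4", "5"]

print(row[4])   # fifth field, counting from 0
print(row[-1])  # last field, whatever the row length

try:
    row[5]
except IndexError as exc:
    message = str(exc)
    print("row[5] failed:", message)
```

Using row[-1] avoids hard-coding the number of columns.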
Assuming the two files have the same number of rows and the rows are in the same order, I think you want to do something like this:
import csv
# Open the two input files, which I've renamed to be more descriptive,
# and also an output file that we'll be creating
with open("four_col.csv", mode='r') as four_col, \
        open("five_col.csv", mode='r') as five_col, \
        open("five_output.csv", mode='w', newline='') as outfile:
    four_reader = csv.reader(four_col)
    five_reader = csv.reader(five_col)
    five_writer = csv.writer(outfile)
    _ = next(four_reader)  # Ignore headers for the 4-column file
    headers = next(five_reader)
    five_writer.writerow(headers)
    for four_row, five_row in zip(four_reader, five_reader):
        last_col = five_row[-1]  # Or use five_row[4]
        four_row.append(last_col)
        five_writer.writerow(four_row)
Why not read the files line by line and use the -1 index to find the last item?
endings = []
with open('book1.csv') as book1:
    next(book1)  # skip the header line
    for line in book1:
        # strip the newline so it does not end up inside the copied value
        endings.append(line.rstrip('\n').split(',')[-1])
with open('book2.csv') as book2:
    print(next(book2).rstrip('\n'))  # the header line stays unchanged
    for linecounter, line in enumerate(book2):
        print(line.rstrip('\n') + ',' + endings[linecounter])  # or write to file
You should also catch errors if row numbers don't match.
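One way to catch mismatched row counts is itertools.zip_longest with a sentinel value. This is a sketch only; the helper name paired_rows and the sample rows are assumptions made for illustration.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking the shorter input


def paired_rows(rows_a, rows_b):
    """Yield (row_a, row_b) pairs; raise ValueError if the lengths differ."""
    pairs = zip_longest(rows_a, rows_b, fillvalue=_MISSING)
    for number, (a, b) in enumerate(pairs, start=1):
        if a is _MISSING or b is _MISSING:
            raise ValueError("row counts differ at row %d" % number)
        yield a, b


book1 = [["1", "2", "3", "4", "5"], ["1", "2", "3", "4", "5"]]
book2 = [["1", "2", "3", "4"], ["1", "2", "3", "4"]]
merged = [b + [a[-1]] for a, b in paired_rows(book1, book2)]
print(merged)
```

On Python 3.10 and later, zip(rows_a, rows_b, strict=True) raises a ValueError in the same situation without a helper.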

Reading in Excel file with corrupt data using PYTHON

I am trying to read in a table from a .CSV file which should have 5 columns.
But some rows have corrupt data, which gives them more than 5 columns.
How do I reject those rows and continue reading further?
Using
temp = read_table(folder + r'\temp.txt', sep=r'\t')
just gives an error and stops the program.
I am new to Python...please help
Thanks
Look into using Python's csv module.
Without testing the damaged file it is difficult to say whether this will do the trick. However, csv.reader returns each row of the file as a list of strings, so you could check whether the list has 5 elements and proceed that way.
A code example:
import csv
out = []
with open('file.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter='\t')  # tab-separated, as in the question
    for row in reader:
        if len(row) == 5:
            out.append(row)
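When rows are rejected it is often useful to know which ones. The following sketch keeps the good rows and records the line numbers of the bad ones through the reader's line_num attribute. The function name split_good_and_bad and the sample data are made up, and the input is assumed to be tab-separated as in the question.

```python
import csv
import io


def split_good_and_bad(lines, expected=5, delimiter="\t"):
    """Return (good_rows, bad_line_numbers) for the given text lines."""
    good, bad = [], []
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        if len(row) == expected:
            good.append(row)
        else:
            bad.append(reader.line_num)  # line where the bad row ended
    return good, bad


sample = io.StringIO("a\tb\tc\td\te\n1\t2\t3\t4\t5\t6\nf\tg\th\ti\tj\n")
good, bad = split_good_and_bad(sample)
print(good)
print(bad)
```

Note that this also rejects rows with fewer than 5 fields, including blank lines, which csv.reader returns as empty lists.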

Python to insert quotes to column in CSV

I have no knowledge of Python.
What I want to be able to do is create a script that edits a CSV file so that every field in column 3 is wrapped in quotes. I haven't been able to find much help. Is this quick and easy to do? Thanks.
column1,column2,column3
1111111,2222222,333333
This is a fairly crude solution, very specific to your request (assuming your source file is called "csvfile.csv" and is in C:\Temp).
import csv
newrow = []
csvFileRead = open('c:/temp/csvfile.csv', 'r', newline='')
csvFileNew = open('c:/temp/csvfilenew.csv', 'w', newline='')
# Open the CSV
csvReader = csv.reader(csvFileRead, delimiter = ',')
# Append the rows to variable newrow
for row in csvReader:
    newrow.append(row)
# Add quotes around the third list item
for row in newrow:
    row[2] = "'"+str(row[2])+"'"
csvFileRead.close()
# Create a new CSV file
csvWriter = csv.writer(csvFileNew, delimiter = ',')
# Append the csv with rows from newrow variable
for row in newrow:
    csvWriter.writerow(row)
csvFileNew.close()
There are MUCH more elegant ways of doing what you want, but I've tried to break it down into basic chunks to show how each bit works.
I would start by looking at the csv module.
import csv
filename = 'file.csv'
# Open for reading; opening with 'w' here would empty the file first
with open(filename, 'r', newline='') as f:
    rows = list(csv.reader(f))
for row in rows:
    row[2] = "'%s'" % row[2]
Then write the rows back to the CSV file.
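A complete read, modify and write-back round trip might look like the sketch below. The function name quote_column is made up, the first row is assumed to be a header that should stay unquoted, and the demonstration uses a temporary file so that it runs on its own.

```python
import csv
import os
import tempfile


def quote_column(path, index=2):
    """Wrap one column of the CSV file at path in single quotes, in place."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))      # read everything before reopening
    for row in rows[1:]:                # rows[0] is assumed to be the header
        row[index] = "'%s'" % row[index]
    with open(path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)


# Demonstration on a throwaway file
fd, demo_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", newline="") as f:
    f.write("column1,column2,column3\n1111111,2222222,333333\n")
quote_column(demo_path)
with open(demo_path, newline="") as f:
    result = f.read()
os.remove(demo_path)
print(result)
```

Single quotes are used because csv.writer treats the double quote as its quoting character and would escape double quotes added to a field by hand.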

append content of one csv file to another using python

I have 2 csv files:
output.csv
output1.csv
output.csv has 5 columns of titles.
output1.csv has about 40 columns of different types of data.
I need to append all the content of output1.csv to output.csv. How can I do this?
Could somebody please give me a hint on how to go about it?
I have the following code:
reader=csv.DictReader(open("test.csv","r"))
allrows = list(reader)
keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]
print(keepcols)
writer=csv.DictWriter(open("output.csv","w"),fieldnames=keepcols,extrasaction='ignore')
writer.writerows(allrows)
with open("test1.csv","r") as f:
    fields=next(f).split()
    # print(fields)
    allrows=[]
    for line in f:
        line=line.split()
        row=dict(zip(fields,line))
        allrows.append(row)
        # print(row)
    keepcols = [c for c in fields if any(row[c] != '0' for row in allrows)]
    print(keepcols)
    writer=csv.DictWriter(open("output1.csv","w"),fieldnames=keepcols,extrasaction='ignore')
    writer.writerows(allrows)
test.csv generates output.csv
test1.csv generates output1.csv
I'm trying to see if I can make both scripts write their output to the same file.
If I understand your question correctly, you want to create a CSV with 45 columns: the 5 from output.csv followed by the 40 from output1.csv.
I assume they have the same number of rows (if not - what is the necessary behavior?)
Try using the csv module:
import csv
with open('output.csv', newline='') as f, open('output1.csv', newline='') as f1, \
        open('appended_output.csv', 'w', newline='') as out:
    reader = csv.reader(f)
    reader1 = csv.reader(f1)
    writer = csv.writer(out)
    for row in reader:
        row1 = next(reader1)
        writer.writerow(row + row1)
If your CSV files use special delimiters or quoting characters, you can pass the optional keyword arguments of csv.reader and csv.writer.
See Python's csv module documentation for details...
EDIT: Opened the files with newline='' (the Python 3 counterpart of the 'b' flag), as suggested.
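To illustrate the optional keyword arguments, here is a small sketch that reads semicolon-separated data with single-quoted fields and writes it back tab-separated with every field quoted. The sample data is made up, and io.StringIO stands in for real files.

```python
import csv
import io

# Hypothetical input: semicolon-separated, with single-quoted fields
source = io.StringIO("'a;b';2\n'c';4\n")
rows = list(csv.reader(source, delimiter=";", quotechar="'"))
print(rows)

out = io.StringIO()
writer = csv.writer(out, delimiter="\t", quoting=csv.QUOTE_ALL,
                    lineterminator="\n")
writer.writerows(rows)
print(out.getvalue())
```

The quoted field 'a;b' is read as a single value even though it contains the delimiter.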
This recent discussion looks very similar to what you are looking for except that the OP there wanted to concatenate mp3 files.
EDIT:
target = '/path/to/target'
src1 = '/path/to/source1.csv'
src2 = '/path/to/source2.csv'
# The with blocks close every file, even if a read or write fails
with open(target, 'a') as tf:
    for src in (src1, src2):
        with open(src) as sf:
            tf.write(sf.read())
Try this; it should work, since you simply want the equivalent of the shell command cat src1 src2 >> target (the target is opened in append mode).
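For large files, the same concatenation can be done without reading each source fully into memory by using shutil.copyfileobj, which copies in chunks. In this sketch the helper name concatenate is made up and io.StringIO objects stand in for the real files.

```python
import io
import shutil


def concatenate(sources, target):
    """Copy each open source file object to target, in order."""
    for source in sources:
        shutil.copyfileobj(source, target)


# In-memory stand-ins for the two source files and the target
src1 = io.StringIO("a,b\n1,2\n")
src2 = io.StringIO("c,d\n3,4\n")
target = io.StringIO()
concatenate([src1, src2], target)
print(target.getvalue())
```

If the first source does not end with a newline, its last row and the first row of the next source end up on the same line, so it is worth checking for that before concatenating.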
"I need to append all the content of output1.csv to output.csv." ... taken literally that would mean write each row in the first file followed by each row in the second file. Is that what you want??
Titles of what? The 40 columns in the other file? If so, and assuming that you want the titles written as a row of column headings:
import csv
with open('titles.csv', newline='') as titles_file:
    titles = [x[0] for x in csv.reader(titles_file)]
with open('merged.csv', 'w', newline='') as merged, open('data.csv', newline='') as data:
    writer = csv.writer(merged)
    writer.writerow(titles)
    for row in csv.reader(data):
        writer.writerow(row)
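You could also wrap the reader in a generator if you want to filter rows by a condition:
import csv
def read_generator(filepath: str, condition: str):
    with open(filepath, newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            # skip blank lines, keep rows whose first field matches
            if row and row[0] == condition:
                yield row
and then write from that with:
condition = "keep"  # placeholder: the value to match in the first column
with open("process.csv", "w", newline='') as out:
    writer = csv.writer(out)
    writer.writerows(read_generator("file_to_read.csv", condition))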
