Add a column from a csv to another csv - python

I have two files: input.csv (a single column, count) and output.csv (a single column, id).
I want to paste the count column into output.csv, just after the id column.
Here is my snippet:
with open ("/home/julien/input.csv", "r") as csvinput:
    with open ("/home/julien/excel/output.csv", "a") as csvoutput:
        writer = csv.writer(csvoutput, delimiter = ";")
        for row in csv.reader(csvinput, delimiter = ";"):
            if row[0] != "":
                result = row[0]
            else:
                result = ""
            row.append(result)
            writer.writerow(row)
But it doesn't work.
I've been searching for the problem for many hours but I've found no solution. Do you have any tips for solving it?
Thanks! Julien

You need to work with three files, two for reading and one for writing.
This should work.
import csv
in_1_name = "/home/julien/input.csv"
in_2_name = "/home/julien/excel/output.csv"
out_name = "/home/julien/excel/merged.csv"
with open(in_1_name, newline="") as in_1, open(in_2_name, newline="") as in_2, open(out_name, "w", newline="") as out:
    reader1 = csv.reader(in_1, delimiter=";")
    reader2 = csv.reader(in_2, delimiter=";")
    writer = csv.writer(out, delimiter=";")
    for row1, row2 in zip(reader1, reader2):
        if row1[0] and row2[0]:
            # id (from output.csv) first, then count (from input.csv)
            writer.writerow([row2[0], row1[0]])
You write the row for each column:
    row.append(result)
    writer.writerow(row)
Dedent the last line to write only once:
    row.append(result)
writer.writerow(row)

Open both files for input.
Open a new file for output.
In a loop, read a line from each file and format an output line, which is then written to the output file.
Close all the files.
Programmatically copy your output file on top of the input file "output.csv".
Done.
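The steps above can be sketched roughly as follows in Python 3. This is only an illustration: the function name paste_column and the ".tmp" scratch file are hypothetical, the ";" delimiter is taken from the question, and rows are paired with zip, so the shorter file decides how many rows are written.

```python
import csv
import os


def paste_column(id_path, count_path, delimiter=";"):
    """Append the first column of count_path to each row of id_path, in place."""
    tmp_path = id_path + ".tmp"  # hypothetical scratch file next to the target
    # Open both inputs and a new output; the with block closes all three.
    with open(id_path, newline="") as ids, \
            open(count_path, newline="") as counts, \
            open(tmp_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter=delimiter)
        for id_row, count_row in zip(csv.reader(ids, delimiter=delimiter),
                                     csv.reader(counts, delimiter=delimiter)):
            # Keep the existing columns and add the first field of the other file.
            writer.writerow(id_row + count_row[:1])
    # Copy the result on top of the original file.
    os.replace(tmp_path, id_path)
```

With the paths from the question, the call would be paste_column("/home/julien/excel/output.csv", "/home/julien/input.csv").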

If you are given two tables, merging them by the first column of each is very easy. With my library pyexcel, you do the merge just like merging tables:
>>> from pyexcel import Reader,Writer
>>> f1=Reader("input.csv", delimiter=';')
>>> f2=Reader("output.csv", delimiter=';')
>>> columns = [f1.column_at(0), f2.column_at(0)]
>>> f3=Writer("merged.csv", delimiter=';')
>>> f3.write_columns(columns)
>>> f3.close()

Related

Create multiple files from unique values of a column using inbuilt libraries of python

I started learning Python and was wondering if there is a way to create multiple files from the unique values of a column. I know there are hundreds of ways of getting it done with pandas, but I am looking to do it with the built-in libraries. I couldn't find a single example where it's done that way.
Here is the sample csv file data:
uniquevalue|count
a|123
b|345
c|567
d|789
a|123
b|345
c|567
Sample output file:
a.csv
uniquevalue|count
a|123
a|123
b.csv
b|345
b|345
I am struggling with looping over the unique values in a column and then printing them out. Can someone explain the logic of how to do it? That would be much appreciated. Thanks.
import csv
from collections import defaultdict
header = []
data = defaultdict(list)
DELIMITER = "|"
with open("inputfile.csv", newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter=DELIMITER)
    for i, row in enumerate(reader):
        if i == 0:
            header = row
        else:
            key = row[0]
            data[key].append(row)
for key, value in data.items():
    filename = f"{key}.csv"
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f, delimiter=DELIMITER)
        rows = [header] + value
        writer.writerows(rows)
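import csv
with open('sample.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter='|')
    header = next(reader)  # skip the header so no uniquevalue.csv is created
    for row in reader:
        # Append mode: remove stale output files before re-running.
        with open(f"{row[0]}.csv", 'a', newline='') as inner:
            # csv.writer takes no fieldnames argument (that belongs to DictWriter)
            writer = csv.writer(inner, delimiter='|')
            writer.writerow(row)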
The task can also be done without the csv module. The lines of the file are read with read_file.read().splitlines()[1:], which strips the newline characters and skips the header line of the csv file. A set then gives the unique lines of the input data; it is used to count the duplicates of each line and to create the output files.
with open("unique_sample.csv", "r") as read_file:
    items = read_file.read().splitlines()[1:]
for line in set(items):
    with open(line[:line.index('|')] + '.csv', 'w') as output:
        output.write((line + '\n') * items.count(line))
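A variant of the same idea, sketched in Python 3, groups lines by the text before the first "|" rather than by the whole line, so rows that share a key but differ in count still land in the same file. The function name split_by_key and the out_dir parameter are hypothetical, and the sketch assumes the input has at least a header line; unlike the snippet above, it also writes the header into every output file.

```python
import os
from collections import defaultdict


def split_by_key(in_path, out_dir=".", sep="|"):
    """Write <key>.csv into out_dir for every distinct first field in in_path."""
    with open(in_path) as src:
        header, *lines = src.read().splitlines()
    groups = defaultdict(list)
    for line in lines:
        # The text before the first separator is the key; blank lines are skipped.
        if line:
            groups[line.partition(sep)[0]].append(line)
    for key, members in groups.items():
        with open(os.path.join(out_dir, f"{key}.csv"), "w") as out:
            out.write("\n".join([header] + members) + "\n")
    return sorted(groups)
```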

Add new dictionary values to an existing csv

I am trying to add 2 new columns to an existing file in the same program. The csv is generated by the previous function.
After looking at many answers here, I tried the code below, but it doesn't work. None of the answers I found used csv.DictWriter; they were all about csv.writer. My code just creates a new file containing only these 2 columns. Can I get some help with this?
for me, sp in zip(meds, specs):
    print(me.text, sp.text)
    dict2 = {"Medicines": me.text, "Specialities": sp.text}
    with open(f'Infusion_t{zip_add}.csv', 'r') as read, \
            open(f'(Infusion_final{zip_add}.csv', 'a+', encoding='utf-8-sig', newline='') as f:
        reader = csv.reader(read)
        w = csv.DictWriter(f, dict2.keys())
        for row in reader:
            if not header_added:
                w.writeheader()
                header_added = True
            row.append(w.writerow(dict2))
You need to append the new columns to row, then write row to the output file. You don't need the dictionary or DictWriter.
You can also open the output file just once before the loop, and write the header there, rather than each time through the main loop.
with open(f'(Infusion_final{zip_add}.csv', 'w', encoding='utf-8-sig', newline='') as f:
    w = csv.writer(f)
    w.writerow(['col1', 'col2', 'col3', ..., 'Medicines', 'Specialities']) # replace colX with the names of the original columns
    for me, sp in zip(meds, specs):
        print(me.text, sp.text)
        with open(f'Infusion_t{zip_add}.csv', 'r') as read:
            reader = csv.reader(read)
            for row in reader:
                row.append(me.text)
                row.append(sp.text)
                w.writerow(row)

Combine two csv into a single one with page breaks in python

I'm trying to combine two different csv files into a single one, and I've done that. The two csv files list students in school: those who are present in one file and those who are absent in the other.
I need to put the date the file was created at the top of the new csv and have each grade of the present students on a new page or after 3 blank rows.
Also, on each new page or after each 3 blank rows, I want to have the name of the teacher, the date on which the file was created, and the grade.
import csv
with open('inschool.csv', encoding="cp437") as f:
    reader = csv.reader(f)
    in_school = list(reader)
with open('notinschool.csv', encoding="cp437") as f:
    reader = csv.reader(f)
    not_in_school = list(reader)
for grade, name, status, hr_teacher in not_in_school:
    print(grade, name, status, hr_teacher)
for grade, name, status, hr_teacher in in_school:
    print(grade, name, status, hr_teacher)
iFile = open('inschool.csv', encoding="cp437")
reader = csv.reader(iFile)
IFILE = open('notinschool.csv', encoding="cp437")
READER = csv.reader(IFILE)
oFile = open('combined.csv','wt',encoding="cp437")
writer = csv.writer(oFile, delimiter='|', quoting=csv.QUOTE_ALL)
for row in READER:
    writer.writerow(row)
writer.writerow("[]")
for row in reader:
    writer.writerow(row)
writer.writerow("[]")
The code I tried for the 3 blank rows ended like this, but it gave 3 blank lines after each student's name instead of after each grade.
iFile = open('Inschool.csv',)
reader = csv.reader(iFile)
IFILE = open('notinschool.csv')
READER = csv.reader(IFILE)
oFile = open('combined.csv','wb')
writer_a = csv.writer(oFile, delimiter='|', quoting=csv.QUOTE_ALL)
writer_b = csv.writer(oFile, delimiter='|', quoting=csv.QUOTE_ALL, lineterminator="\n\n\n\n")
for row in READER:
    writer_a.writerow(row)
writer_b.writerow([])
for row in reader:
    writer_b.writerow(row)
I would appreciate it if someone could help me. Thanks.
You can do it really easily in the terminal. Just cd to the directory and run the command cat inschool.csv notinschool.csv > combined.csv
If you want to do it in Python, I would do:
in_file1 = open("inschool.csv","r").read().split("\n")
in_file2 = open("notinschool.csv","r").read().split("\n")
out_file = open("combined.csv","w")
for line in in_file1:
    if line:
        out_file.write(line + "\n")
for line in in_file2:
    if line:
        out_file.write(line + "\n")
out_file.close()  # flush the combined file to disk
Reading files this way isn't the most efficient, but if they are small it doesn't really matter, and it's easier to visualize what's happening. You can use your own way of reading the input files with this, because the concept stays the same :)
I just got into using this module called pandas and it is for DataFrames. They are much easier to use, process, navigate through, and merge than parsing text files.
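For the part of the question about a break after each grade, note that a csv file is plain text and has no notion of a page break, so blank rows are the practical option. A minimal Python 3 sketch using itertools.groupby follows. It assumes the (grade, name, status, hr_teacher) row layout from the question and that every student in a grade shares one teacher; the function name write_by_grade and its parameters are hypothetical, and the creation date is passed in or defaults to today's date.

```python
import csv
import datetime
from itertools import groupby


def write_by_grade(rows, out_path, created=None, blank_rows=3):
    """Write (grade, name, status, hr_teacher) rows, one block per grade."""
    created = created or datetime.date.today().isoformat()
    # groupby only merges adjacent rows, so sort by grade first.
    # Grades are compared as text here, so "10" sorts before "9".
    rows = sorted(rows, key=lambda r: r[0])
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="|", quoting=csv.QUOTE_ALL)
        writer.writerow([created])  # date at the top of the file
        for grade, group in groupby(rows, key=lambda r: r[0]):
            group = list(group)
            for _ in range(blank_rows):
                writer.writerow([])  # an empty row is written as a blank line
            # Block header: teacher of the first student, date, grade.
            writer.writerow([group[0][3], created, grade])
            writer.writerows(group)
```

Called as write_by_grade(in_school, "combined.csv"), this writes the blank rows once per grade rather than once per student, because the separator is emitted in the outer loop over groups.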

append content of one csv file to another using python

I have 2 csv files:
output.csv
output1.csv
output.csv has 5 columns of titles.
output1.csv has about 40 columns of different types of data.
I need to append all the content of output1.csv to output.csv. How can I do this?
Could somebody please give me a hint on how to go about it?
I have the following code:
reader=csv.DictReader(open("test.csv","r"))
allrows = list(reader)
keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]
print keepcols
writer=csv.DictWriter(open("output.csv","w"),fieldnames='keepcols',extrasaction='ignore')
writer.writerows(allrows)
with open("test1.csv","r") as f:
    fields=next(f).split()
    # print(fields)
    allrows=[]
    for line in f:
        line=line.split()
        row=dict(zip(fields,line))
        allrows.append(row)
        # print(row)
keepcols = [c for c in fields if any(row[c] != '0' for row in allrows)]
print keepcols
writer=csv.DictWriter(open("output1.csv","w"),fieldnames=keepcols,extrasaction='ignore')
writer.writerows(allrows)
test.csv generates output.csv
test1.csv generates output1.csv
I'm trying to see if I can get both outputs into the same file.
If I understand your question correctly, you want to create a csv with 41 columns - the 1 from output.csv followed by the 40 from output1.csv.
I assume they have the same number of rows (if not - what is the necessary behavior?)
Try using the csv module:
import csv
# Python 2 code: on Python 3, open the files with newline='' instead of 'rb'/'wb' and call next(reader1)
reader = csv.reader(open('output.csv', 'rb'))
reader1 = csv.reader(open('output1.csv', 'rb'))
writer = csv.writer(open('appended_output.csv', 'wb'))
for row in reader:
    row1 = reader1.next()
    writer.writerow(row + row1)
If your csv files are formatted with special delimiters or quoting characters, you can use the optional keyword arguments for the csv.reader and csv.writer objects.
See Python's csv module documentation for details...
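As a rough Python 3 illustration of those keyword arguments, the sketch below forwards formatting options such as delimiter and quotechar to both csv.reader and csv.writer. The function name append_columns is hypothetical. It also makes one assumption about files of unequal length: itertools.zip_longest keeps going until the longer file ends, and the missing side simply contributes no cells, so columns are not padded.

```python
import csv
from itertools import zip_longest


def append_columns(left_path, right_path, out_path, **fmt):
    """Write each row of left_path followed by the matching row of right_path.

    Extra keyword arguments (delimiter, quotechar, ...) are passed unchanged
    to both csv.reader and csv.writer.
    """
    with open(left_path, newline="") as lf, \
            open(right_path, newline="") as rf, \
            open(out_path, "w", newline="") as out:
        writer = csv.writer(out, **fmt)
        # When one file runs out, its side is an empty row (assumed behavior;
        # the columns of the remaining rows are not padded).
        for left, right in zip_longest(csv.reader(lf, **fmt),
                                       csv.reader(rf, **fmt), fillvalue=[]):
            writer.writerow(left + right)
```

For example, append_columns("output.csv", "output1.csv", "appended_output.csv", delimiter=";", quotechar="'") would read and write semicolon-separated files that quote fields with single quotes.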
EDIT: Added 'b' flag, as suggested.
This recent discussion looks very similar to what you are looking for except that the OP there wanted to concatenate mp3 files.
EDIT:
import os, sys
target = '/path/to/target'
src1 = '/path/to/source1.csv'
src2 = '/path/to/source2.csv'
tf = open(target, 'a')
tf.write(open(src1).read())
tf.write(open(src2).read())
tf.close()
Try this. It should work, since you simply want the equivalent of the shell command cat src1 src2 > target.
"I need to append all the content of output1.csv to output.csv." ... taken literally that would mean write each row in the first file followed by each row in the second file. Is that what you want??
Titles of what? The 40 columns in the other file? If so, then assuming that you want the titles written as a row of column headings:
import csv
titles = [x[0] for x in csv.reader(open('titles.csv', 'rb'))]
writer = csv.writer(open('merged.csv', 'wb'))
writer.writerow(titles)
for row in csv.reader(open('data.csv', 'rb')):
    writer.writerow(row)
You could also use a generator from the reader if you want to pass a condition:
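import csv
def read_generator(filepath: str, condition: str):
    # Yield only the rows whose first field equals condition.
    with open(filepath, newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            if row and row[0] == condition:
                yield row
and then write from that with:
with open("process.csv", "w", newline='') as out:
    writer = csv.writer(out)
    writer.writerows(read_generator("file_to_read.csv", "some value"))  # "some value" is a placeholder condition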

csv python questions

I am opening a csv file like this:
import csv
reader = csv.reader(open("book1.csv", "rb"))
for row in reader:
    print row
How can I replace the value in column 3 with its log and then save the result into a new csv?
Like this?
>>> import csv
>>> from math import log
>>> input = "1,2,3\n4,5,6\n7,8,9".splitlines()
>>> reader=csv.reader(input)
>>> for row in reader:
...     row[2] = log(float(row[2]))
...     print ','.join(map(str,row))
...
1,2,1.09861228867
4,5,1.79175946923
7,8,2.19722457734
These links might help:
http://docs.python.org/library/csv.html#csv.writer
http://docs.python.org/tutorial/datastructures.html?highlight=array
Each row returned by the reader is a list of strings. Lists in Python are zero-based, so to access the third entry in a row you would use row[2].
That should help you on your way.
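To make the zero-based indexing concrete, here is a small Python 3 sketch. It reads from an in-memory string instead of book1.csv, and the sample values are made up for illustration.

```python
import csv
import io
import math

# In-memory stand-in for book1.csv (sample values are made up).
source = io.StringIO("1,2,3\n4,5,6\n")
rows = []
for row in csv.reader(source):
    # Fields arrive as strings; row[2] is the third column.
    row[2] = math.log(float(row[2]))
    rows.append(row)
```

After the loop, each entry of rows still has its first two fields as strings, while the third holds the natural logarithm as a float.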
You should use the with statement (a context manager) for files: it is cleaner, needs less code, and makes explicit file.close() calls unnecessary.
e.g.
import csv
import math
# Python 2 file modes; on Python 3 use open(..., newline='') and open(..., 'w', newline='')
with open('book1.csv', 'rb') as f1,open('book2.csv', 'wb') as f2:
    reader = csv.reader(f1)
    writer = csv.writer(f2)
    for row in reader:
        row[2] = str(math.log(float(row[2])))
        writer.writerow(row)
