I have two csv files result.csv and sample.csv.
result.csv
M11251TH1230
M11543TH4292
M11435TDS144
sample.csv
M11435TDS144,STB#1,Router#1
M11543TH4292,STB#2,Router#1
M11509TD9937,STB#3,Router#1
M11543TH4258,STB#4,Router#1
I have a Python script that compares the two files: if a line in result.csv matches the first field of a line in sample.csv, it appends 1 to that line in sample.csv, otherwise it appends 0.
The result should look like M11435TDS144,STB#1,Router#1,1 and M11543TH4258,STB#4,Router#1,0, since M11543TH4258 is not found in result.csv.
script.py
import csv
with open('result.csv', 'rb') as f:
    reader = csv.reader(f)
    result_list = []
    for row in reader:
        result_list.extend(row)
with open('sample.csv', 'rb') as f:
    reader = csv.reader(f)
    sample_list = []
    for row in reader:
        if row[0] in result_list:
            sample_list.append(row + [1])
        else:
            sample_list.append(row + [0])
with open('sample.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(sample_list)
Sample output (sample.csv) if I run the script twice:
M11435TDS144,STB#1,Router#1,1,1
M11543TH4292,STB#2,Router#1,1,1
M11509TD9937,STB#3,Router#1,0,0
M11543TH4258,STB#4,Router#1,0,0
Every time I run the script, the 1's and 0's are appended to sample.csv as a new column. Is there a way to replace the previously appended column on each run instead of adding another one?
You write to sample.csv and then use it as the input file on the next run, with the additional column already in it. That's why you get more and more 1's and 0's in this file.
Regards, Grzegorz
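One way to make repeated runs safe is to cut each row back to its original columns before appending the flag. Here is a minimal Python 3 sketch, assuming sample.csv always has exactly three real data columns (ID, STB, router). The names KEY_COLUMNS, flag_rows and update_sample are made up for illustration and are not part of the original script.

```python
import csv

KEY_COLUMNS = 3  # assumption: ID, STB and router are the only real data columns


def flag_rows(sample_rows, result_ids, key_columns=KEY_COLUMNS):
    """Return rows cut back to key_columns with a fresh 1/0 flag appended."""
    flagged = []
    for row in sample_rows:
        base = row[:key_columns]  # drop any flag left by an earlier run
        flagged.append(base + [1 if base[0] in result_ids else 0])
    return flagged


def update_sample(result_path="result.csv", sample_path="sample.csv"):
    # Read the IDs from result.csv into a set for fast membership tests.
    with open(result_path, newline="") as f:
        result_ids = {row[0] for row in csv.reader(f) if row}

    # Read sample.csv completely before reopening it for writing.
    with open(sample_path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]

    with open(sample_path, "w", newline="") as f:
        csv.writer(f).writerows(flag_rows(rows, result_ids))
```

Calling update_sample() any number of times should then leave a single flag column, because the old flag is sliced off before the new one is added. Writing the result to a separate output file instead of back into sample.csv would avoid the problem as well.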
Related
I need to add a couple of Python lists as columns to an existing CSV file. I want to use a temporary file for the intermediate CSV, because I want to sort the resulting data by its first two columns and then write it to a new, final CSV file. I don't want to keep the unsorted CSV file, which is why I am trying to use tempfile.NamedTemporaryFile for that step. The final CSV file ends up empty, yet no errors are raised. I changed how the with blocks are indented but was unable to fix it. When I test with a regular file on disk, it works fine. I need help understanding what I am doing wrong. Here is my code:
# Open the existing csv in read mode and new temporary csv in write mode
with open(csvfile.name, 'r') as read_f, \
        tempfile.NamedTemporaryFile(suffix='.csv', prefix=('inter'), mode='w', delete=False) as write_f:
    csv_reader = csv.reader(read_f)
    csv_writer = csv.writer(write_f)
    i = 0
    for row in csv_reader:
        # Append the new list values to that row/list
        row.append(company_list[i])
        row.append(highest_percentage[i])
        # Add the updated row / list to the output file
        csv_writer.writerow(row)
        i += 1
    with open(write_f.name) as data:
        stuff = csv.reader(data)
        sortedlist = sorted(stuff, key=operator.itemgetter(0, 1))
# now write the sorted result into final CSV file
with open(fileout, 'w', newline='') as f:
    fileWriter = csv.writer(f)
    for row in sortedlist:
        fileWriter.writerow(row)
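You should insert write_f.seek(0, 0) just before the line that reopens the temporary file. Seeking flushes the rows still sitting in the write buffer, so they are on disk when the file is read back:
    write_f.seek(0, 0)
    with open(write_f.name) as data: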
I found out what was causing the IndexError and, consequently, the empty final CSV. I resolved it with the help of this question: "CSV file written with Python has blank lines between each row". Here is the changed code, which works as desired:
with open(csvfile.name, 'r') as read_f, \
        tempfile.NamedTemporaryFile(suffix='.csv', prefix=('inter'), newline='', mode='w+', delete=False) as write_f:
    csv_reader = csv.reader(read_f)
    csv_writer = csv.writer(write_f)
    i = 0
    for row in csv_reader:
        # Append the new list values to that row/list
        row.append(company_list[i])
        row.append(highest_percentage[i])
        # Add the updated row / list to the output file
        csv_writer.writerow(row)
        i += 1
with open(write_f.name) as read_stuff, \
        open(fileout, 'w', newline='') as write_stuff:
    read_data = csv.reader(read_stuff)
    write_data = csv.writer(write_stuff)
    sortedlist = sorted(read_data, key=operator.itemgetter(0, 1))
    for row in sortedlist:
        write_data.writerow(row)
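For comparison, here is a self-contained sketch of the same idea that closes the temporary file before reading it back and deletes it afterwards. It is only an illustration under a few assumptions: the function name append_sort_write and its parameters are hypothetical, the input rows are passed in as lists rather than read from csvfile, and the two extra lists are as long as the row list.

```python
import csv
import operator
import os
import tempfile


def append_sort_write(rows, extra_a, extra_b, fileout):
    """Append two values to each row, sort by the first two columns, write fileout."""
    # delete=False lets us reopen the file by name after the with block closes it.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", newline="", delete=False
    ) as tmp:
        writer = csv.writer(tmp)
        for row, a, b in zip(rows, extra_a, extra_b):
            writer.writerow(list(row) + [a, b])
    # Leaving the with block closes tmp, which flushes every buffered row to disk.
    try:
        with open(tmp.name, newline="") as src:
            sorted_rows = sorted(csv.reader(src), key=operator.itemgetter(0, 1))
        with open(fileout, "w", newline="") as dst:
            csv.writer(dst).writerows(sorted_rows)
    finally:
        os.remove(tmp.name)  # the unsorted intermediate file is not kept
    return sorted_rows
```

Note that the values come back from the temporary file as strings, so the sort is a string sort on the first two columns. If the data fits in memory anyway, sorting the rows directly and skipping the temporary file would be simpler still.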
I'm writing a program that runs over a CSV file and needs to check whether one of the lines in the file equals a string I've chosen, but it is not working.
import csv
f= open('myfile.csv')
csv_f = csv.reader(f)
x = 'www.google.com'
for row in csv_f:
    if row[index] == x :
        print "a"
    else:
        print row
What is index? Do you want to check only the first value for equality, or iterate over every value in the row? PS: you should close the file at the end or, better, use a with statement:
with open(filename) as f:
    csv_file = csv.reader(f)
    for row in csv_file:
        ...
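To make that concrete, here is a small Python 3 sketch that assumes the value to compare is in the first column (column 0). The helper name rows_matching is hypothetical, and stripping whitespace before comparing is my own choice rather than something the question asks for.

```python
import csv


def rows_matching(lines, target, column=0):
    """Return the rows whose value in `column` equals `target` after stripping spaces."""
    matches = []
    for row in csv.reader(lines):
        # Skip blank or short rows instead of raising IndexError.
        if len(row) > column and row[column].strip() == target:
            matches.append(row)
    return matches


# Typical use with a file (the with statement closes it automatically):
#
#     with open("myfile.csv", newline="") as f:
#         for row in rows_matching(f, "www.google.com"):
#             print(row)
```

csv.reader accepts any iterable of lines, so the same function works on an open file or on a plain list of strings.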
I am new to handling CSV files with Python and I want to write code that does the following. I have a pattern such as:
pattern="3-5;7;10-16" (which may vary)
and I want to delete (in this case) rows 3 to 5, 7 and 10 to 16.
Does anyone have an idea how to do that?
You cannot simply delete lines from a CSV file in place. Instead, you have to read it in and then write back only the rows you want to keep. The following code works:
import csv
pattern = "3-5;7;10-16"
# Build the list of 1-based row numbers to drop
off = []
for i in pattern.split(';'):
    if '-' in i:
        off += range(int(i.split('-')[0]), int(i.split('-')[1]) + 1)
    else:
        off += [int(i)]
with open('test.txt') as f:
    reader = csv.reader(f)
    # Keep only the rows whose number is not listed in the pattern
    reader = [','.join(item) for i, item in enumerate(reader) if i + 1 not in off]
print(reader)
with open('input.txt', 'w') as f2:
    for i in reader:
        f2.write(i + '\n')
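The pattern parsing can also be pulled out into a small function, which makes it easier to test on its own. This is a sketch only: parse_pattern and drop_rows are hypothetical names, and it assumes row numbers are 1-based and ranges are inclusive, as in the question.

```python
def parse_pattern(pattern):
    """Turn a string such as "3-5;7;10-16" into a set of 1-based row numbers."""
    rows = set()
    for part in pattern.split(";"):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(n) for n in part.split("-", 1))
            rows.update(range(start, end + 1))  # ranges are inclusive
        else:
            rows.add(int(part))
    return rows


def drop_rows(rows, pattern):
    """Return the rows whose 1-based position is not listed in pattern."""
    unwanted = parse_pattern(pattern)
    return [row for number, row in enumerate(rows, start=1) if number not in unwanted]
```

drop_rows works on any sequence of rows, for example list(csv.reader(f)), and the result can be written back with csv.writer(...).writerows(...). Using a set keeps the membership test cheap even for long patterns.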
I'm new to Python and I'm struggling with this code. I have two files: the first is a text file containing email addresses (one per line), and the second is a CSV file with 5-6 columns. The script should take each search term from file 1 and look for it in file 2, and the output should be stored in another CSV file (only the first 3 columns); see the example below. I have also included the script I was working on. If there is a better or more efficient approach, please let me know. Thank you, I appreciate your help.
File1 (output.txt)
rrr#company.com
eee#company.com
ccc#company.com
File2 (final.csv)
Sam,Smith,sss#company.com,admin
Eric,Smith,eee#company.com,finance
Joe,Doe,jjj#company.com,telcom
Chase,Li,ccc#company.com,IT
output (out_name_email.csv)
Eric,Smith,eee#company.com
Chase,Li,ccc#company.com
Here is the script
import csv
outputfile = 'C:\\Python27\\scripts\\out_name_email.csv'
inputfile = 'C:\\Python27\\scripts\\output.txt'
datafile = 'C:\\Python27\\scripts\\final.csv'
names=[]
with open(inputfile) as f:
    for line in f:
        names.append(line)
with open(datafile, 'rb') as fd, open(outputfile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fd, delimiter=",")
    headers = next(reader)
    for row in fd:
        for name in names:
            if name in line:
                writer.writerow(row)
Load the emails into a set for O(1) lookup:
with open(inputfile) as fin:
    emails = set(line.strip() for line in fin)
Then loop over the rows once and check whether the row's email is in emails; there is no need to loop over every possible match for each row. In your sample data the email is the third column, so its index is 2:
# ...
for row in reader:
    if row[2] in emails:
        writer.writerow(row)
If you're not doing anything else, you can reduce it to:
writer.writerows(row for row in reader if row[2] in emails)
A couple of notes: in your original code you never use the csv.reader object reader (you loop over fd instead), and the names names, line and row appear to be mixed up.
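Putting the pieces together, a Python 3 version might look like the sketch below. It assumes the email sits in the third column (index 2), as in the sample data, and that only the first three columns should be written, as the question asks. The function names and the email_column and keep_columns parameters are made up for illustration.

```python
import csv


def filter_rows(rows, emails, email_column=2, keep_columns=3):
    """Keep rows whose email is in `emails`, trimmed to the first keep_columns fields."""
    wanted = {e.strip() for e in emails if e.strip()}  # set gives O(1) lookups
    return [
        row[:keep_columns]
        for row in rows
        if len(row) > email_column and row[email_column] in wanted
    ]


def filter_file(inputfile, datafile, outputfile):
    """File-based wrapper: the three paths are supplied by the caller."""
    with open(inputfile) as fin:
        emails = list(fin)
    with open(datafile, newline="") as fd, open(outputfile, "w", newline="") as out:
        csv.writer(out).writerows(filter_rows(csv.reader(fd), emails))
```

Unlike the script in the question, this does not skip a header row; if final.csv has one, call next(reader) before filtering.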
I have two files named input.csv (a single column, count) and output.csv (a single column, id).
I want to paste the count column into output.csv, just after the id column.
Here is my snippet:
with open ("/home/julien/input.csv", "r") as csvinput:
    with open ("/home/julien/excel/output.csv", "a") as csvoutput:
        writer = csv.writer(csvoutput, delimiter = ";")
        for row in csv.reader(csvinput, delimiter = ";"):
            if row[0] != "":
                result = row[0]
            else:
                result = ""
            row.append(result)
            writer.writerow(row)
But it doesn't work.
I've been searching for a solution for many hours without success. Do you have any tips for solving this?
Thanks! Julien
You need to work with three files: two for reading and one for writing.
This should work.
import csv
in_1_name = "/home/julien/input.csv"
in_2_name = "/home/julien/excel/output.csv"
out_name = "/home/julien/excel/merged.csv"
with open(in_1_name) as in_1, open(in_2_name) as in_2, open(out_name, 'w') as out:
    reader1 = csv.reader(in_1, delimiter=";")
    reader2 = csv.reader(in_2, delimiter=";")
    writer = csv.writer(out, delimiter=";")
    for row1, row2 in zip(reader1, reader2):
        if row1[0] and row2[0]:
            # id (from output.csv) first, then count (from input.csv)
            writer.writerow([row2[0], row1[0]])
You write the row for each column:
        row.append(result)
        writer.writerow(row)
Dedent the last line to write only once:
        row.append(result)
    writer.writerow(row)
1. Open both files for input.
2. Open a new file for output.
3. In a loop, read a line from each file and format an output line, which is then written to the output file.
4. Close all the files.
5. Programmatically copy your output file on top of the input file "output.csv".
6. Done.
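The steps above can be sketched in Python 3 roughly as follows. This is one possible reading, not a definitive implementation: paste_column is a hypothetical name, the temporary file is created next to the target so the final rename stays on one filesystem, and os.replace stands in for "copy on top of the input file".

```python
import csv
import os
import tempfile


def paste_column(target_path, column_path, delimiter=";"):
    """Append the first column of column_path to each row of target_path, in place."""
    directory = os.path.dirname(os.path.abspath(target_path))
    with open(target_path, newline="") as target, \
            open(column_path, newline="") as extra, \
            tempfile.NamedTemporaryFile("w", newline="", dir=directory,
                                        suffix=".csv", delete=False) as tmp:
        writer = csv.writer(tmp, delimiter=delimiter)
        pairs = zip(csv.reader(target, delimiter=delimiter),
                    csv.reader(extra, delimiter=delimiter))
        for target_row, extra_row in pairs:
            # Keep the existing columns and add the first field of the other file.
            writer.writerow(target_row + extra_row[:1])
    # All three files are closed here, so the temporary file can replace the target.
    os.replace(tmp.name, target_path)
```

For the files in the question the call would be paste_column("/home/julien/excel/output.csv", "/home/julien/input.csv"). Because zip stops at the shorter input, rows beyond the end of the shorter file are dropped; itertools.zip_longest would be the alternative if they should be kept.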
Given two tables, combining the first column of each is very easy. With my library pyexcel, the merge reads just like merging tables:
>>> from pyexcel import Reader,Writer
>>> f1=Reader("input.csv", delimiter=';')
>>> f2=Reader("output.csv", delimiter=';')
>>> columns = [f1.column_at(0), f2.column_at(0)]
>>> f3=Writer("merged.csv", delimiter=';')
>>> f3.write_columns(columns)
>>> f3.close()