Python search csv file from input text file - python

I'm new to Python and I'm struggling with this code. I have two files: the first is a text file containing email addresses (one per line), and the second is a CSV file with 5-6 columns. The script should take each search term from file 1, look it up in file 2, and store the matching rows in another CSV file (only the first 3 columns); see the example below. I have also included the script I was working on. If there is a better or more efficient approach, please let me know. Thank you, I appreciate your help.
File1 (output.txt)
rrr#company.com
eee#company.com
ccc#company.com
File2 (final.csv)
Sam,Smith,sss#company.com,admin
Eric,Smith,eee#company.com,finance
Joe,Doe,jjj#company.com,telcom
Chase,Li,ccc#company.com,IT
output (out_name_email.csv)
Eric,Smith,eee#company.com
Chase,Li,ccc#company.com
Here is the script
import csv
outputfile = 'C:\\Python27\\scripts\\out_name_email.csv'
inputfile = 'C:\\Python27\\scripts\\output.txt'
datafile = 'C:\\Python27\\scripts\\final.csv'
names=[]
with open(inputfile) as f:
    for line in f:
        names.append(line)
with open(datafile, 'rb') as fd, open(outputfile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fd, delimiter=",")
    headers = next(reader)
    for row in fd:
        for name in names:
            if name in line:
                writer.writerow(row)

Load the emails into a set for O(1) lookup:
with open(inputfile) as fin:
    emails = set(line.strip() for line in fin)
Then loop over the rows once and check whether each row's email (the third column, index 2, in your sample data) is in emails. There is no need to loop over every possible match for each row:
# ...
for row in reader:
    if row[2] in emails:
        writer.writerow(row[:3])  # keep only the first 3 columns
If you're not doing anything else, you can shorten it to:
writer.writerows(row[:3] for row in reader if row[2] in emails)
A couple of notes: in your original code you never use the csv.reader object reader (you loop over fd instead), and the names names, line and row are mixed up: the inner test checks line, which is left over from the first loop, rather than row.
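Putting the pieces together, a minimal Python 3 sketch might look like the following. It assumes the email address sits in the third column (index 2), as in the sample final.csv, and that the data file has no header row. The function name filter_rows_by_email and its parameters are made up for illustration.

```python
import csv

def filter_rows_by_email(email_path, data_path, out_path, email_col=2, keep_cols=3):
    """Copy rows whose email appears in email_path; return the number of rows written.

    email_col and keep_cols are assumptions based on the sample data
    (email in the third column, first three columns kept).
    """
    # A set gives constant-time membership tests; skip empty lines.
    with open(email_path, newline='') as fin:
        emails = {line.strip() for line in fin if line.strip()}

    matched = 0
    with open(data_path, newline='') as fd, open(out_path, 'w', newline='') as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fd):
            # Guard against short or blank rows before indexing.
            if len(row) > email_col and row[email_col].strip() in emails:
                writer.writerow(row[:keep_cols])
                matched += 1
    return matched
```

Called as filter_rows_by_email('output.txt', 'final.csv', 'out_name_email.csv'), this reads each file once, so the cost grows with the size of the two files rather than with their product.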

Related

Delete rows from csv file using function in Python

def usunPsa(self, ImiePsa):
    with open('schronisko.csv', 'rb') as input, open('schronisko.csv', 'wb') as output:
        writer = csv.writer(output)
        for row in csv.reader(input):
            if row[0] == ImiePsa:
                writer.writerow(row)
    with open(self.plik, 'r') as f:
        print(f.read())
Dsac;Chart;2;2020-11-04
Dsac;Chart;3;2020-11-04
Dsac;Chart;4;2020-11-04
Lala;Chart;4;2020-11-04
Sda;Chart;4;2020-11-04
Sda;X;4;2020-11-04
Sda;Y;4;2020-11-04
pawel;Y;4;2020-11-04
If I call usunPsa("pawel"), every line gets removed.
The code above erases my whole CSV file instead of only the one line with the given ImiePsa.
What may be the problem there?
I found the problem. row[0] in your code returns the entire row, which means the lines are not parsed correctly. After a bit of reading, I found that csv.reader has a parameter called delimiter to specify the delimiter between columns.
Adding that parameter solves your problem, but not all problems, though.
The code that worked for me (just in case you still want to use your original code):
import csv
def usunPsa(ImiePsa):
    with open('asd.csv', 'rb') as input, open('schronisko.csv', 'wb') as output:
        writer = csv.writer(output)
        for row in csv.reader(input, delimiter=';'):
            if row[0] != ImiePsa:  # keep every row except the one being removed
                writer.writerow(row)
usunPsa("pawel")
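Notice that I changed the output filename: opening schronisko.csv with 'wb' truncates it immediately, so the same file cannot be read and rewritten in one with statement. If you want to keep the same filename, however, you have to use Hamza Malik's answer.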
Just read the csv file in memory as a list, then edit that list, and then write it back to the csv file.
import csv

lines = list()
members = input("Please enter a member's name to be deleted.")
with open('mycsv.csv', 'r', newline='') as readFile:
    reader = csv.reader(readFile)
    for row in reader:
        # keep the row only if none of its fields matches the name
        if members not in row:
            lines.append(row)
with open('mycsv.csv', 'w', newline='') as writeFile:
    writer = csv.writer(writeFile)
    writer.writerows(lines)
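For the semicolon-delimited file from the question, the same read-then-rewrite idea could be wrapped in a function along these lines. This is only a sketch in Python 3: remove_rows is a hypothetical name, and it assumes the name to match is always in the first column and that the file is small enough to hold in memory.

```python
import csv

def remove_rows(path, name, delimiter=';'):
    """Rewrite path without the rows whose first field equals name.

    Returns the number of rows removed. The whole file is read into
    memory first, so the file is closed before it is reopened for writing.
    """
    with open(path, newline='') as f:
        rows = list(csv.reader(f, delimiter=delimiter))

    # Blank lines come back as empty lists; keep them untouched.
    kept = [row for row in rows if not row or row[0] != name]

    with open(path, 'w', newline='') as f:
        csv.writer(f, delimiter=delimiter).writerows(kept)
    return len(rows) - len(kept)
```

Reading everything before opening the file in 'w' mode is what avoids the truncation problem from the question.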

add a new line after a specific line of a csv file in python3

I am writing Python code that reads a CSV file and writes some information into it. At this stage I need to find one specific row and add a new line of data after it. I have succeeded in finding the row, but I cannot write the new line of data after it. Here is my attempt:
file = open('db.csv', 'r+')
table = csv.reader(file)
for row in table:
    if(row == ['tbl']):
        file.seek(len(row)) #this part is the problem I suppose
        break
table = csv.writer(file)
table.writerow(['1', '2'])
Using file.seek / file.tell is tricky because csv.reader may read ahead, so you cannot tell the exact file position that matches the current row.
Also, inserting is not trivial; you need to remember the remaining parts of the file.
I would do it the following way:
1. Create another file for writing.
2. Write to it according to your needs.
3. Once done, replace the old file with the new file.
import csv
import shutil
with open('db.csv', 'r', newline='') as f, open('db.csv.temp', 'w', newline='') as fout:
    reader = csv.reader(f)
    writer = csv.writer(fout)
    for row in reader:
        writer.writerow(row)
        if row == ['tbl']:
            writer.writerow([]) # empty line
shutil.move('db.csv.temp', 'db.csv')
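To insert actual data, such as the ['1', '2'] row from the question, instead of an empty line, the same temp-file approach could be written as a small function. This is a sketch only: insert_row_after is a hypothetical helper, the '.temp' suffix is an arbitrary choice, and os.replace is used here in place of shutil.move.

```python
import csv
import os

def insert_row_after(path, marker_row, new_row):
    """Copy path to a temp file, adding new_row after each row equal to marker_row.

    Returns how many times new_row was inserted. The temp file then
    replaces the original.
    """
    tmp_path = path + '.temp'  # assumed not to clash with an existing file
    inserted = 0
    with open(path, newline='') as f, open(tmp_path, 'w', newline='') as fout:
        writer = csv.writer(fout)
        for row in csv.reader(f):
            writer.writerow(row)
            if row == marker_row:
                writer.writerow(new_row)
                inserted += 1
    # Swap the files only after both have been closed.
    os.replace(tmp_path, path)
    return inserted
```

For the question's case this would be called as insert_row_after('db.csv', ['tbl'], ['1', '2']).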

Removing blank spaces from a CSV file without creating a new file

I have blank spaces in a csv sheet that I want to get rid of it.
After searching for hours I realized that this is the code for it:
import csv
input = open('file.txt', 'rb')  # read mode; 'wb' here would truncate the file
output = open('new_file.txt', 'wb')
writer = csv.writer(output)
for row in csv.reader(input):
    if any(field.strip() for field in row):
        writer.writerow(row)
input.close()
output.close()
My question is: How do I remove the blank spaces without having to create a new file?
You can first extract the valid rows and then overwrite the file, provided your file is not too big and the rows fit in memory entirely:
with open('file.txt', 'rb') as inp:
    valid_rows = [row for row in csv.reader(inp) if any(field.strip() for field in row)]
with open('file.txt', 'wb') as out:
    csv.writer(out).writerows(valid_rows)
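The snippet above uses Python 2 file modes ('rb' / 'wb'). In Python 3 the csv module expects text-mode files opened with newline='', so an equivalent sketch might look like this. The function name drop_blank_rows is made up for illustration.

```python
import csv

def drop_blank_rows(path):
    """Rewrite path in place, keeping only rows with at least one non-blank field.

    Returns the number of rows kept. All rows are held in memory,
    so this is meant for files of modest size.
    """
    with open(path, newline='') as inp:
        valid_rows = [row for row in csv.reader(inp)
                      if any(field.strip() for field in row)]

    # The input file is closed by now, so reopening it for writing is safe.
    with open(path, 'w', newline='') as out:
        csv.writer(out).writerows(valid_rows)
    return len(valid_rows)
```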

How can I read a CSV file line by line while keeping track of the column headers?

When using this code:
with open(filepath, 'r') as f:
    reader = csv.reader(f)
    for i, line in enumerate(reader):
        print 'line[{}] = {}'.format(i, line)
It reads my CSV files line by line, but I can't select the line I want by its header. The index will probably change from file to file, so I feel this wouldn't be a good way to select the column I want in a row. What's a good way to approach this?
As the csv documentation describes, use DictReader instead of plain reader. Your implementation then becomes:
import csv
with open(filepath, 'r') as f:
    reader = csv.DictReader(f)
    for i, line in enumerate(reader):
        print 'line[{}] = {}'.format(i, line['header_name'])
Documentation on DictReader found here: https://docs.python.org/2/library/csv.html#csv.DictReader
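The answer's code is Python 2 (print statement). A Python 3 sketch of the same idea follows; column_values is a hypothetical helper, and the sample header names and data are invented for the demonstration.

```python
import csv
import io

def column_values(f, header_name):
    """Return every value in the column named header_name.

    DictReader takes the first row as the header, so the column is
    found by name regardless of its position in the file.
    """
    reader = csv.DictReader(f)
    if header_name not in (reader.fieldnames or []):
        raise KeyError(header_name)
    return [row[header_name] for row in reader]

# In-memory sample standing in for a real file opened with newline=''.
sample = io.StringIO("name,email\nEric,eee#company.com\nChase,ccc#company.com\n")
print(column_values(sample, "email"))
```

Checking fieldnames up front gives a clear error for a missing header instead of failing on the first data row.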

Combine two csv into a single one with page breaks in python

So I'm trying to combine two different CSV files into a single one, and I've done that. The two CSV files list students in school: those who are present in one and those who are absent in the other.
I need to put the date the file was created at the top of the new CSV and have each grade of the present students start on a new page or after 3 blank rows.
Also, on each new page or after each 3 blank rows I want to have the name of the teacher, the date on which the file was created, and the grade.
import csv
with open('inschool.csv', encoding="cp437") as f:
    reader = csv.reader(f)
    in_school = list(reader)
with open('notinschool.csv', encoding="cp437") as f:
    reader = csv.reader(f)
    not_in_school = list(reader)
for grade, name, status, hr_teacher in not_in_school:
    print(grade, name, status, hr_teacher)
for grade, name, status, hr_teacher in in_school:
    print(grade, name, status, hr_teacher)
iFile = open('inschool.csv', encoding="cp437")
reader = csv.reader(iFile)
IFILE = open('notinschool.csv', encoding="cp437")
READER = csv.reader(IFILE)
oFile = open('combined.csv','wt',encoding="cp437")
writer = csv.writer(oFile, delimiter='|', quoting=csv.QUOTE_ALL)
for row in READER:
    writer.writerow(row)
writer.writerow("[]")
for row in reader:
    writer.writerow(row)
writer.writerow("[]")
The code I tried for the 3 blank rows had this ending, but it gave 3 blank rows/lines after each student's name instead of after each grade.
iFile = open('Inschool.csv',)
reader = csv.reader(iFile)
IFILE = open('notinschool.csv')
READER = csv.reader(IFILE)
oFile = open('combined.csv','wb')
writer_a = csv.writer(oFile, delimiter='|', quoting=csv.QUOTE_ALL)
writer_b = csv.writer(oFile, delimiter='|', quoting=csv.QUOTE_ALL, lineterminator="\n\n\n\n")
for row in READER:
    writer_a.writerow(row)
writer_b.writerow([])
for row in reader:
    writer_b.writerow(row)
I would appreciate it if someone could help me. Thanks.
You can do it really easily in the terminal. Just cd to the directory and run the command cat inschool.csv notinschool.csv > combined.csv
If you want to do it in Python I would do:
in_file1 = open("inschool.csv","r").read().split("\n")
in_file2 = open("notinschool.csv","r").read().split("\n")
out_file = open("combined.csv","w")
for line in in_file1:
    if line:
        out_file.write(line + "\n")
for line in in_file2:
    if line:
        out_file.write(line + "\n")
out_file.close()  # flush the combined data to disk
Reading files the way shown above isn't the most efficient, but if they are small it doesn't really matter, and it's easier to visualize what's happening. You can use your input file method with this because the concept stays the same :)
I just got into using this module called pandas and it is for DataFrames. They are much easier to use, process, navigate through, and merge than parsing text files.
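Neither approach above addresses the per-grade layout the question asks for. One way to get blank rows between grades rather than between students is to group the rows by grade before writing. The sketch below uses itertools.groupby from the standard library, not pandas. It assumes the column order grade, name, status, hr_teacher from the question's loops and takes the teacher of the first student in each grade as that grade's teacher. write_grouped and its parameters are hypothetical names.

```python
import csv
from datetime import date
from itertools import groupby

def write_grouped(rows, out, created, blank_rows=3):
    """Write rows to out, grouped by grade, with blank rows before each grade.

    rows: lists of [grade, name, status, hr_teacher] (assumed column order).
    created: a datetime.date written at the top and in each grade header.
    """
    writer = csv.writer(out, delimiter='|', quoting=csv.QUOTE_ALL)
    writer.writerow([created.isoformat()])

    # groupby only merges adjacent items, so sort by grade first.
    # Grades are compared as strings here, so '10' sorts before '9'.
    ordered = sorted(rows, key=lambda r: r[0])
    for grade, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        for _ in range(blank_rows):
            writer.writerow([])  # an empty row is written as a blank line
        # Header line for the grade: teacher, creation date, grade.
        writer.writerow([group[0][3], created.isoformat(), grade])
        writer.writerows(group)
```

It could be called with the in_school list from the question and a file opened as open('combined.csv', 'w', newline='', encoding='cp437'). The date has to be supplied by the caller, for example date.today().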
