Finding a name in a CSV file - python

I have code that checks whether an entered name is present in a CSV file and prints yes if it is, otherwise no. But when I enter a name that is in the CSV, it still prints no.
Here is the code:
import csv
f=open("student.csv","r")
reader=csv.reader(f)
for row in reader:
    print
studentToFind = raw_input("Enter the name of student?")
if studentToFind in reader:
    print('yes')
else:
    print('no')
f.close()

Simply ask the question before you loop over the file:
import csv
studentToFind = raw_input("Enter the name of student?")
f=open("student.csv","r")
reader=csv.reader(f)
found = "No"
for row in reader:
    if studentToFind in row:
        found = "Yes"
f.close()
print('{}'.format(found))

You've got a couple of issues.
First, reader is already exhausted at this point, since you've looped over all of its elements. Reading from a file is a one-time deal; if you want to access its contents more than once, you need to store them in a data structure, e.g.:
rows = []
with open("student.csv", newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        rows.append(row)
However this also won't be sufficient, because rows is now a 2D list, as each row the reader returns is itself a list. An easy way to search for a value in nested lists is with list comprehensions:
if studentToFind in [cell for row in rows for cell in row]:
    print('yes')
else:
    print('no')
Put it together, so the indentation's easier to see:
import csv
studentToFind = input("Enter the name of student?")
rows = []
with open("student.csv", newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        rows.append(row)
if studentToFind in [cell for row in rows for cell in row]:
    print('yes')
else:
    print('no')
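The one-shot behaviour described above can be checked without touching a real file. The following is only a minimal sketch, assuming Python 3: the in-memory SAMPLE text stands in for student.csv, and name_in_rows is a hypothetical helper, not something from the question.

```python
import csv
import io

# Made-up sample data standing in for student.csv (an assumption for illustration).
SAMPLE = "Alice,10\nBob,11\n"


def name_in_rows(name, rows):
    """Return True if any cell in any row is exactly equal to name."""
    return any(name in row for row in rows)


reader = csv.reader(io.StringIO(SAMPLE))
first_pass = list(reader)   # consumes the reader
second_pass = list(reader)  # the reader is exhausted, so nothing is left

found = name_in_rows("Bob", first_pass)
missing = name_in_rows("Carol", first_pass)
```

Because the rows are kept in a list, they can be searched as many times as needed after the file has been read once.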

You've already iterated over the file once. When you try to loop over reader again there is nothing to loop over.
Instead, don't even use the csv module and save the lines of the file in a list. Note that this only matches when the name is the whole line, i.e. the file has one name per line:
with open("student.csv","r") as f:
    lines = []
    for line in f:
        lines.append(line.rstrip())
studentToFind = raw_input("Enter the name of student?")
if studentToFind in lines:
    print('yes')
else:
    print('no')

Related

Python CSV: IndexError: List Index out of range

I have a very weird problem when working with csv. My code is:
with open('CFD.csv', 'rt') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        if cfd_number == row[0]:
            cfd_checked_before = "Yes"
This code works on Mac, but on Windows I get the following error:
IndexError: list index out of range
It's common to have empty lines in csv files, especially at the end of the file. If you want your code to tolerate this common problem, just check that the row list is not empty before using it.
with open('CFD.csv', 'rt') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        if row:
            if cfd_number == row[0]:
                cfd_checked_before = "Yes"
You can also use filter for the same task. When its first parameter is None, it removes "falsey" things like empty lists:
with open('CFD.csv', 'rt') as f:
    reader = csv.reader(f, delimiter=',')
    for row in filter(None, reader):
        if cfd_number == row[0]:
            cfd_checked_before = "Yes"
There are a few possible causes:
The csv file you open on Windows may not be identical to the one you use on Mac.
The problem may be in the row[0] access.
The csv file may not contain any comma delimiter, or some rows may be empty lines, as already stated.
Try printing the row variable, or even the reader variable, to check.
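The empty-row explanation can be reproduced in isolation. This is a small sketch under stated assumptions: Python 3, a made-up SAMPLE string with blank lines in place of CFD.csv, and a hypothetical helper first_column_values. It shows that csv.reader yields an empty list for a blank line, which is what makes row[0] fail.

```python
import csv
import io

# Made-up data with a blank line in the middle and at the end (an assumption).
SAMPLE = "CFD-1,open\n\nCFD-2,closed\n\n"


def first_column_values(text):
    """Collect the first cell of every non-empty row."""
    reader = csv.reader(io.StringIO(text))
    return [row[0] for row in reader if row]


# Blank lines come back as empty lists, so row[0] would raise IndexError on them.
all_rows = list(csv.reader(io.StringIO(SAMPLE)))
values = first_column_values(SAMPLE)
```

Printing all_rows is a quick way to see whether a given file contains such empty rows.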

Python find matching string in each line

I would like to read each row of the csv file and match each word in the row against a list of strings. If any of the strings appears in the row, that string should be written at the end of the line, separated by a comma.
The code below doesn't give me what I want.
file = 'test.csv'
read_files = open(file)
lines=read_files.read()
text_lines = lines.split("\n")
name=''
with open('testnew2.csv','a') as f:
    for line in text_lines:
        line=str(line)
        #words = line.split()
        with open('names.csv', 'r') as fd:
            reader = csv.reader(fd, delimiter=',')
            for row in reader:
                if row[0] in line:
                    name=row
                    print(name)
                    f.write(line+","+name[0]+'\n')
A sample of test.csv would look like this:
A,B,C,D
ABCD,,,
Total,Robert,,
Name,Annie,,
Total,Robert,,
And the names.csv would look:
Robert
Annie
Amanda
The output I want is:
A,B,C,D,
ABCD,,,,
Total,Robert,,,Robert
Name,Annie,,,Annie
Total,Robert,,,Robert
Currently the code will get rid of lines that don't result in a match, so I got:
Total,Robert,,,Robert
Name,Annie,,,Annie
Total,Robert,,,Robert
Process each line by testing row[1] and appending the 5th column, then writing it. The name list isn't really a csv. If it's really long use a set for lookup. Read it only once for efficiency as well.
import csv
with open('names.txt') as f:
    names = set(f.read().strip().splitlines())
# newline='' per Python 3 csv documentation...
with open('input.csv',newline='') as inf:
    with open('output.csv','w',newline='') as outf:
        r = csv.reader(inf)
        w = csv.writer(outf)
        for row in r:
            row.append(row[1] if row[1] in names else '')
            w.writerow(row)
Output:
A,B,C,D,
ABCD,,,,
Total,Robert,,,Robert
Name,Annie,,,Annie
Total,Robert,,,Robert
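For a version that can be run without any files, here is a hedged sketch assuming Python 3. It uses the sample data from the question as in-memory strings and a hypothetical helper annotate. Unlike the answer above, which tests only row[1], it checks every cell in the row, which is an assumption about what "any of the strings appears in the row" should mean.

```python
import csv
import io

# Sample data copied from the question, held in memory instead of in files.
TEST_CSV = "A,B,C,D\nABCD,,,\nTotal,Robert,,\nName,Annie,,\nTotal,Robert,,\n"
NAMES = "Robert\nAnnie\nAmanda\n"


def annotate(rows, names):
    """Append the first cell that is a known name, or '' if no cell matches."""
    annotated_rows = []
    for row in rows:
        match = next((cell for cell in row if cell in names), "")
        annotated_rows.append(row + [match])
    return annotated_rows


names = set(NAMES.splitlines())  # set lookup, read only once
rows = list(csv.reader(io.StringIO(TEST_CSV)))
annotated = annotate(rows, names)

# Write to an in-memory buffer; a real script would write to a file instead.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(annotated)
result = buffer.getvalue()
```

Every input row is written exactly once, so rows without a match are kept and simply get an empty last column.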
I think the problem is you're only writing when the name is in the row. To fix that move the writing call outside of the if conditional:
if row[0] in line:
    name=row
    print(name)
f.write(line+","+name[0]+'\n')
I'm guessing that print statement is for testing purposes?
EDIT: On second thought, you may need to move name='' inside the loop as well so it is reset after each row is written, that way you don't get names from matched rows bleeding into unmatched rows.
EDIT: Decided to show an implementation that should avoid the (possible) problem of two matched names in a row:
EDIT: Changed name=row and the call of name[0] in f.write() to name=row[0] and a call of name in f.write()
import csv
file = 'test.csv'
read_files = open(file)
lines=read_files.read()
text_lines = lines.split("\n")
with open('testnew2.csv','a') as f:
    for line in text_lines:
        name=''
        line=str(line)
        #words = line.split()
        with open('names.csv', 'r') as fd:
            reader = csv.reader(fd, delimiter=',')
            for row in reader:
                # skip blank rows and stop at the first matching name
                if row and row[0] in line:
                    name=row[0]
                    print(name)
                    break
        # written once per line, whether or not a name matched
        f.write(line+","+name+'\n')
Try this as well:
import csv
myFile = open('testnew2.csv', 'wb+')
writer = csv.writer(myFile)
reader2 = open('names.csv').readlines()
with open('test.csv') as File1:
    reader1 = csv.reader(File1)
    for row in reader1:
        for record in reader2:
            record = record.replace("\n","")
            if record in row:
                row.append(record)
                break
        else:
            # no name matched: keep the row with an empty last column
            row.append("")
        writer.writerow(row)
myFile.close()

For Loop on csv file Python

I'm writing a program that runs over a csv file and needs to check whether one of the lines in the file equals a string I've chosen, but it is not working.
import csv
f= open('myfile.csv')
csv_f = csv.reader(f)
x = 'www.google.com'
for row in csv_f:
    if row[index] == x :
        print "a"
    else:
        print row
What is index? Do you want to check only the first value for equality, or iterate over every value in the row? P.S. You should close the file at the end or, better, use a with statement.
with open(filename) as f:
    csv_file = csv.reader(f)
    for row in csv_file:
        ...
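The two readings of the question (compare one column, or compare every cell) can be sketched as follows. This assumes Python 3; the SAMPLE data and the helper rows_matching are made up for illustration and are not part of the question.

```python
import csv
import io

# Made-up sample data standing in for myfile.csv (an assumption).
SAMPLE = "www.example.com,1\nwww.google.com,2\n"
TARGET = "www.google.com"


def rows_matching(text, target, index=None):
    """Return rows whose cell at index equals target, or any cell if index is None."""
    matches = []
    for row in csv.reader(io.StringIO(text)):
        if index is None:
            hit = target in row  # exact match against any cell
        else:
            hit = len(row) > index and row[index] == target
        if hit:
            matches.append(row)
    return matches


by_first_column = rows_matching(SAMPLE, TARGET, index=0)
by_any_column = rows_matching(SAMPLE, TARGET)
by_second_column = rows_matching(SAMPLE, TARGET, index=1)
```

The length check before row[index] keeps short or empty rows from raising IndexError.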

Python not entering in for

Why is unique[1] never accessed in the second for loop?
unique is a list of strings.
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    for i in range(len(unique)):
        # print unique[i] #prints all the items in the array
        for row in reader:
            print unique[i] # always prints the first item unique[0]
            if row[1]==unique[i]:
                print row[1], row[0] # prints only the unique[0] stuff
Thank you
I think it would be useful to go through the program flow.
First, i is set to 0 and the inner loop reads the entire CSV file, printing unique[0] for each line. On the second iteration i is 1, but the reader has already reached the end of the file, so the body of for row in reader: never runs. The same happens for every later value of i.
Further Clarification
The csv.reader(f) won't actually read the file until you do for row in reader, and after that it has nothing more to read. If you want to read the file multiple times, then read it into a list first beforehand, like this:
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    rows = [row for row in reader]
for i in range(len(unique)):
    for row in rows:
        print unique[i]
        if row[1]==unique[i]:
            print row[1], row[0]
I think you might have better luck if you change your nested structure to:
import csv
res = {}
for x in unique:
    res[x] = []
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        for i in range(len(unique)):
            # print unique[i] #prints all the items in the array
            if row[1]==unique[i]:
                res[unique[i]].append([row[1],row[0]])
                #print row[1], row[0] # prints only the unique[0] stuff
for x in unique:
    print res[x]
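The same single-pass idea can be written with a dictionary lookup instead of an inner loop over unique. This is only a sketch, assuming Python 3; the SAMPLE data, the contents of unique, and the helper group_by_second_column are invented for illustration.

```python
import csv
import io

# Made-up sample data standing in for file.csv (an assumption).
SAMPLE = "1,red\n2,blue\n3,red\n"
unique = ["red", "blue", "green"]


def group_by_second_column(text, wanted):
    """Map each wanted value to the first-column cells of rows whose second cell matches."""
    groups = {value: [] for value in wanted}
    for row in csv.reader(io.StringIO(text)):
        if len(row) > 1 and row[1] in groups:
            groups[row[1]].append(row[0])
    return groups


groups = group_by_second_column(SAMPLE, unique)
```

The file is read exactly once, and values in unique that never occur simply keep an empty list.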

Matching a string with cell in CSV file and returning adjacent cells

I have the following code:
for i in self.jobs:
    with open('postcodes.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            if row[0] == self.jobs[i][3]:
                self.jobs[i].append((row[1],row[2]))
            else:
                self.jobs[i].append('lat & lng not available')
My problem is that this produces "lat & lng not available" for every non-matching row in the csv file. I only want the values from the two adjacent cells when a postcode matches, and a single 'lat & lng not available' when nothing matches.
See http://pastebin.com/gX5HtJV4 for full code
SSCCE could be as follows:
reader = [('HP2 4AA', '51.752927', '-0.470095'), ('NE33 3GA', '54.991663', '-1.414911'), ('CV1 1FL','52.409463', '-1.509234')]
selfjobs = ['NE33 3AA', 'CV1 1FL', 'HP2 4AA']
latlng = []
for row in reader:
    for i in selfjobs:
        if i in row[0]:
            latlng.append((row[1],row[2]))
        else:
            latlng.append(('not available','not available'))
print latlng
Following Martineau's help in the comments, this is the code I ended up with:
for i in self.jobs:
    job = self.jobs[i]
    postcode = job[3]
    home = (54.764919,-1.368824)
    with open('postcodes.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            postcode_csv = row[0]
            if postcode in postcode_csv:
                job.append((row[1], row[2]))
            else:
                job.append(home)
I think at least part of the problem is that you actually have the following in your pastebin code:
for i in self.jobs:
    with open('postcodes.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            if row[0] == self.jobs[i][3]:
                self.jobs[i].append((row[1], row[2]))
            elif self.jobs[i][3] != row[0]:
                self.jobs[i].append("nothing")
However, since the i in the for i in self.jobs loop is itself a list, it can't be used as an index into self.jobs like that. Instead, I think it would make more sense to do something like the following in the loop:
for job in self.jobs:
    with open('postcodes.csv', 'rb') as f:
        for row in csv.reader(f):
            if row[0] == job[3]:
                job.append((row[1], row[2]))
                break
        else:  # no match
            job.append("nothing")
...which only indexes the fields of data in the rows read in from the csv file. For efficiency, it stops reading the file as soon as it finds a match. If it ever reads through the whole file without finding a match, it appends "nothing" to indicate this, which is what the else clause of the inner for loop is doing.
BTW, it also seems rather inefficient to open and potentially read through the entire postcodes.csv file for every entry in self.jobs, so you might want to consider reading the whole thing into a dictionary, once, before executing the for job in self.jobs: loop (assuming the file's not too large for that).
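The dictionary suggestion could look roughly like this. It is a sketch under assumptions, not the poster's code: Python 3, the three postcode rows from the SSCCE held in an in-memory string instead of postcodes.csv, a hypothetical helper load_postcodes, and made-up job lists whose only known property is that the postcode sits at index 3.

```python
import csv
import io

# Rows taken from the SSCCE above, held in memory instead of postcodes.csv.
POSTCODES_CSV = (
    "HP2 4AA,51.752927,-0.470095\n"
    "NE33 3GA,54.991663,-1.414911\n"
    "CV1 1FL,52.409463,-1.509234\n"
)


def load_postcodes(text):
    """Build {postcode: (lat, lng)} in a single pass over the CSV text."""
    return {
        row[0]: (row[1], row[2])
        for row in csv.reader(io.StringIO(text))
        if len(row) >= 3  # ignore blank or short rows
    }


lookup = load_postcodes(POSTCODES_CSV)

# Hypothetical jobs; only the postcode at index 3 mirrors the original code.
jobs = [["job-a", "x", "y", "NE33 3AA"], ["job-b", "x", "y", "CV1 1FL"]]
for job in jobs:
    job.append(lookup.get(job[3], "nothing"))
```

With the lookup built once, each job costs a single dictionary access instead of a full read of the file.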
