Python csv to dictionary with first line as title - python

I have a file file.csv with some data:
fn,ln,tel
john,doe,023322
jul,dap,024322
jab,sac,0485
I would like to have an array that I can access like this:
file = 'file.csv'
with open(file,'rU') as f:
    reader = csv.DictReader(f)
print reader[0].fn
So I would like that it prints the first name from the first record. Unfortunately, I get this error:
ValueError: I/O operation on closed file
How can I get it done so that I don't need to keep the file opened and that I can play with my array. Btw, I don't need to write back in the csv file, I just need to use the data and for that, an array that I can modify would be best.

You need to access the reader within the with block, not outside of it:
import csv
file = 'file.csv'
with open(file,'rU') as f:
    reader = csv.DictReader(f)
    first_row = next(reader)
    print first_row['fn']
As soon as you move code outside the block, the f file object is closed and you cannot obtain rows from the reader anymore. This is kind of the point of the with statement.
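A minimal sketch of that behaviour, written for Python 3 and using an in-memory io.StringIO in place of file.csv (an assumption made so the example runs anywhere):

```python
import csv
import io

# In-memory stand-in for file.csv (assumption: same columns as in the question).
f = io.StringIO("fn,ln,tel\njohn,doe,023322\njul,dap,024322\n")

with f:
    reader = csv.DictReader(f)
    first_row = next(reader)    # works: the file is still open here

closed_after_block = f.closed   # the with statement has closed the file

try:
    next(reader)                # the reader tries to pull another line from f
    failed = False
except ValueError:
    failed = True               # reading from a closed file raises ValueError
```

The reader holds no rows of its own; it only pulls lines from f on demand, which is why it stops working once f is closed.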
If you want to have random access to all rows in the file, convert the reader to a list first:
file = 'file.csv'
with open(file,'rU') as f:
    reader = csv.DictReader(f)
    all_rows = list(reader)
print all_rows[0]['fn']
The list() call will iterate over the reader, adding each result yielded to the list object until all rows are read. Make sure you have enough memory to hold all those rows.
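As a rough Python 3 illustration of the "array that I can modify" part of the question, the sketch below loads the sample data from an in-memory string (an assumption, standing in for file.csv) and then edits the rows after the file is closed:

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the real file.
sample = "fn,ln,tel\njohn,doe,023322\njul,dap,024322\njab,sac,0485\n"

with io.StringIO(sample) as f:
    all_rows = list(csv.DictReader(f))

# The file is closed now, but the rows are ordinary dicts held in memory.
all_rows[0]["tel"] = "000000"
first_names = [row["fn"] for row in all_rows]
```

Each row is keyed by the header line, so fields are read with row["fn"] rather than row.fn.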

Related

Writing in separate columns in csv python

from operator import itemgetter
COLS = 15,21,27
COLS1 = 16,22,28
filename = "result.csv"
getters = itemgetter(*(col-1 for col in COLS))
getters1 = itemgetter(*(col-1 for col in COLS1))
with open('result.csv', newline='') as csvfile:
    for row in csv.reader(csvfile):
        row = zip(getters(row))
    for row1 in csv.reader(csvfile):
        row1 = zip(getters1(row1))
print(row)
print(row1)
with open('results1.csv', "w", newline='') as f:
    fieldnames = ['AAA','BBB']
    writer = csv.writer(f,delimiter=",")
    for row in row:
        writer.writerow(row)
        writer.writerow(row1)
I am getting a NameError: name 'row1' is not defined error. I want to write each of the COLS in a separate column in the results1 file. How would I go about this?
So, there are a few things going on in the code that are potentially leading to errors.
First is the way csv.reader(csvfile) works in Python. The reader is an iterator: each time the for loop asks for the next item, it reads the next line of the file and returns it. The csv part parses the .csv format and returns each line as a list of fields, rather than the plain string that the standard file object would give you. This is fine for a lot of use cases, but the issue we are running into here is that when you run:
for row in csv.reader(csvfile):
    row = zip(getters(row))
the for loop keeps pulling rows from csv.reader(csvfile) and only stops when it runs out of data in the "result.csv" file. So if you want to use the data from each row, you need to store it somewhere before the file is exhausted. I think that's what you are trying to achieve with row = zip(getters(row)), but row is both the loop variable and the name you assign zip(getters(row)) to. Each iteration overwrites row, so after the loop only the value from the last row is left and nothing else has been stored.
In order to store your csv data, try this:
data = []
for row in csv.reader(csvfile):
    temp = list(getters(row))
    data.append(temp)
This will store the selected columns of each row in a list called data.
Then, the second error is the one you are asking about, which is row1 not being defined. This happened in your code because the first for loop ran through every row in the csv file. When you then call csv.reader again in the second for loop, it can't read anything, because the first loop already consumed the entire file and the file object does not rewind to the beginning on its own. Therefore row1 never gets assigned, and when you later use it in print(row1) or writer.writerow(row1), row1 doesn't exist.
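The exhaustion described above can be seen in a small sketch. It uses an in-memory io.StringIO with made-up data instead of result.csv (an assumption), and also shows that seek(0) rewinds the file if a second pass is really wanted:

```python
import csv
import io

# Made-up data; io.StringIO behaves like an open text file here.
csvfile = io.StringIO("a,b\n1,2\n3,4\n")

first_pass = list(csv.reader(csvfile))   # consumes every line
second_pass = list(csv.reader(csvfile))  # nothing left to read

csvfile.seek(0)                          # rewind to the start of the file
third_pass = list(csv.reader(csvfile))   # all rows are available again
```

Because the second loop body never runs on an exhausted file, any name that is only assigned inside it stays undefined.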
There are a couple of ways to fix this. You could close the file, reopen it and start from the beginning again. Or you could store both sets of columns at the same time in the first for loop. So like this:
data = []
data1 = []
for row in csv.reader(csvfile):
    temp = list(getters(row))
    data.append(temp)
    temp1 = list(getters1(row))
    data1.append(temp1)
Now you will have 3 columns of data in both data and data1.
Now for writing to the "results1.csv" file. Here you used row as the for loop variable as well as the iterable to run through, which does not work. Also, you call writer.writerow(row) then writer.writerow(row1), which writes them as separate rows rather than as columns of the same row. Try this instead:
with open('results1.csv', "w", newline='') as f:
    writer = csv.writer(f,delimiter=",")
    for row in range(len(data)):
        writer.writerow(list(data[row]) + list(data1[row]))
Now it also looks like you want to add headers for each column with fieldnames = ['AAA','BBB']. Unfortunately, csv.writer has no dedicated header method. One way is to use csv.DictWriter and writer.writeheader() for the header first, then switch to a plain csv.writer for the data; calling writer.writerow(fieldnames) on a plain csv.writer would also work.
with open('results1.csv', "w", newline='') as f:
    fieldnames = ['A','A','A','B','B','B']
    writer = csv.DictWriter(f,delimiter=",", fieldnames=fieldnames)
    writer.writeheader()
    writer = csv.writer(f,delimiter=",")
    for row in range(len(data)):
        writer.writerow(list(data[row]) + list(data1[row]))
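Putting the pieces together, here is a self-contained sketch of the whole read, select and write cycle. The column numbers, the sample data and the in-memory io.StringIO objects are all assumptions chosen to keep the example small; the question's real columns are 15, 21, 27 and 16, 22, 28.

```python
import csv
import io
from operator import itemgetter

# Hypothetical 1-based column numbers, smaller than the question's.
COLS = 1, 3, 5
COLS1 = 2, 4, 6

getters = itemgetter(*(col - 1 for col in COLS))
getters1 = itemgetter(*(col - 1 for col in COLS1))

# Made-up input standing in for result.csv.
source = io.StringIO("a1,b1,a2,b2,a3,b3\nc1,d1,c2,d2,c3,d3\n")

# Single pass over the file: collect both column groups for every row.
data, data1 = [], []
for row in csv.reader(source):
    data.append(list(getters(row)))
    data1.append(list(getters1(row)))

# In-memory output standing in for results1.csv.
out = io.StringIO(newline="")
writer = csv.writer(out)
writer.writerow(["A", "A", "A", "B", "B", "B"])  # the header is just another row
for left, right in zip(data, data1):
    writer.writerow(left + right)

result = out.getvalue()
```

Iterating with zip(data, data1) avoids indexing by position and stops cleanly if the two lists ever differ in length.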
Hope this helps!

Append Data to the end of a row in a csv file (Python)

I am attempting to append 4 elements to the end of a row in a csv file
Original
toetag1,tire11,tire12,tire13,tire14
Desired Outcome
toetag1,tire11,tire12,tire13,tire14,wtire1,wtire2,wtire3,wtire4
I attempted to research ways to do this; however, most searches yielded results such as "how to append to a csv file in python".
Can someone direct me to the correct way to solve this problem?
I advise you to use the pandas module and its read_csv method.
You can use the following code, for instance:
import pandas
data = pandas.read_csv("your_file.csv", header=None)
row = data.iloc[0].tolist()
row.extend(['wtire1','wtire2','wtire3','wtire4'])
You can read the csv file into a list of lists and do the necessary manipulation before writing it back.
import csv
# read csv file
with open("original.csv", newline="") as f:
    reader = csv.reader(f)
    data = [row for row in reader]
# modify data
data[0] = data[0] + ["wtire1", "wtire2", "wtire3", "wtire4"]
# write to new file
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data)
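If the row to extend is identified by its first field rather than by position, the same idea might look like the sketch below. The helper name append_to_row, the second data row and the in-memory input are assumptions made for illustration.

```python
import csv
import io


def append_to_row(rows, key, extra):
    """Return a copy of rows with extra appended to each row whose first field is key."""
    return [row + extra if row and row[0] == key else row for row in rows]


# First line is from the question; the second line is made up.
original = io.StringIO(
    "toetag1,tire11,tire12,tire13,tire14\n"
    "toetag2,tire21,tire22,tire23,tire24\n"
)
rows = list(csv.reader(original))
updated = append_to_row(rows, "toetag1", ["wtire1", "wtire2", "wtire3", "wtire4"])
```

The updated list can then be written out with csv.writer(...).writerows(updated) as in the answer above.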

Updating a specific csv column based on randomname

My code pulls a random name from a csv file. When a button is pressed, I want my code to search through the csv file and update the cell next to the name generated previously in the code.
The variable in which the name is stored is called name.
The index which pulls the random name from the csv file is stored in the variable y.
The function looks like this. I have asked this question previously but had no luck in receiving answers, so I have made edits to the function and hopefully made it clearer.
namelist_file = open('StudentNames&Questions.csv')
reader = csv.reader(namelist_file)
writer = csv.writer(namelist_file)
rownum=0
array=[]
for row in reader:
    if row == name:
        writer.writerow([y], "hello")
Only the first two columns of the csv file are relevant
This is the function which pulls a random name from the csv file.
def NameGenerator():
    namelist_file = open('StudentNames&Questions.csv')
    reader = csv.reader(namelist_file)
    rownum=0
    array=[]
    for row in reader:
        if row[0] != '':
            array.append(row[0])
            rownum=rownum+1
    length = len(array)-1
    i = random.randint(1,length)
    global name
    name = array[i]
    return name
There are a number of issues with your code:
You're trying to have both a reader object and a writer on the same file at the same time. Instead, you should read the file contents in, make any changes necessary and then write the whole file back out at the end.
You need to open the file in write mode in order to actually make changes to the contents. Currently, you don't specify what mode you're using so it defaults to read mode.
row is actually a list representing all the data in the row. Therefore, it can never be equal to the name you're searching for; only its first element, row[0], might be.
The following should work:
import csv
with open('StudentNames&Questions.csv', 'r', newline='') as infile:
    reader = csv.reader(infile)
    data = [row for row in reader]
for row in data:
    if row[0] == name:
        # csv fields are strings; this assumes the second column holds an integer count
        row[1] = str(int(row[1]) + 1)
with open('StudentNames&Questions.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerows(data)
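As a hedged variation on the same read, modify, write pattern, the sketch below sets the neighbouring cell to a fixed value (the question's "hello") instead of incrementing it. The helper name update_cell, the sample names and the temporary file are assumptions, not part of the original code.

```python
import csv
import os
import tempfile


def update_cell(path, name, value):
    """Set the second column to value on every row whose first column is name."""
    with open(path, newline="") as infile:
        data = list(csv.reader(infile))
    for row in data:
        if row and row[0] == name:
            if len(row) < 2:
                row.append(value)   # the row had no second column yet
            else:
                row[1] = value
    with open(path, "w", newline="") as outfile:
        csv.writer(outfile).writerows(data)


# Throwaway file with made-up names, standing in for StudentNames&Questions.csv.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "names.csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([["Name", "Asked"], ["Ada", ""], ["Grace", ""]])
    update_cell(path, "Grace", "hello")
    with open(path, newline="") as f:
        result = list(csv.reader(f))
```

The file is fully read and closed before it is reopened for writing, so the reader and the writer never share a file handle.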

What is the best way to overwrite a specific row in a csv by its index in Python 2.7

I have a python script that appends 4 strings to the end of my csv file. The first column is the user's email address, and I want to search the csv to see if that users email address is already in the file, if it is I want to overwrite that whole row with my 4 new strings, but if not I want to continue to just append it to the end. I have it searching the first column for the email, and if it is there it will give me the row.
with open('Mycsvfile.csv', 'rb') as f:
    reader = csv.reader(f)
    indexLoop = []
    for i, row in enumerate(reader):
        if userEmail in row[0]:
            indexLoop.append(i)
f.close()
with open("Mycsvfile.csv", 'ab') as file222:
    writer = csv.writer(file222, delimiter=',')
    lines = (userEmail, userDate, userPayment, userStatus)
    writer.writerow(lines)
file222.close()
I want to do something like this, if email is in row it will give me the row index and I can use that to overwrite the whole row with my new data. If it isn't there I will just append the file at the bottom.
Example:
with open('Mycsvfile.csv', 'rb') as f:
    reader = csv.reader(f)
    new_rows = []
    indexLoop = []
    for i, row in enumerate(reader):
        if userEmail in row[0]:
            indexLoop.append(i)
            new_row = row + indexLoop(userEmail, userDate, userPayment, userStatus)
            new_rows.append(new_row)
        else:
            print "userEmail doesn't exist"
            #(i'd insert my standard append statement here.
f.close
#now open csv file and writerows(new_row)
For this, you're better off using Pandas rather than the csv module. That way you can read the whole file into memory, modify it, and then write it back to a file.
Be aware, though, that modifying DataFrames in place is slow, so if you have a lot of data to add, you're better off transforming it into a dictionary and back.
import pandas as pd
file_path = r"/Users/tob/email.csv"
columns = ["email", "foo", "bar", "baz"]
df = pd.read_csv(file_path, header=None, names=columns, index_col="email")
data = df.to_dict('index')
for email, foo, bar, baz in information:
    row = {"foo": foo, "bar": bar, "baz": baz}
    data[email] = row
# orient="index" makes each email a row again rather than a column
df = pd.DataFrame.from_dict(data, orient="index")
df.to_csv(file_path, header=False)
Here information is whatever your script returned: an iterable of (email, foo, bar, baz) tuples.
First, you don't need to call close() when using with; Python does it for you.
If you have the index you can do:
with open("myFile.csv", "r+") as f:
    # gives you a list of the lines
    contents = f.readlines()
    # delete the old line and insert the new one (value must end with a newline)
    contents.pop(index)
    contents.insert(index, value)
    # go back to the start so the file is overwritten rather than appended to
    f.seek(0)
    # join all lines and write it back
    contents = "".join(contents)
    f.write(contents)
    # drop any leftover text if the new content is shorter than the old
    f.truncate()
But I would recommend doing all the operations in one function, because it doesn't make a lot of sense to open the file, iterate over its lines, close it, then reopen it and update it.
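One way to fold the search, overwrite and append steps into a single function with only the csv module is sketched below. It is written for Python 3 rather than the question's Python 2.7, matches the email exactly rather than with the question's substring test, and the helper name upsert_row and the sample values are assumptions.

```python
import csv
import os
import tempfile


def upsert_row(path, new_row):
    """Replace the row whose first column equals new_row[0], or append new_row."""
    rows = []
    if os.path.exists(path):
        with open(path, newline="") as f:
            rows = list(csv.reader(f))
    for i, row in enumerate(rows):
        if row and row[0] == new_row[0]:
            rows[i] = list(new_row)      # overwrite the whole existing row
            break
    else:
        rows.append(list(new_row))       # no match found: append at the bottom
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


# Made-up data in a temporary directory, standing in for Mycsvfile.csv.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Mycsvfile.csv")
    upsert_row(path, ("a@example.com", "2020-01-01", "10", "paid"))
    upsert_row(path, ("b@example.com", "2020-01-02", "20", "due"))
    upsert_row(path, ("a@example.com", "2020-02-01", "15", "paid"))
    with open(path, newline="") as f:
        result = list(csv.reader(f))
```

This rewrites the whole file on every call, which is simple and safe for small files; for large files a database or a keyed store may be a better fit.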

Writing a filtered CSV file to a new file and iterating through a folder

I have been trying initially to create a program to go through one file and select certain columns that will then be moved to a new text file. So far I have
import os, sys, csv
os.chdir("C://Users//nelsonj//Desktop//Master_Project")
with open('CHS_2009_test.txt', "rb") as sitefile:
    reader = csv.reader(sitefile, delimiter=',')
    pref_cols = [0,1,2,4,6,8,10,12,14,18,20,22,24,26,30,34,36,40]
    for row in reader:
        new_cols = list(row[i] for i in pref_cols)
        print new_cols
I have been trying to use the csv functions to write the new file but I am continuously getting errors. I will eventually need to do this over a folder of files, but thought I would try to do it on one before tackling that.
Code I attempted to use to write this data to a new file
for row in reader:
    with open("CHS_2009_edit.txt", 'w') as file:
        new_cols = list(row[i] for i in pref_cols)
        newfile = csv.writer(file)
        newfile.writerows(new_cols)
This kind of works in that I get a new file, but it only prints the second row of values from my csv (i.e., not the header values) and places commas between each individual character, rather than just copying over the original columns as they were.
I am using PythonWin with Python 2.6(from ArcGIS)
Thanks for the help!
NEW UPDATED CODE
import os, sys, csv
path = ('C://Users//nelsonj//Desktop//Master_Project')
for filename in os.listdir(path):
    pref_cols = [0,1,2,4,6,8,10,12,14,18,20,22,24,26,30,34,36,40]
    with open(filename, "rb") as sitefile:
        with open(filename.rsplit('.',1)[0] + "_Master.txt", 'w') as output_file:
            reader = csv.reader(sitefile, delimiter=',')
            writer = csv.writer(output_file)
            for row in reader:
                new_row = list(row[i] for i in pref_cols)
                writer.writerow(new_row)
                print new_row
I am getting a "list index out of range" error for new_row, but it still seems to be processing the file. The only thing I can't get it to do now is loop through all the files in my directory.
Try this:
new_header = list(row[i] for i in pref_cols if i < len(row))
That should avoid the error by skipping any index that a short row does not have, but it may not fix the underlying problem. Would you paste your CSV file somewhere that I can access, and I'll fix this for you?
For your purpose of filtering, you don't have to treat the header differently from the rest of the data. You can go ahead and remove the following block:
headers = reader.next()
for row in headers:
    new_header = list(row[i] for i in pref_cols)
    print new_header
Your code did not work because you treated headers as a list of rows, but headers is just one row.
Update
This update deals with writing the CSV data to a new file. You should move the open statement above the for row... loop, so the output file is opened once instead of being overwritten on every row:
with open("CHS_2009_edit.txt", 'w') as output_file:
    writer = csv.writer(output_file)
    for row in reader:
        new_cols = list(row[i] for i in pref_cols)
        writer.writerow(new_cols)
Update 2
This update deals with the header output problem. If you followed my suggestions, you should not have this problem. I don't know what your current code looks like, but it looks like you supplied a string where the code expects a list. Here is the code that I tried on my system (using my made-up data) and it seems to work:
pref_cols = [...] # <<=== Should be set before entering the loop
with open('CHS_2009_test.txt', "rb") as sitefile:
    with open('CHS_2009_edit.txt', 'w') as output_file:
        reader = csv.reader(sitefile, delimiter=',')
        writer = csv.writer(output_file)
        for row in reader:
            new_row = list(row[i] for i in pref_cols)
            writer.writerow(new_row)
One thing to notice: I use writerow() to write a single row, where you use writerows() -- that makes a difference.
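For the remaining part of the question, looping over every file in a folder, a Python 3 sketch might look like this. The function name filter_folder, the "_Master" suffix handling and the sample file are assumptions; full paths are built with pathlib because os.listdir returns bare file names that open() cannot find unless the script runs inside that folder.

```python
import csv
import tempfile
from pathlib import Path


def filter_folder(folder, pref_cols, suffix="_Master"):
    """Write a column-filtered copy of every .txt file in folder."""
    written = []
    # sorted() builds the file list up front, so new output files are not re-read.
    for src in sorted(Path(folder).glob("*.txt")):
        if src.stem.endswith(suffix):
            continue  # skip output files left over from an earlier run
        dst = src.with_name(src.stem + suffix + src.suffix)
        with open(src, newline="") as infile, open(dst, "w", newline="") as outfile:
            writer = csv.writer(outfile)
            for row in csv.reader(infile):
                # Skip indexes the row does not have instead of raising IndexError.
                writer.writerow([row[i] for i in pref_cols if i < len(row)])
        written.append(dst)
    return written


# Made-up input file in a temporary folder; the last line is deliberately short.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "site1.txt").write_text("a,b,c,d\n1,2,3,4\n5,6\n")
    outputs = filter_folder(tmp, pref_cols=[0, 2])
    names = [p.name for p in outputs]
    with open(outputs[0], newline="") as f:
        rows = list(csv.reader(f))
```

Silently dropping missing columns hides malformed rows, so logging or counting the short rows may be worth adding when the data matters.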
