Adding data to a CSV file through Python

I've got a function that adds new data to a csv file. I've got it somewhat working. However, I'm having a few problems.
When I add the new values (name, phone, address, birthday), it adds them all in one column, rather than separate columns in the same row. (Not really much idea on how to split them up in various columns...)
I can only add numbers rather than string values. So if I write add_friend(blah, 31, 12, 45), it will come back saying blah is not defined. However, if I write add_friend(3,4,5,6), it'll add that to the new row—but, into a single column
An objective with the function is: If you try and add a friend that's already in the csv (say, Bob), and his address, phone, birthday are already in the csv, if you add_friend(Bob, address, phone, birthday), it should state False, and not add it. However, I have no clue how to do this. Any ideas?
Here is my code:
def add_friend (name, phone, address, birthday):
    with open('friends.csv', 'ab') as f:
        newrow = [name, phone, address, birthday]
        friendwriter = csv.writer(open('friends.csv', 'ab'), delimiter=' ',
                                  quotechar='|', quoting=csv.QUOTE_MINIMAL)
        friendwriter.writerow(newrow)
        #friendreader = csv.reader(open('friends.csv', 'rb'), delimiter=' ', quotechar='|')
        #for row in friendreader:
        #print ' '.join(row)
        print newrow

Based on your requirements, and what you appear to be trying to do, I've written the following. It should be verbose enough to be understandable.
You need to be consistent with your delimiters and other properties when reading the CSV files.
Also, try and move "friends.csv" to a global, or at least in some non-hard coded constant.
import csv
def print_friends():
    reader = csv.reader(open("friends.csv", "rb"), delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in reader:
        print row
def friend_exists(friend):
    reader = csv.reader(open("friends.csv", "rb"), delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in reader:
        if (row == friend):
            return True
    return False
def add_friend(name, phone, address, birthday):
    friend = [name, phone, address, birthday]
    if friend_exists(friend):
        return False
    writer = csv.writer(open("friends.csv", "ab"), delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(friend)
    return True
print "print_friends: "
print_friends()
print "get_friend: "
test_friend = ["barney", "4321 9876", "New York", "2000"]
print friend_exists(test_friend)
print "add_friend: "
print add_friend("barney", "4321 9876", "New York", "2000")
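For readers on Python 3, here is a rough sketch of the same idea. It is an adaptation, not the answer's original code: the FRIENDS_CSV constant, the CSV_OPTIONS dictionary and the temporary directory are assumptions made for illustration, and it switches to the default comma delimiter so that a spreadsheet shows one value per column.

```python
import csv
import os
import tempfile

# Hypothetical location for the demo; the original hard-codes "friends.csv".
FRIENDS_CSV = os.path.join(tempfile.mkdtemp(), "friends.csv")

# One place for the dialect so the reader and the writer always agree.
CSV_OPTIONS = {"delimiter": ",", "quotechar": '"', "quoting": csv.QUOTE_MINIMAL}


def friend_exists(friend, path=FRIENDS_CSV):
    """Return True if an identical row is already in the file."""
    if not os.path.exists(path):
        return False
    with open(path, newline="") as f:
        return any(row == friend for row in csv.reader(f, **CSV_OPTIONS))


def add_friend(name, phone, address, birthday, path=FRIENDS_CSV):
    """Append the friend unless the same row exists; return True if added."""
    # csv.reader yields strings, so compare strings with strings.
    friend = [str(name), str(phone), str(address), str(birthday)]
    if friend_exists(friend, path):
        return False
    with open(path, "a", newline="") as f:
        csv.writer(f, **CSV_OPTIONS).writerow(friend)
    return True


first = add_friend("barney", "4321 9876", "New York", "2000")
second = add_friend("barney", "4321 9876", "New York", "2000")
print(first, second)
```

The second call returns False because the row written by the first call is found when the file is read back.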

It doesn't do that. What makes you think that's what it does? It's possible that the quoting scheme you really want isn't the one you specified to csv.writer: i.e., spaces delimit columns, and | is the quoting character.
blah is not a string literal, "blah" is. blah without quotes is a variable reference, and the variable didn't exist here.
In order to check whether a name is already in the CSV file, you have to read the whole CSV file first, checking for the details. Open the file twice: first for reading ('r'), and use csv.reader to turn that into a row-iterator and find all the names. You can add those names to a set, and then check with that set forever after.
re #3:
To get this set, you could define a function as so:
def get_people():
    with open(..., 'r') as f:
        return set(map(tuple, csv.reader(f)))
And then if you assigned the set somewhere, e.g. existing_people = get_people()
you could then check against it when adding new people, as follows:
newrow = (name, phone, address, birthday)
if newrow in existing_people:
    return False
else:
    existing_people.add(newrow)
    friendwriter.writerow(newrow)
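Put together, the set-based check might look something like the sketch below. It is written for Python 3; the add_friends helper, the path argument and the sample rows are made up for illustration and are not part of the original answer.

```python
import csv
import os
import tempfile


def get_people(path):
    """Read every row once and return the rows as a set of tuples."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="") as f:
        return set(map(tuple, csv.reader(f)))


def add_friends(path, new_rows):
    """Append only unseen rows; return one True/False per input row."""
    existing_people = get_people(path)  # the file is read a single time
    results = []
    with open(path, "a", newline="") as f:
        friendwriter = csv.writer(f)
        for row in new_rows:
            # csv.reader returns strings, so normalise before comparing
            newrow = tuple(str(value) for value in row)
            if newrow in existing_people:
                results.append(False)
            else:
                existing_people.add(newrow)
                friendwriter.writerow(newrow)
                results.append(True)
    return results


demo_path = os.path.join(tempfile.mkdtemp(), "friends.csv")  # hypothetical path
batch = [("Bob", "555", "1 Main St", "1990"), ("Bob", "555", "1 Main St", "1990")]
results = add_friends(demo_path, batch)
print(results)
```

Because the set is updated as rows are written, a duplicate inside the same batch is also caught without re-reading the file.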

You aren't stating how experienced with Python you already are, so I am aiming this a little low; no offence intended.
There are several "requirements" for your homework. In general, you should ensure that one function does one thing. So, to meet all your requirements, you’ll need several functions; look at creating at least one module (i.e., a file with functions in it).
A space delimiter and a | for quotes is pretty unusual. For the current file, what is the delimiter between columns? And what is used to quote/escape text? (By "escaping text", I mean: if I have a csv file that uses commas as the column delimiter, and I want to put a sentence with commas into just one column, I need to tell the difference between a comma that means "new column" and a comma that is part of a sentence in a column. Microsoft decided that Excel would support double quotes, so "hello, sailor" became a de facto standard.)
If you want to know if "bob brown" is already in the file, you will need to read the whole file first before trying to insert. You can do this by opening the file with 'r', then again with 'a' to append ('w' would overwrite the existing rows). But should you read the whole file every time you want to insert one record? What if you have a hundred records to add? Should you read the whole file each time? Is there a way to store the names during the adding process?
blah is not a string. It needs to be quoted to be a string literal ("blah"). blah just refers to a variable whose name is blah. If it says blah is not defined, that’s because you have not declared the variable blah to hold anything.
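To make the quoting point concrete, here is a small Python 3 sketch using the csv module's defaults (comma delimiter, double-quote quotechar). The in-memory buffer and the sample values are stand-ins chosen for illustration.

```python
import csv
import io

buffer = io.StringIO()  # stand-in for a real file
writer = csv.writer(buffer)  # defaults: delimiter=',' and quotechar='"'
writer.writerow(["bob brown", "hello, sailor", "42"])
written = buffer.getvalue()
print(repr(written))

# Reading it back with the same defaults restores three separate fields.
row = next(csv.reader(io.StringIO(written)))
print(row)
```

Only the field that contains a comma gets wrapped in quotes, which is what QUOTE_MINIMAL (the default) means.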

Related

Row in Excel to array?

I have lots of data in an Excel spreadsheet that I need to import using Python. i need each row to be read as an array so I can call on the first data point in a specified row, the second, the third, and so on.
This is my code so far:
from array import *
import csv
with open ('vals.csv', 'rb') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    reader_x = []
    reader_y = []
    reader_z = []
    row = next(reader)
    reader_x.append(row)
    row = next(reader)
    reader_y.append(row)
    row = next(reader)
    reader_z.append(row)
    print reader_x
    print reader_y
    print reader_z
    print reader_x[0]
It is definitely storing it as an array I think. But I think it is storing the entire row of Excel as a string instead of each block being a separate data point, because when I tell Python to print an entire array it looks something like this (a shortened version because there's like a thousand in each row):
[['13,14,12']]
And when I tell it to print reader_x[0] (or any of the other two for that matter) it looks like this:
['13,14,12']
But when I tell it to print anything beyond the 0th thing in the array, it just gives me an error because it's out of range.
How can I fix this? How can I make it [13,14,12] instead of ['13,14,12'] so I can actually use these numbers in calculation? (I want to avoid downloading any more libraries if I can because this is for a school thing and I need to avoid that if possible.)
I have been stuck on this for several days and nothing I can find has worked for me and half of it I didn't even understand. Please try to explain simply if you can, as if you're talking to someone who doesn't even know how to print "Hello World".
You can use split to do this and use , as a separator.
For example:
row = '11,12,13'
row = row.split(',')
It is a CSV (comma-separated values) file, so try setting the delimiter to ','.
You don't need from array import * ... What the rest of the world calls an array is called a list in Python. The Python array is rather specialised and you are not actually using it so just delete that line of code.
As others have pointed out, you need incoming lines to be split. The csv default delimiter is a comma. Just let csv.reader do the job, something like this:
reader = csv.reader(csvfile)
data = [map(int, row) for row in reader]
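In Python 3, map returns an iterator rather than a list, so a nested list comprehension is a safer way to write the same thing. A minimal sketch, with an in-memory buffer and made-up values standing in for vals.csv:

```python
import csv
import io

# Stand-in for open('vals.csv', newline=''); the sample values are made up.
csvfile = io.StringIO("13,14,12\n1,2,3\n7,8,9\n")

reader = csv.reader(csvfile)  # the default delimiter is ','
data = [[int(value) for value in row] for row in reader]

# Each row is now a plain list of integers that can be indexed.
reader_x, reader_y, reader_z = data[0], data[1], data[2]
print(reader_x)
print(reader_x[1] + reader_y[1])
```

If the file holds decimal numbers, float(value) would be the conversion to use instead of int(value).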

How to replace substrings like NAME in a string template?

I've got a string containing substrings I'd like to replace, e.g.
text = "Dear NAME, it was nice to meet you on DATE. Hope to talk with you and SPOUSE again soon!"
I've got a csv of the format (first row is a header)
NAME, DATE, SPOUSE
John, October 1, Jane
Jane, September 30, John
...
I'm trying to loop through each row in the csv file, replacing substrings in text with the csv element from the column with header row matching the original substring. I've got a list called matchedfields which contains all the fields that are found in the csv header row and text (in case there are some columns in the csv I don't need to use). My next step is to iterate through each csv row and replace the matched fields with the element from that csv column. To accomplish this, I'm using
with open('recipients.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        for match in matchedfields:
            print inputtext.replace(match, row[match])
My problem is that this only replaces the first matched substring in text with the appropriate element from the csv. Is there a way to make multiple replacements simultaneously so I end up with
"Dear John, it was nice to meet you on October 1. Hope to talk with you and Jane again soon!"
"Dear Jane, it was nice to meet you on September 30. Hope to talk with you and John again soon!"
I think the real way to go here is to use string templates. Makes your life easy.
Here is a general solution that works under Python2 and 3:
import string
template_text = string.Template(("Dear ${NAME}, "
                                 "it was nice to meet you on ${DATE}. "
                                 "Hope to talk with you and ${SPOUSE} again soon!"))
And then
import csv
with open('recipients.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(template_text.safe_substitute(row))
Now, I noticed that your csv is kind of messed up with whitespaces, so you'll have to take care of that first (or adapt the call to either the csv reader or the template).
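One way to adapt the reader call for that whitespace is the csv module's skipinitialspace option, which DictReader passes through to the underlying reader. A minimal Python 3 sketch, with an in-memory buffer standing in for recipients.csv:

```python
import csv
import io
import string

template_text = string.Template(
    "Dear ${NAME}, it was nice to meet you on ${DATE}. "
    "Hope to talk with you and ${SPOUSE} again soon!"
)

# Stand-in for open('recipients.csv', newline=''), using the rows from the question.
csvfile = io.StringIO(
    "NAME, DATE, SPOUSE\n"
    "John, October 1, Jane\n"
    "Jane, September 30, John\n"
)

# skipinitialspace=True drops the blank that follows each comma,
# so the header keys are 'DATE' and 'SPOUSE' rather than ' DATE' and ' SPOUSE'.
reader = csv.DictReader(csvfile, skipinitialspace=True)
letters = [template_text.safe_substitute(row) for row in reader]
for letter in letters:
    print(letter)
```

Without that option the placeholders for DATE and SPOUSE would be left unreplaced, because safe_substitute silently skips keys it cannot find.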
The problem is that inputtext.replace(match, row[match]) doesn't change the inputtext variable, it just makes a new string that you aren't storing. Try this:
import copy
with open('recipients.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        inputtext_copy = copy.copy(inputtext) ## make a new copy of the input_text to be changed.
        for match in matchedfields:
            inputtext_copy = inputtext_copy.replace(match, row[match]) ## this saves the text with the right
        print inputtext ## should be the original w/ generic field names
        print inputtext_copy ## should be the new version with all fields changed to specific instantiations
You should reassign the replaced string to the original name so the previous replacements are not thrown away:
with open('recipients.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        inputtext = text
        for match in matchedfields:
            inputtext = inputtext.replace(match, row[match])
        print inputtext
On another note, you could update the original string using string formatting with a little modification to the string like so:
text = "Dear {0[NAME]}, it was nice to meet you on {0[DATE]}. Hope to talk with you and {0[SPOUSE]} again soon!"
with open('recipients.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        inputtext = text.format(row)
        print inputtext
That formats the string with the dictionary in one step, without having to make replacements iteratively.

Reading and Overwriting .CSV file data

I am attempting to produce a read and write sequence for a Python program. I am fairly new to Python and, as a result, do not have much knowledge of the actual code.
I am struggling with reading data from a .CSV file. This file contains 3 columns, and the number of rows depends on how many users use the program I have created. Now, I know how to locate rows, but the problem is that it returns the entire row, with all three columns of data within it. So how do I isolate pieces of data? And subsequently, how do I turn these pieces of data into variables which can be read, written or overwritten?
Please bear in mind that the context of the program is the Computing A453 coursework. Also remember I am not asking you to do my work for me; I have already completed the other 2 tasks, and all the logic and planning for task 3. It's just that I only have 2 weeks left until I have to hand this coursework in, and trying to work out the code that can read and overwrite data is extremely hard for a beginner like me.
with open('results.csv', 'a', newline='') as fp:
    g = csv.writer(fp, delimiter=',')
    data = [[name, score, classroom]]
    g.writerows(data)
fp.close()
# Do the reading
result = open('results.csv', 'r')
reader = csv.reader(result)
new_rows_list = []
for row in reader:
    if row[2] == name:
        if row[2] < score:
            result.close()
            file2 = open('results.csv', 'wb')
            writer = csv.writer(file2)
            new_row = [row[2], row[2], name, score, classnumber]
            new_rows_list.append(new_row)
            file2.close()
At the moment, this code reads the file, but not in the way I want it to. I want it to isolate the "name" of the user on record (within the .csv file). Instead of doing so, it reads the entire row as a whole, which I do not know how to isolate down to just the name of the user.
Here is the data in the .CSV file:
Name Score Class number
jor 0 2
jor 0 2
jor 1 2
I'm assuming what you're getting looks like this:
Jacob, 14, Class B, Number 3
And that that is a string.
If that is the case, str.split() is your answer.
str.split() takes a separator string as an argument, in your case a comma (Comma Separated Values), and returns a list of everything in between every instance of that separator in the string.
From there, if you want to use the results as data in your program, you should convert the values to the datatype you want (like float(x) or int(x)).
Hope this helped
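If the file is read with csv.reader, each row already arrives as a list, so the columns can be unpacked into separate variables without calling split by hand. The sketch below (Python 3) reads everything into memory, changes one score and writes the file back. The update_score helper, the temporary path and the sample rows are assumptions made for illustration, as is the rule of keeping the higher score.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "results.csv")  # hypothetical location

# Create a small file shaped like the one in the question.
with open(path, "w", newline="") as fp:
    csv.writer(fp).writerows([["Name", "Score", "Class number"],
                              ["jor", "0", "2"],
                              ["amy", "5", "1"]])


def update_score(path, name, score):
    """Keep the higher score for `name`; return True if a row changed."""
    with open(path, newline="") as fp:
        rows = list(csv.reader(fp))
    header, records = rows[0], rows[1:]
    changed = False
    for record in records:
        row_name, row_score, _class_number = record  # one variable per column
        if row_name == name and int(row_score) < score:
            record[1] = str(score)
            changed = True
    if changed:
        # Rewrite the whole file only after reading has finished.
        with open(path, "w", newline="") as fp:
            csv.writer(fp).writerows([header] + records)
    return changed


changed = update_score(path, "jor", 3)
with open(path, newline="") as fp:
    final_rows = list(csv.reader(fp))
print(changed, final_rows)
```

Reading first and writing afterwards avoids opening the same file for writing while it is still being iterated, which is what the loop in the question ends up doing.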

Writing multiple values in single cell in csv

For each user I have the list of events in which he participated.
e.g. bob : [event1,event2,...]
I want to write it in csv file. I created a dictionary (key - user & value - list of events)
I wrote it in csv. The following is the sample output
username, frnds
"abc" ['event1','event2']
where username is first col and frnds 2nd col
This is code
writer = csv.writer(open('eventlist.csv', 'ab'))
for key, value in evnt_list.items():
    writer.writerow([key, value])
when I am reading the csv I am not getting the list directly. But I am getting it in following way
['e','v','e','n','t','1','','...]
I also tried to write the list directly in csv but while reading am getting the same output.
What I want is multiple values in a single cell so that when I read a column for a row I get list of all events.
e.g
colA colB
user1,event1,event2,...
I think it's not difficult but somehow I am not getting it.
###Reading
I am reading it with the help of following
reader = csv.reader(open("eventlist.csv"))
reader.next()
for row in reader:
    tmp=row[1]
    print tmp # it is printing the whole list but
    print tmp[0] #the output is [
    print tmp[1] #output is 'e' it should have been 'event1'
    print tmp[2] #output is 'v' it should have been 'event2'
print tmp[2] #output is 'v' it should have been 'event2'
You have to format your values into a single string:
with open('eventlist.csv', 'ab') as f:
    writer = csv.writer(f, delimiter=' ')
    for key, value in evnt_list.items():
        writer.writerow([key, ','.join(value)])
exports as
key1 val11,val12,val13
key2 val21,val22,val23
READING: Here you have to keep in mind, that you converted your Python list into a formatted string. Therefore you cannot use standard csv tools to read it:
with open("eventlist.csv") as f:
    csvr = csv.reader(f, delimiter=' ')
    csvr.next()
    for rec in csvr:
        key, values_txt = rec
        values = values_txt.split(',')
        print key, values
works as expected.
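The same join-then-split round trip also seems to work with the default comma delimiter, because csv.writer quotes any cell that contains a comma. A small Python 3 sketch, assuming that event names never contain commas themselves and that no user has an empty list; the sample dictionary and the in-memory buffer are made up for illustration.

```python
import csv
import io

evnt_list = {"abc": ["event1", "event2"], "xyz": ["event3"]}  # sample data

buffer = io.StringIO()  # stand-in for a real file
writer = csv.writer(buffer)
writer.writerow(["username", "events"])
for key, value in evnt_list.items():
    # The joined cell is quoted automatically, e.g. "event1,event2"
    writer.writerow([key, ",".join(value)])

buffer.seek(0)
reader = csv.reader(buffer)
next(reader)  # skip the header row
restored = {key: values_txt.split(",") for key, values_txt in reader}
print(restored)
```

Each user still occupies exactly two columns, so the file stays readable in a spreadsheet.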
You seem to be saying that your evnt_list is a dictionary whose keys are strings and whose values are lists of strings. If so, then the CSV-writing code you've given in your question will write a string representation of a Python list into the second column. When you read anything in from CSV, it will just be a string, so once again you'll have a string representation of your list. For example, if you have a cell that contains "['event1', 'event2']" you will be reading in a string whose first character (at position 0) is [, second character is ', third character is e, etc. (I don't think your tmp[1] is right; I think it is really ', not e.)
It sounds like you want to reconstruct the Python object, in this case a list of strings. To do that, use ast.literal_eval:
import ast
cell_string_value = "['event1', 'event2']"
cell_object = ast.literal_eval(cell_string_value)
Incidentally, the reason to use ast.literal_eval instead of just eval is safety. eval allows arbitrary Python expressions and is thus a security risk.
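A short sketch of that difference, in Python 3: a list literal is parsed back into a real list, while a string containing a function call is rejected instead of being executed. The second string is just an illustrative example of something that is not a literal.

```python
import ast

cell_string_value = "['event1', 'event2']"
cell_object = ast.literal_eval(cell_string_value)
print(cell_object[0])  # the first element of the list, not the character '['

try:
    # A function call is not a literal, so literal_eval refuses to evaluate it.
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
print(rejected)
```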
Also, what is the purpose of the CSV, if you want to get the list back as a list? Will people be reading it (in Excel or something)? If not, then you may want to simply save the evnt_list object using pickle or json, and not bother with the CSV at all.
Edit: I should have read more carefully; the data from evnt_list is being appended to the CSV, and neither pickle nor json is easily appendable. So I suppose CSV is a reasonable and lightweight way to accumulate the data. A full-blown database might be better, but that would not be as lightweight.

Searching through data in txt files

I'm teaching myself python and wanted to learn how to search through text files. For example, i've got a long list of full names and addresses, and want to be able to type in a first name and then print the details corresponding to that name. What would be the best way to go about this? Thanks!
The data I have is in a .txt file in columns like this:
Doe, John London
Doe, Jane Paris
If you've designed the data format, fixed-width columns aren't a very good one. But if you're stuck with them, they're easy to deal with.
First, you want to parse your data:
addressbook = []
with open('addressbook.txt', 'r') as f:
    for line in f:
        name, city = line[:17], line[17:]
        last, first = name.split(',')
        addressbook.append((first, last, city))
But now, you want to be able to search by first name. You can do that, but it might be slow for a huge addressbook, and the code won't be very direct:
def printDetails(addressbook, firstname):
    for (first, last, city) in addressbook:
        if first == firstname:
            print first, last, city
What if, instead of just a list of tuples, we used a dictionary, mapping first names to the other field?
addressbook = {}
with open('addressbook.txt', 'r') as f:
    for line in f:
        name, city = line[:17], line[17:]
        last, first = name.split(',')
        addressbook[first]=((last, city))
But that's no good—each new "John" will erase any previous "John". So what we really want is a dictionary, mapping first names to lists of tuples:
import collections
addressbook = collections.defaultdict(list)
with open('addressbook.txt', 'r') as f:
    for line in f:
        name, city = line[:17], line[17:]
        last, first = name.split(',')
        addressbook[first].append((last, city))
Now, if I want to see the details for that first name:
def printDetails(addressbook, firstname):
    for (last, city) in addressbook[firstname]:
        print firstname, last, city
Whichever way you go, there are a few obvious places to improve this. For example, you may notice that some of the fields have extra spaces at the start or end. How would you get rid of those? If you call printDetails on "Joe" and there is no "Joe", you get nothing at all; maybe a nice error message would be better. But once you've got the basics working, you can always add more later.
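As one possible answer to those two questions, the Python 3 sketch below strips each field with str.strip() and raises a readable error for an unknown name. It assumes the city is always the last whitespace-separated word on the line, which may not hold for a real address book; the find_details helper and the sample lines are made up for illustration.

```python
import collections
import io

# Stand-in for open('addressbook.txt'); the column spacing here is invented.
f = io.StringIO("Doe, John      London\nDoe, Jane      Paris\n")

addressbook = collections.defaultdict(list)
for line in f:
    if not line.strip():
        continue  # ignore blank lines
    name, city = line.rsplit(None, 1)  # split off the last word as the city
    last, first = name.split(",")
    # strip() removes the padding spaces and the trailing newline
    addressbook[first.strip()].append((last.strip(), city.strip()))


def find_details(addressbook, firstname):
    """Return formatted matches, or raise KeyError with a readable message."""
    if firstname not in addressbook:  # 'in' avoids creating an empty entry
        raise KeyError("No entry for first name %r" % firstname)
    return ["%s %s, %s" % (firstname, last, city)
            for last, city in addressbook[firstname]]


print(find_details(addressbook, "Jane"))
```

Checking with `in` before indexing matters here, because indexing a defaultdict with a missing key would quietly add that key with an empty list.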
I would make judicious use of the split command. It depends on how your file is delimited, of course, but your example shows that the characters splitting the data fields are spaces.
For each line in the file, do something like this:
last, first, city = [data.strip(',') for data in line.split()]
And then run your comparison based on those attributes.
Obviously, this will break if your data fields have spaces in them, so ensure that's not the case before you take a simple approach like this.
To read a text-file in python, you do something like this:
f = open('yourtextfile.txt')
for line in f:
    # The for-loop will loop through the whole file line by line.
    # Now you can do what you want with the line; in your example,
    # you want to extract the first and last name and the city.
    pass
You could do something as simple as this:
name = raw_input('Type in a first name: ') # name to search for
with open('x.txt', 'r') as f: # 'r' means we only intend to read
    for s in f:
        if s.split()[1] == name: # s.split()[1] will return the first name
            print s
            break # end the loop once we've found a match
    else:
        print 'Name not found.' # this will be executed if no match is found
Type in a first name: Jane
Doe, Jane Paris
Relevant documentation
Reading and Writing Files
open
