Problem in reading list of tuples into Python - python

I have my data in my csv file like this:
He has a dog,Allan
She has a cat,Sheena
I want to read it into a list of tuples in Python like this:
[('He has a dog','Allan'),('She has a cat','Sheena')]
My code is :
import csv

pos = []
with open(r'C:\Python27\listx.csv', 'r') as csvfile:
    dreader = csv.reader(csvfile)
    for row in dreader:
        pos.append(tuple(row))
The output is :
[('He has a dog,Allan',), ('She has a cat,Sheena',)]
As you can see there are two problems:
1. The first entry has to be separated from the second by quotes. It has to be 'He has a dog','Allan' (there is a ' missing after dog and a ' missing before Allan).
2. There is an unwanted comma after the last character in each tuple.
How do I remove these? I would appreciate help; I have been stuck on this for a long time!

If your task is simply to split each line on commas, you can forgo the csv reader and try this:
with open("data.csv", "r") as file:
    pos = [tuple(s.strip() for s in line.strip().split(",")) for line in file]
Note the strip() on each line and on each item, which gets rid of leading/trailing whitespace (including the newline).
Also, a tuple with a single item is always displayed with a trailing comma, such as ("Hello",).
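As a minimal sketch of both points (written for Python 3, with io.StringIO standing in for the real file; the file contents are an assumption, since the raw file is not shown): when the fields are plain comma-separated text, csv.reader already yields two items per row. One possible cause of the output in the question is that each whole line is wrapped in double quotes, which makes the reader treat it as a single field.

```python
import csv
import io

# Stand-in for the csv file as shown in the question (assumed contents).
plain = io.StringIO("He has a dog,Allan\nShe has a cat,Sheena\n")
pos = [tuple(row) for row in csv.reader(plain)]

# Hypothetical variant: each whole line is wrapped in double quotes,
# so the comma inside the quotes is data, not a separator.
quoted = io.StringIO('"He has a dog,Allan"\n"She has a cat,Sheena"\n')
pos_quoted = [tuple(row) for row in csv.reader(quoted)]

# A one-element tuple is always displayed with a trailing comma.
single = ("Hello",)

print(pos)
print(pos_quoted)
print(repr(single))
```

If the second case matches the real file, the trailing comma in the output is not a stray character; it only marks a tuple that holds one item.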

try this:
csv.reader(csvfile, delimiter=',')

Related

Python CSV writer keeps adding unnecessary quotes

I'm trying to write to a CSV file with output that looks like this:
14897,40.50891,-81.03926,168.19999
but the CSV writer keeps writing the output with quotes at beginning and end
'14897,40.50891,-81.03926,168.19999'
When I print the line normally, the output is correct but I need to do line.split() or else the csv writer puts output as 1,4,8,9,7 etc...
But when I do line.split() the output is then
['14897,40.50891,-81.03926,168.19999']
Which is written as '14897,40.50891,-81.03926,168.19999'
How do I make the quotes go away? I already tried csv.QUOTE_NONE but it doesn't work.
with open(results_csv, 'wb') as out_file:
    writer = csv.writer(out_file, delimiter=',')
    writer.writerow(["time", "lat", "lon", "alt"])
    for f in file_directory:
        for line in open(f):
            print line
            line = line.split()
            writer.writerow(line)
With line.split(), you're not splitting on commas but on whitespace (spaces, linefeeds, tabs). Since the line contains none apart from the trailing newline, you end up with only one item per row.
Since this item contains commas, the csv module has to quote it to distinguish those commas from the actual separator (which is also a comma). You would need line.strip().split(",") for it to work, but...
using csv to read your data would be a better way to fix this:
Replace this:
for line in open(some_file):
    print line
    line = line.split()
    writer.writerow(line)
with:
with open(some_file) as f:
    cr = csv.reader(f)  # default separator is comma already
    writer.writerows(cr)
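The quoting behaviour described in this answer can be reproduced with a small sketch. It is written for Python 3 and writes into an in-memory buffer; the helper name write_row is made up for illustration.

```python
import csv
import io

line = "14897,40.50891,-81.03926,168.19999\n"


def write_row(fields):
    """Write one row with csv.writer and return the text it produced."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue()


# split() with no argument splits on whitespace: one item containing commas.
whitespace_split = line.split()
# Splitting on the comma gives the four separate fields.
comma_split = line.strip().split(",")

quoted = write_row(whitespace_split)
plain = write_row(comma_split)

print(quoted)
print(plain)
```

The single item has to be quoted because it contains the delimiter; the four separate fields need no quoting.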
You don't need to read the file manually. You can simply use the csv reader.
Replace the inner for loop with:
# with ensures that the file handle is closed after the code inside the block has run
with open(some_file) as file:
    row = csv.reader(file)  # read rows
    writer.writerows(row)  # write multiple rows at once

Python: How to capitalize the first column of a .txt file.

I have a .csv formatted .txt file. I am deliberating over the best manner in which to .capitalize the text in the first column.
.capitalize() is a string method, so I considered the following: I would need to open the file, convert the data to a list of strings, capitalize the required word and finally write the data back to the file.
To achieve this, I did the following:
newGuestList = []
with open("guestList.txt", "r+") as guestFile:
    guestList = csv.reader(guestFile)
    for guest in guestList:
        for guestInfo in guest:
            capitalisedName = guestInfo.capitalize()
            newGuestList.append(capitalisedName)
Which gives the output:
['Peter', '35', ' spain', 'Caroline', '37', 'france', 'Claire', '32', ' sweden']
The problem:
Firstly; in order to write this new list back to file, I will need to convert it to a string. I can achieve this using the .join method. However, how can I introduce a newline, \n, after every third word (the country) so that each guest has their own line in the text file?
Secondly; this method, of nested for loops etc. seems highly convoluted, is there a cleaner way?
My .txt file:
peter, 35, spain\n
caroline, 37, france\n
claire, 32, sweden\n
You don't need to split the lines, since the first character of the first word is the first character of the line:
with open("lst.txt", "r") as guestFile:
    lines = guestFile.readlines()
newlines = [line.capitalize() for line in lines]
with open("lst.txt", "w") as guestFile:
    guestFile.writelines(newlines)
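One caveat with this approach: str.capitalize() upper-cases the first character and lower-cases everything after it, so any capitals later in the line are lost. A minimal sketch (Python 3; the sample line is made up for illustration) that touches only the first column:

```python
# Hypothetical line with a capital letter in a later column.
line = "peter, 35, Spain\n"

# capitalize() on the whole line also lower-cases the rest of it.
whole_line = line.capitalize()

# Split off the first column, capitalize only that, and reattach the rest.
name, sep, rest = line.partition(",")
first_only = name.capitalize() + sep + rest

print(repr(whole_line))
print(repr(first_only))
```

For the sample file in the question the two results are the same, because the other columns are already lower case.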
You can just use a CSV reader and writer and access the element you want to capitalize from the list.
import csv
import os
inp = open('a.txt', 'r')
out = open('b.txt', 'w')
reader = csv.reader(inp)
writer = csv.writer(out)
for row in reader:
    row[0] = row[0].capitalize()
    writer.writerow(row)
inp.close()
out.close()
os.rename('b.txt', 'a.txt')  # if you want to keep the same name

Trouble in saving a list to csv

I am saving a list to a csv file using the writerow function from the csv module. Something went wrong when I opened the final file in MS Office Excel.
Before I ran into this issue, the main problem I was trying to solve was getting each item of the list saved to its own row. It was saving each line into a cell in row 1. I made some small changes, and now this happened. As a Python novice, I am very confused.
import csv
inputfile = open('small.csv', 'r')
header_list = []
header = inputfile.readline()
header_list.append(header)
input_lines = []
for line in inputfile:
    input_lines.append(line)
inputfile.close()
AA_list = []
for i in range(0, len(input_lines)):
    if (input_lines[i].split(',')[4]) == 'AA':  # column 4 has different names including 'AA'
        AA_list.append(input_lines[i])
full_list = header_list + AA_list
resultFile = open("AA2013.csv", 'w+')
wr = csv.writer(resultFile, delimiter=',')
wr.writerow(full_list)
Thanks!
UPDATE:
The full_list look like this: ['1,2,3,"MEM",...]
UPDATE2(APR.22nd):
Now I get three cells of data (the header in A1 and the rest in A2 and A3 respectively) in the same row. Apparently, the newline characters are not working for three items in one big list. I think the more specific question now is: how do I save a list of records, each followed by '\n', to a csv file?
UPDATE3(APR.23rd):
original file
Importing the csv module is not enough, you need to use it as well. Right now, you're appending each line as an entire string to your list instead of a list of fields.
Start with
with open('small.csv', 'rb') as inputfile:
    reader = csv.reader(inputfile, delimiter=",")
    header_list = next(reader)
    input_lines = list(reader)
Now header_list contains all the headers, and input_lines contains a nested list of all the rows, each one split into columns.
I think the rest should be pretty straightforward.
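One possible continuation, sketched for Python 3 with in-memory buffers in place of small.csv and AA2013.csv. The sample rows and column names are invented; only the test on column index 4 comes from the question.

```python
import csv
import io

# Hypothetical stand-in for small.csv; the real columns are not shown.
sample = (
    "year,month,day,origin,carrier\n"
    "2013,1,2,MEM,AA\n"
    "2013,1,3,JFK,DL\n"
    "2013,1,4,ORD,AA\n"
)

reader = csv.reader(io.StringIO(sample))
header = next(reader)                       # first row: list of column names
aa_rows = [row for row in reader if row[4] == "AA"]

out = io.StringIO()                         # stand-in for the output file
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)                     # one list of fields -> one row
writer.writerows(aa_rows)                   # list of lists -> one row each
result = out.getvalue()

print(result)
```

Because every row is already a list of fields, writerow and writerows place each field in its own cell and each record on its own line.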
append() adds its argument as a single element at the end of a list. So when you write header_list.append(header), the whole header line is added to header_list as one string, not as separate fields. You should write
headers = header.split(',')
header_list.append(headers)
This splits the header row on commas, so headers is the list of header fields, which is then appended to header_list as one row.
The same thing goes for AA_list.append(input_lines[i]).
I figured it out.
The difference between [val], val, and val.split(",") as the argument to writerow was:
[val]: the whole string is written as a single field, so everything in it lands in one Excel cell (the header and "2013, 1, 2,..." in A1, B1, C1 and so on).
val: each character (letter, comma or space) takes its own cell in Excel.
val.split(","): the string is split on commas, and each resulting piece goes into its own Excel cell.
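The three cases can be checked with a small sketch (Python 3, writing into an in-memory buffer; the value of val and the helper name write_row are made up for illustration):

```python
import csv
import io


def write_row(fields):
    """Write one row with csv.writer and return the text it produced."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue()


val = "2013,1,2"

as_single_item = write_row([val])       # one field, quoted because of the commas
as_string = write_row(val)              # a string is iterated character by character
as_fields = write_row(val.split(","))   # three separate fields

print(as_single_item)
print(as_string)
print(as_fields)
```

writerow treats whatever it receives as a sequence of fields, which is why a bare string ends up with one character per cell.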
Here is what I found out: 1. the right way to write the flat list out line by line is to use the with syntax; 2. split each string when writing the row.
csvwriter.writerow(JD.split())
full_list = header_list + AA_list
with open("AA2013.csv", 'w+') as resultFile:
    wr = csv.writer(resultFile, delimiter=",", lineterminator='\n')
    for val in full_list:
        wr.writerow(val.split(','))
The wanted output
Please correct any terms or syntax I have misused! Thanks.

Remove a specific row in a csv file with python

I am trying to remove a row from a csv file if the 2nd column matches a string. My csv file has the following information:
Name
15 Dog
I want the row with "Name" in it removed. The code I am using is:
import csv
reader = csv.reader(open("info.csv", "rb"), delimiter=',')
f = csv.writer(open("final.csv", "wb"))
for line in reader:
    if "Name" not in line:
        f.writerow(line)
        print line
But the "Name" row isn't removed. What am I doing wrong?
EDIT: I was using the wrong delimiter. Changing it to \t worked. Below is the code that works now.
import csv
reader = csv.reader(open("info.csv", "rb"), delimiter='\t')
f = csv.writer(open("final.csv", "wb"))
for line in reader:
    if "Name" not in line:
        f.writerow(line)
        print line
It seems that you are specifying the wrong delimiter (comma) in csv.reader.
Each line yielded by reader is a list of fields, split on your delimiter. By the way, you specified the delimiter as a comma; are you sure that is the one you want? Your sample is delimited by tabs.
Anyway, you want to check whether 'Name' is in any element of a given line. So this will still work, regardless of whether your delimiter is correct:
for line in reader:
    if not any('Name' in x for x in line):
        f.writerow(line)  # write only the rows that do not mention 'Name'
Notice the difference. This version checks for 'Name' in each list element, yours checks if 'Name' is in the list. They are semantically different because 'Name' in ['blah blah Name'] is False.
I would recommend first fixing the delimiter error. If you still have issues, use if any(...) as it is possible that the exact token 'Name' is not in your list, but something that contains 'Name' is.
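The difference between the two checks, and the effect of the delimiter, can be seen in a short sketch (Python 3; the tab-delimited sample and its second column are assumptions, since the question's data is only partly shown):

```python
import csv
import io

# Hypothetical tab-delimited contents standing in for info.csv.
sample = "Name\tKind\n15\tDog\n"

# Wrong delimiter: each line comes back as a single field containing a tab.
rows_comma = list(csv.reader(io.StringIO(sample), delimiter=","))
# Right delimiter: each line is split into its two fields.
rows_tab = list(csv.reader(io.StringIO(sample), delimiter="\t"))

exact_with_comma = "Name" in rows_comma[0]                   # list membership
substring_with_comma = any("Name" in x for x in rows_comma[0])
exact_with_tab = "Name" in rows_tab[0]

# Keep only the rows that do not contain the exact field 'Name'.
kept = [row for row in rows_tab if "Name" not in row]

print(rows_comma, rows_tab, kept)
```

With the wrong delimiter the exact-match test never fires, because no field is exactly 'Name'; the any(...) form still finds it as a substring.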

Sorting CSV file with delimiter in Python

How do I read a .csv file with the following content
$C=2$A=3$B=1$
Then create a new .csv file with the same content but the $ changed into , and sorted alphabetically like the following:
A=3,B=1,C=2
Thank you!
Edit:
Here's my code. It ended up giving an extra comma at the beginning of the output.
input = csv.reader(open('inputfile.csv', 'r'), delimiter='$')
output = open('outputfile.csv', 'w')
try:
    writer = csv.writer(output)
    for column in input:
        writer.writerow(sorted(column))
        print (sorted(column))
finally:
    output.close()
Right now my input is:
$C=2$A=3$B=1$
and my output is:
,A=3,B=1,C=2
I want it to be:
A=3,B=1,C=2
Thanks!
with open('test.csv') as in_file, open('new.csv', 'w') as out_file:
    for line in csv.reader(in_file, delimiter='$'):
        out_file.write(','.join(sorted(line)[2:]) + '\n')
Basically what this does is:
open the input as in_file
open the output as out_file
initialize a CSV reader with $ as the delimiter, using in_file as the input file
iterate through each row, doing the following:
    sort all of the elements (after parsing)
    discard the first 2 (since they'll always be empty strings due to the start/end delimiters on each line)
    recombine those elements using , as the delimiter
    write that out to the file with a trailing newline \n
Edit: fixed for the start/end $ symbols by removing the empty elements that get parsed out of the CSV (the [2:] bit)
You can use a csv.reader to read the file with the delimiter set to '$'. Then for each row returned, strip out the empty elements and sort the rest:
row = sorted([item for item in row if item])
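Put together, this approach might look like the following sketch (Python 3, with in-memory buffers standing in for the input and output files, which is an assumption made to keep the example self-contained):

```python
import csv
import io

# Stand-in for the input file from the question.
source = io.StringIO("$C=2$A=3$B=1$\n")
out = io.StringIO()  # stand-in for the output file

writer = csv.writer(out, lineterminator="\n")
for row in csv.reader(source, delimiter="$"):
    # The leading and trailing '$' produce empty strings; drop them, then sort.
    writer.writerow(sorted(item for item in row if item))

result = out.getvalue()
print(result)
```

Filtering out the empty items does not depend on how many leading or trailing delimiters a line has, unlike slicing off a fixed number of elements.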
