How do I read a .csv file with the following content:
$C=2$A=3$B=1$
Then create a new .csv file with the same content but the $ changed into , and sorted alphabetically like the following:
A=3,B=1,C=2
Thank you!
Edit:
Here's my code. It ends up producing an extra comma at the beginning of the output.
import csv

input = csv.reader(open('inputfile.csv','r'), delimiter='$')
output = open('outputfile.csv','w')
try:
    writer = csv.writer(output)
    for column in input:
        writer.writerow(sorted(column))
        print (sorted(column))
finally:
    output.close()
Right now my input is:
$C=2$A=3$B=1$
and my output is:
,A=3,B=1,C=2
I want it to be:
A=3,B=1,C=2
Thanks!
import csv

with open('test.csv') as in_file, open('new.csv', 'w') as out_file:
    for line in csv.reader(in_file, delimiter='$'):
        out_file.write(','.join(sorted(line)[2:])+'\n')
Basically what this does is:
open the input as in_file
open the output as out_file
initialize a CSV reader with $ as the delimiter, using in_file as the input file
iterate through each row, doing the following:
    sort all of the elements (after parsing)
    discard the first 2 (since they'll always be empty strings due to the start/end delimiters on each line, and empty strings sort first)
    recombine those elements using , as the delimiter
    write that out to the file with a trailing newline \n
Edit: fixed for the start/end $ symbols by removing the empty elements that get parsed out of the CSV (the [2:] bit)
You can use a csv.reader to read the file with the delimiter set to '$'. Then for each row returned, strip out the empty elements and sort the rest:
row = sorted([item for item in row if item])
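As a minimal sketch of that filtering approach (the helper name convert and the in-memory files are assumptions made for illustration, not part of the original answer), the whole read, filter, sort, write cycle could look like this:

```python
import csv
import io

def convert(in_file, out_file):
    """Read '$'-delimited rows, drop empty fields, sort, and write comma-separated rows."""
    # convert is a hypothetical helper; in_file and out_file are text-mode file objects.
    writer = csv.writer(out_file, lineterminator='\n')
    for row in csv.reader(in_file, delimiter='$'):
        # The leading and trailing '$' produce empty strings; 'if item' removes them.
        writer.writerow(sorted(item for item in row if item))

# io.StringIO stands in for real files so the example is self-contained.
source = io.StringIO('$C=2$A=3$B=1$\n')
target = io.StringIO()
convert(source, target)
print(target.getvalue())  # A=3,B=1,C=2
```

With real files, the same function can be called on objects opened with open('inputfile.csv', newline='') and open('outputfile.csv', 'w', newline='').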
I'm trying to write to a CSV file with output that looks like this:
14897,40.50891,-81.03926,168.19999
but the CSV writer keeps writing the output with quotes at beginning and end
'14897,40.50891,-81.03926,168.19999'
When I print the line normally, the output is correct but I need to do line.split() or else the csv writer puts output as 1,4,8,9,7 etc...
But when I do line.split() the output is then
['14897,40.50891,-81.03926,168.19999']
Which is written as '14897,40.50891,-81.03926,168.19999'
How do I make the quotes go away? I already tried csv.QUOTE_NONE, but it doesn't work.
with open(results_csv, 'wb') as out_file:
    writer = csv.writer(out_file, delimiter=',')
    writer.writerow(["time", "lat", "lon", "alt"])
    for f in file_directory:
        for line in open(f):
            print line
            line = line.split()
            writer.writerow(line)
With line.split(), you're splitting on whitespace (spaces, linefeeds, tabs), not on commas. Since the line contains no whitespace apart from the trailing newline, you end up with only one item per row.
Since this item contains commas, the csv module has to quote it to distinguish those commas from the actual separator (which is also a comma). You would need line.strip().split(",") for it to work, but...
using csv to read your data would be a better way to fix this:
replace this:
for line in open(some_file):
    print line
    line = line.split()
    writer.writerow(line)
by:
with open(some_file) as f:
    cr = csv.reader(f) # default separator is comma already
    writer.writerows(cr)
You don't need to read the file manually. You can simply use csv reader.
Replace the inner for loop with:
# with ensures that the file handle is closed, after the execution of the code inside the block
with open(some_file) as file:
    row = csv.reader(file) # read rows
    writer.writerows(row) # write multiple rows at once
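To make the quoting behaviour concrete, here is a small sketch in Python 3 syntax (the helper write_row and the in-memory buffers are assumptions for illustration only). It writes the same line three ways: split on whitespace, split on commas, and passed through csv.reader.

```python
import csv
import io

line = '14897,40.50891,-81.03926,168.19999\n'

def write_row(fields):
    """Write one row with csv.writer and return the resulting text (hypothetical helper)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator='\n').writerow(fields)
    return buf.getvalue()

# split() yields a single field that contains commas, so the writer must quote it.
quoted = write_row(line.split())

# Splitting on commas yields four separate fields, so no quoting is needed.
plain = write_row(line.strip().split(','))

# Letting csv.reader parse the input gives the same unquoted result.
copied = io.StringIO()
csv.writer(copied, lineterminator='\n').writerows(csv.reader(io.StringIO(line)))

print(quoted, plain, copied.getvalue(), sep='')
```

The first printed line is wrapped in double quotes; the other two are not.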
I'm looking for a way using python to copy the first column from a csv into an empty file. I'm trying to learn python so any help would be great!
So if this is test.csv
A 32
D 21
C 2
B 20
I want this output
A
D
C
B
I've tried the following commands in python but the output file is empty
import csv
f= open("test.csv",'r')
reader = csv.reader(f,delimiter="\t")
names=""
for each_line in reader:
    names=each_line[0]
First, you want to open your files. A good practice is to use the with statement (which, technically speaking, introduces a context manager) so that when your code exits the with block, all the files are closed automatically:
with open('test.csv') as inpfile, open('out.csv', 'w') as outfile:
Next you want a loop over the lines of the input file (note the indentation, we are inside the with block). Line splitting is automatic when you read a text file with lines separated by newlines:
    for line in inpfile:
Each line is a string, but you think of it as two fields separated by white space. This situation is so common that strings have a method to deal with it (note again the increasing indent, we are in the for loop block):
        fields = line.split()
By default .split() splits on white space, but you can use, e.g., split(',') to split on commas. fields is a list of strings; for your first record it is equal to ['A', '32'], and you want to output just the first field in this list. For this purpose a file object has the .write() method, which writes a string, just a string, to the file. fields[0] IS a string, but we have to add a newline character to it because, in this respect, .write() is different from print():
        outfile.write(fields[0]+'\n')
That's all, but if you omit my comments it's 4 lines of code:
with open('test.csv') as inpfile, open('out.csv', 'w') as outfile:
    for line in inpfile:
        fields = line.split()
        outfile.write(fields[0]+'\n')
When you are done with learning (some) Python, ask for an explanation of this...
with open('test.csv') as ifl, open('out.csv', 'w') as ofl:
    ofl.write('\n'.join(line.split()[0] for line in ifl))
Addendum
The csv module in such a simple case adds the additional conveniences of
auto-splitting each line into a list of strings
taking care of the details of output (newlines, etc)
and when learning Python it's more fruitful to see how these steps can be done using the bare language, or at least that is my opinion…
The situation is different when your data file is complex, has headers, or has quoted strings possibly containing delimiters; in those cases the use of csv is recommended, as it takes into account all the gory details. For complex data analysis requirements you will need other packages, not included in the standard library, e.g., numpy and pandas, but that is another story.
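For comparison, a csv-module version of the same first-column extraction might look like the sketch below. The function name first_column, the single-space delimiter, and the in-memory sample are assumptions; the question shows space-separated data but reads it with a tab delimiter, so adjust delimiter to match the real file.

```python
import csv
import io

def first_column(in_file, out_file, delimiter=' '):
    """Copy the first field of every non-empty row to out_file (hypothetical helper)."""
    # skipinitialspace=True ignores extra spaces that directly follow a delimiter.
    reader = csv.reader(in_file, delimiter=delimiter, skipinitialspace=True)
    writer = csv.writer(out_file, lineterminator='\n')
    for row in reader:
        if row:  # blank lines come back as empty lists
            writer.writerow([row[0]])

# io.StringIO stands in for test.csv and the output file.
sample = io.StringIO('A 32\nD 21\nC 2\nB 20\n')
result = io.StringIO()
first_column(sample, result)
print(result.getvalue())
```

The reader does the splitting and the writer takes care of the line endings, which are the two conveniences listed above.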
This answer reads the CSV file, treating a space character as the column delimiter. You have to pass header=None, otherwise the first row will be taken as the column names.
ss is a slice: the 0th column, taking all rows as denoted by :
The last line writes the slice to a new file. (.iloc replaces the older .ix indexer, which has been removed from recent pandas versions; header=False keeps the column label out of the output.)
import pandas as pd
df = pd.read_csv('test.csv', sep=' ', header=None)
ss = df.iloc[:, 0]
ss.to_csv('new_path.csv', sep=' ', index=False, header=False)
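import csv
# Python 3: open CSV files in text mode with newline=''
with open("test.csv", newline="") as src, open("output.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for e in csv.reader(src, delimiter='\t'):
        writer.writerow([e[0]])  # wrap in a list so the value is one field, not one field per character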
The best you can do is create an empty list, append the first column of each row to it, and then write that new list into another csv, for example:
import csv

def writetocsv(l):
    # make a list copy of the values
    b = list(l)
    print (b)
    with open("newfile.csv",'w',newline='',) as f:
        w = csv.writer(f, delimiter=',')
        for value in b:
            w.writerow([value])

adcb_list = []
with open("test.csv",'r') as f:
    reader = csv.reader(f,delimiter="\t")
    for each_line in reader:
        adcb_list.append(each_line[0])  # keep only the first column
writetocsv(adcb_list)
Hope this works for you :-)
I have a .csv formatted .txt file. I am deliberating over the best manner in which to .capitalize the text in the first column.
.capitalize() is a string method, so I considered the following: I would need to open the file, convert the data to a list of strings, capitalize the required word, and finally write the data back to file.
To achieve this, I did the following:
newGuestList = []
with open("guestList.txt","r+") as guestFile :
    guestList = csv.reader(guestFile)
    for guest in guestList :
        for guestInfo in guest :
            capitalisedName = guestInfo.capitalize()
            newGuestList.append(capitalisedName)
Which gives the output:
['Peter', '35', ' spain', 'Caroline', '37', 'france', 'Claire', '32', ' sweden']
The problem:
Firstly; in order to write this new list back to file, I will need to convert it to a string. I can achieve this using the .join method. However, how can I introduce a newline, \n, after every third word (the country) so that each guest has their own line in the text file?
Secondly; this method, of nested for loops etc. seems highly convoluted, is there a cleaner way?
My .txt file:
peter, 35, spain\n
caroline, 37, france\n
claire, 32, sweden\n
You don't need to split the lines, since the first character of the first word is the first character of the line. (Note that capitalize() also lowercases the rest of the line.)
with open("lst.txt","r") as guestFile :
    lines=guestFile.readlines()
newlines=[line.capitalize() for line in lines]
with open("lst.txt","w") as guestFile :
    guestFile.writelines(newlines)
You can just use a CSV reader and writer and access the element you want to capitalize from the list.
import csv
import os
inp = open('a.txt', 'r')
out = open('b.txt', 'w')
reader = csv.reader(inp)
writer = csv.writer(out)
for row in reader:
    row[0] = row[0].capitalize()
    writer.writerow(row)
inp.close()
out.close()
# if you want to keep the same name; os.replace overwrites a.txt
# (os.rename fails on Windows when the target already exists)
os.replace('b.txt', 'a.txt')
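The same idea can be sketched with context-free file objects, which also answers the "one guest per line" part of the question because csv.writer ends every row for you. The function name capitalize_first_column and the in-memory sample are assumptions for illustration. Note that skipinitialspace=True drops the space after each comma, so the output is written as "Peter,35,spain" rather than "Peter, 35, spain".

```python
import csv
import io

def capitalize_first_column(in_file, out_file):
    """Capitalize the first field of each row and write the rows back out (hypothetical helper)."""
    reader = csv.reader(in_file, skipinitialspace=True)
    writer = csv.writer(out_file, lineterminator='\n')
    for row in reader:
        if row:  # leave blank lines untouched
            row[0] = row[0].capitalize()
        writer.writerow(row)  # one guest per output line

# io.StringIO stands in for the guest list file.
guests = io.StringIO('peter, 35, spain\ncaroline, 37, france\nclaire, 32, sweden\n')
out = io.StringIO()
capitalize_first_column(guests, out)
print(out.getvalue())
```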
I am saving a list to a csv using the writerow function from the csv module. Something went wrong when I opened the final file in MS Office Excel.
Before I encountered this issue, the main problem I was trying to deal with was getting each item of the list saved to its own row. It was saving each line into a cell in row 1. I made some small changes, and now this happened. As a Python novice, I am very confused.
import csv
inputfile = open('small.csv', 'r')
header_list = []
header = inputfile.readline()
header_list.append(header)
input_lines = []
for line in inputfile:
    input_lines.append(line)
inputfile.close()
AA_list = []
for i in range(0,len(input_lines)):
    if (input_lines[i].split(',')[4]) == 'AA':#column4 has different names including 'AA'
        AA_list.append(input_lines[i])
full_list = header_list+AA_list
resultFile = open("AA2013.csv",'w+')
wr = csv.writer(resultFile, delimiter = ',')
wr.writerow(full_list)
Thanks!
UPDATE:
The full_list looks like this: ['1,2,3,"MEM",...]
UPDATE2(APR.22nd):
Now I got three cells of data (the header in A1 and the rest in A2 and A3 respectively) in the same row. Apparently, the newline characters are not working for three items in one big list. I think the more specific question now is: how do I save a list of records with '\n' after each record to csv?
UPDATE3(APR.23rd):
original file
Importing the csv module is not enough, you need to use it as well. Right now, you're appending each line as an entire string to your list instead of a list of fields.
Start with
with open('small.csv', newline='') as inputfile:  # Python 3; on Python 2 use open('small.csv', 'rb')
    reader = csv.reader(inputfile, delimiter=",")
    header_list = next(reader)
    input_lines = list(reader)
Now header_list contains all the headers, and input_lines contains a nested list of all the rows, each one split into columns.
I think the rest should be pretty straightforward.
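One possible way to finish it is sketched below. The function name filter_rows, its parameters, and the sample rows are all made up for illustration; only the idea of keeping rows whose column 4 equals 'AA' comes from the question.

```python
import csv
import io

def filter_rows(in_file, out_file, column=4, wanted='AA'):
    """Copy the header, then only the rows whose given column equals wanted (hypothetical helper)."""
    reader = csv.reader(in_file)
    writer = csv.writer(out_file, lineterminator='\n')
    header = next(reader, None)
    if header is not None:
        writer.writerow(header)
    # Each row is already a list of fields, so writerows puts one record on each line.
    writer.writerows(
        row for row in reader
        if len(row) > column and row[column] == wanted
    )

# Invented sample data standing in for small.csv.
sample = io.StringIO(
    'year,month,day,code,carrier\n'
    '2013,1,2,MEM,AA\n'
    '2013,1,3,MEM,DL\n'
    '2013,1,4,JFK,AA\n'
)
result = io.StringIO()
filter_rows(sample, result)
print(result.getvalue())
```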
append() adds a single object to the end of a list. So when you write header_list.append(header), the whole header line is added to header_list as one string instead of as separate fields. You should write
headers = header.split(',')
header_list.append(headers)
This splits the header row on commas, so headers is the list of header words, and it is appended to header_list as one row of fields.
The same thing goes for AA_list.append(input_lines[i]).
I figured it out.
The difference between [val], val, and val.split(",") as the argument to writerow was:
[val]: one string containing everything, which takes up only the first column in Excel (header and "2013, 1, 2,..." in A1, B1, C1 and so on).
val: each character (letter, comma, or space) takes its own cell in Excel.
val.split(","): splits the string on commas and puts each resulting piece into its own Excel cell.
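The three cases can be reproduced without Excel. This is a small sketch in which the helper as_csv and the sample value are assumptions for illustration; it shows the text that csv.writer produces for each kind of argument.

```python
import csv
import io

def as_csv(row):
    """Return the text csv.writer produces for one writerow call (hypothetical helper)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator='\n').writerow(row)
    return buf.getvalue()

val = '2013,1,2'

one_cell = as_csv([val])            # a list with one string: one quoted field
per_char = as_csv(val)              # a bare string: iterated character by character
per_field = as_csv(val.split(','))  # a list of fields: one cell per value

print(one_cell, per_char, per_field, sep='')
```

writerow accepts any iterable, which is why a bare string is silently treated as a sequence of one-character fields.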
Here is what I found out: 1.the right way to export the flat list to each line by using with syntax, 2.split the list when writing row
csvwriter.writerow(JD.split())
full_list = header_list+AA_list
with open("AA2013.csv",'w+') as resultFile:
    wr = csv.writer(resultFile, delimiter= ",", lineterminator = '\n')
    for val in full_list:
        wr.writerow(val.split(','))
The wanted output
Please correct any terms or syntax I have misused! Thanks.
I have my data in my csv file like this:
He has a dog,Allan
She has a cat,Sheena
I want to read it into a list of tuples in Python like this:
[('He has a dog','Allan'),('She has a cat','Sheena')]
My code is :
pos=[]
with open(r'C:\Python27\listx.csv', 'r') as csvfile:
    dreader = csv.reader(csvfile)
    for row in dreader:
        pos.append(tuple(row))
The output is :
[('He has a dog,Allan',), ('She has a cat,Sheena',)]
As you can see there are two problems:
1. The first entry has to be separated from the second by quotes. It has to be 'He has a dog','Allan' (there is a ' missing after dog and a ' missing before Allan).
2. There is an unwanted comma after the last element in each tuple.
How do I remove these? I would appreciate any help; I've been stuck on this for a long time!
If your task is to simply split each line by comma, you can forgo the csv reader and try this:
with open("data.csv", "r") as file:
    pos = [tuple(s.strip('"') for s in line.strip().split(",")) for line in file]
Note the strip() on each line, which gets rid of leading/trailing whitespace, and the strip('"') on each item, which removes any surrounding double quotes.
Also, a single item in a tuple is displayed with the extra comma such as ("Hello",)
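A short sketch of both points, assuming the file really contains plain comma-separated lines as shown in the question (the in-memory data below stands in for the file):

```python
import csv
import io

# io.StringIO stands in for the CSV file from the question.
data = io.StringIO('He has a dog,Allan\nShe has a cat,Sheena\n')

# csv.reader yields one list per line; tuple() converts each list to a tuple.
pos = [tuple(row) for row in csv.reader(data)]
print(pos)

# A one-element tuple is always displayed with a trailing comma.
single = ('Hello',)
print(repr(single))
```

If csv.reader returns the whole line as a single field instead, the lines in the actual file are probably wrapped in quotes, which is worth checking in a text editor.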
try this:
csv.reader(csvfile, delimiter=',')