writing to csv but not getting desired formatting - python

I want to remove quoting altogether, but when I write to my CSV file I get an extra \ between the name and the IP.
with open('test.csv', 'w') as csvfile:
    info = csv.writer(csvfile, quoting=csv.QUOTE_NONE, delimiter=',', escapechar='\\')
    for json_object in json_data:
        if len(json_object['names']) != 0:
            name = json_object['names'][0]
            ip = json_object['ip_address']
            combined = (name + ',' + ip)
            print combined
            info.writerow([combined])
This is what I'm seeing in my CSV file:
ncs1.aol.net\,10.136.0.2
The formatting I'm trying to achieve is:
ncs1.aol.net,10.136.0.2

You don't need to build the row yourself; that's the point of using csv. Instead of creating combined, just use:
info.writerow([name, ip])
Currently you're writing a single item to the row, so the module treats the comma as data and escapes it for you, which is where the \, comes from.
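The difference between the two calls can be reproduced in memory. The following is a minimal sketch, assuming Python 3 and using io.StringIO in place of test.csv; the helper name write_rows is made up for illustration, and the writer settings are the ones from the question.

```python
import csv
import io


def write_rows(rows):
    """Write rows with the question's writer settings and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONE, delimiter=',',
                        escapechar='\\', lineterminator='\n')
    writer.writerows(rows)
    return buffer.getvalue()


name, ip = 'ncs1.aol.net', '10.136.0.2'

# One pre-joined field: the comma is data, so the writer escapes it.
joined = write_rows([[name + ',' + ip]])

# Two separate fields: the comma is a delimiter inserted by the writer.
separate = write_rows([[name, ip]])

print(repr(joined))    # 'ncs1.aol.net\\,10.136.0.2\n'
print(repr(separate))  # 'ncs1.aol.net,10.136.0.2\n'
```

With separate fields, the escapechar never comes into play because no field contains the delimiter.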

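You might be tempted to strip the backslash from your combined expression instead:
combined = combined.strip('\\')
This has no effect, because the backslash is not part of the string; the writer adds it when it escapes the comma inside writerow. (Note also that '\' on its own is not a valid string literal; a literal backslash is written '\\'.)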

Related

Make CSV escape Double Quotation Marks

I need to prepare a .csv file so that double quotation marks get ignored by the program processing it (ArcMap). Arc was blending the contents of all following cells on that line into any previous one containing double quotation marks. For example:
...and no further rows would get processed at all.
How does one make a CSV escape Double Quotation Marks for successful processing in ArcMap (10.2)?
Let's say df is the DataFrame created for the csv files as follows
df = pd.read_csv('filename.csv')
Let us assume that comments is the name of the column where the issue occurs, i.e. you want to replace every double quote (") with an empty string ('').
The following one-liner does that. It replaces every double quote in every row of df['comments'] with an empty string.
df['comments'] = df['comments'].apply(lambda x: x.replace('"', ''))
The lambda receives the value of each row in df['comments'] as x.
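EDIT: If you want to escape the double quotes rather than remove them, note that a raw-string prefix will not do it. The r prefix only changes how a string literal in source code is parsed, so the following line returns every value unchanged:
df['comments'] = df['comments'].apply(lambda x: r'{0}'.format(x))
The usual CSV convention is to double each quote inside a quoted field, which pandas' to_csv does by default (doublequote=True) when the frame is written back out.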
You could try reading the file with the csv module and writing it back in the hopes that the output format will be more digestible for your other tool. See the docs for formatting options.
import csv
with open('in.csv', 'r') as fin, open('out.csv', 'w') as fout:
    reader = csv.reader(fin, delimiter='\t')
    writer = csv.writer(fout, delimiter='\t')
    # alternative:
    # writer = csv.writer(fout, delimiter='\t', escapechar='\\', doublequote=False)
    for line in reader:
        writer.writerow(line)
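The two writer configurations in the snippet above differ in how an embedded double quote is written. Here is a small sketch of that difference, assuming Python 3, a comma delimiter rather than the tab used above, and a made-up sample row; the helper name render is hypothetical. Which form ArcMap accepts is not something this sketch can confirm.

```python
import csv
import io


def render(row, **fmt):
    """Write a single row with the given format parameters and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator='\n', **fmt).writerow(row)
    return buffer.getvalue()


row = ['1', 'He said "hi"', 'x']

# Default settings: the field is wrapped in quotes and each inner quote is doubled.
doubled = render(row)

# Alternative: inner quotes are prefixed with the escape character instead.
escaped = render(row, escapechar='\\', doublequote=False)

print(doubled, end='')  # 1,"He said ""hi""",x
print(escaped, end='')
```

Whichever form is written, a reader configured with the same parameters should recover the original row.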
What worked for me was writing a module to do some "pre-processing" of the CSV file as follows. The key line is where the "writer" has the parameter "quoting=csv.QUOTE_ALL". Hopefully this is useful to others.
def work(Source_CSV):
    from __main__ import *
    import csv, arcpy, os
    # Derive name and location for newly-formatted .csv file
    Head = os.path.split(Source_CSV)[0]
    Tail = os.path.split(Source_CSV)[1]
    name = Tail[:-4]
    new_folder = "formatted"
    new_path = os.path.join(Head,new_folder)
    Formatted_CSV = os.path.join(new_path,name+"_formatted.csv")
    #arcpy.AddMessage("Formatted_CSV = "+Formatted_CSV)
    # Populate the new .csv file with quotation marks around all field contents ("quoting=csv.QUOTE_ALL")
    with open(Source_CSV, 'rb') as file1, open(Formatted_CSV,'wb') as file2:
        # Instantiate the .csv reader
        reader = csv.reader(file1, skipinitialspace=True)
        # Write column headers without quotes
        headers = reader.next() # 'next' function actually begins at the first row of the .csv.
        str1 = ''.join(headers)
        writer = csv.writer(file2)
        writer.writerow(headers)
        # Write all other rows wrapped in double quotes
        writer = csv.writer(file2, delimiter=',', quoting=csv.QUOTE_ALL)
        # Write all other rows, at first quoting none...
        #writer = csv.writer(file2, quoting=csv.QUOTE_NONE,quotechar='\x01')
        for row in reader:
            # ...then manually doubling double quotes and wrapping 3rd column in double quotes.
            #row[2] = '"' + row[2].replace('"','""') + '"'
            writer.writerow(row)
    return Formatted_CSV
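The module above is written for Python 2 (reader.next(), binary file modes) and depends on arcpy. The core idea, an unquoted header row followed by data rows written with csv.QUOTE_ALL, can be sketched in Python 3 without arcpy or the file system. The function name requote and the sample text are assumptions made for illustration only.

```python
import csv
import io


def requote(source_text):
    """Return CSV text with an unquoted header row and fully quoted data rows."""
    reader = csv.reader(io.StringIO(source_text), skipinitialspace=True)
    out = io.StringIO()

    # Header row: default quoting, so plain column names stay unquoted.
    headers = next(reader)
    csv.writer(out, lineterminator='\n').writerow(headers)

    # Data rows: every field wrapped in double quotes, inner quotes doubled.
    quoted_writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator='\n')
    quoted_writer.writerows(reader)
    return out.getvalue()


sample = 'id,comment\n1,"say ""hi"""\n2,plain\n'
result = requote(sample)
print(result, end='')
# id,comment
# "1","say ""hi"""
# "2","plain"
```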

Why do some rows in csv file have an invalid format?

I am currently fetching data from an API and I would like to store those data as csv.
However, some lines are always invalid which means I cannot split them via Excel's text-in-columns functionality.
I create the csv file as follows:
with open(directory_path + '/' + file_name + '-data.csv', 'a', newline='') as file:
    # Setup a writer
    csvwriter = csv.writer(file, delimiter='|')
    # Write headline row
    if not headline_exists:
        csvwriter.writerow(['Title', 'Text', 'Tip'])
    # Build the data row
    record = data['title'] + '|' + data['text'] + '|' + data['tip']
    csvwriter.writerow([record])
If you open the csv file in Excel you also immediately see that the row is invalid. While the valid one takes the default height and the whole width, the invalid one takes more height but less width.
Does anyone know a reason for that problem?
The rows are not invalid; the way you write them is.
First of all, you use pipes as delimiters. That's fine in some scenarios, but given that you want to load the file into Excel immediately, it seems wiser to export the data in the "excel" dialect:
csvwriter = csv.writer(file, dialect='excel')
Second, look at the following lines:
record = data['title'] + '|' + data['text'] + '|' + data['tip']
csvwriter.writerow([record])
This way you basically tell the csv writer that you want a single column with pipes in it. If you use a csv writer, you must not concatenate the delimiters yourself; doing so defeats the point of using a writer. This is how it should be done instead:
record = [data['title'], data['text'], data['tip']]
csvwriter.writerow(record)
Hope it helps.
I finally found out that I had to strip the text and the tip, because they sometimes contain leading or trailing whitespace that would break the format.
Additionally, I also followed the recommendation to use the excel dialect since I think this will make it easier to process the data later on.
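Putting both fixes together (separate fields plus stripping) might look like the sketch below. It assumes Python 3, writes to io.StringIO instead of a file, and uses an invented data dict whose values only imitate what the API might return. The line terminator is overridden to '\n' purely to keep the printed output simple; the excel dialect's default is '\r\n'.

```python
import csv
import io

# Hypothetical record; the stray spaces and trailing newline mimic messy API data.
data = {'title': 'Tip 1', 'text': '  Some text\n', 'tip': ' Use | wisely '}

buffer = io.StringIO()
writer = csv.writer(buffer, dialect='excel', lineterminator='\n')
writer.writerow(['Title', 'Text', 'Tip'])

# Pass the fields as a list and let the writer insert the delimiters.
writer.writerow([data[key].strip() for key in ('title', 'text', 'tip')])

output = buffer.getvalue()
print(output, end='')
# Title,Text,Tip
# Tip 1,Some text,Use | wisely
```

Note that strip() only removes whitespace at the ends of a field; a line break in the middle of a field would still be written as a quoted multi-line field.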

remove non ascii characters from csv file using Python

I am trying to remove non-ascii characters from a file. I am actually trying to convert a text file which contains these characters (eg. hello§‚å½¢æˆ äº†å¯¹æ¯”ã€‚ 花å) into a csv file.
However, I am unable to iterate through these characters and hence I want to remove them (i.e chop off or put a space). Here's the code (researched and gathered from various sources)
The problem with the code is, after running the script, the csv/txt file has not been updated. Which means the characters are still there. Have absolutely no idea how to go about doing this anymore. Researched for a day :(
Would kindly appreciate your help!
import csv
txt_file = r"xxx.txt"
csv_file = r"xxx.csv"
in_txt = csv.reader(open(txt_file, "rb"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'wb'))
for row in in_txt:
    for i in row:
        i = "".join([a if ord(a)<128 else''for a in i])
out_csv.writerows(in_txt)
Variable assignment is not magically transferred to the original source; you have to build up a new list of your changed rows:
import csv
txt_file = r"xxx.txt"
csv_file = r"xxx.csv"
in_txt = csv.reader(open(txt_file, "rb"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'wb'))
out_txt = []
for row in in_txt:
    # Build a cleaned copy of each row, keeping only ASCII characters
    out_txt.append([
        "".join(a if ord(a) < 128 else '' for a in i)
        for i in row
    ])
out_csv.writerows(out_txt)
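The code above targets Python 2, where files opened in 'rb' mode yield byte strings. A rough Python 3 equivalent is sketched below; it works on text in memory rather than on xxx.txt, and the helper names strip_non_ascii and convert are made up for this example. It relies on str.encode('ascii', errors='ignore') to drop anything outside ASCII.

```python
import csv
import io


def strip_non_ascii(text):
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode('ascii', errors='ignore').decode('ascii')


def convert(tsv_text):
    """Read tab-separated text and return comma-separated text with ASCII-only fields."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter='\t')
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    for row in reader:
        # Write a new, cleaned row instead of reassigning the loop variable.
        writer.writerow([strip_non_ascii(field) for field in row])
    return out.getvalue()


result = convert('hello§å\tworld\nplain\t花\n')
print(result, end='')
# hello,world
# plain,
```

When reading from a real file, the input would need to be opened with the correct encoding (for example encoding='utf-8') so that the characters are decoded before they are filtered.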

Python: writing an entire row to a CSV file. Why does it work this way?

I had exported a csv from Nokia Suite.
"sms","SENT","","+12345678901","","2015.01.07 23:06","","Text"
Reading from the PythonDoc, I tried
import csv
with open(sourcefile,'r', encoding = 'utf8') as f:
    reader = csv.reader(f, delimiter = ',')
    for line in reader:
        # write entire csv row
        with open(filename,'a', encoding = 'utf8', newline='') as t:
            a = csv.writer(t, delimiter = ',')
            a.writerows(line)
It didn't work until I put brackets around line, i.e. [line].
So the last part became:
a.writerows([line])
Why is that so?
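The writerows method expects an iterable of rows, where each row is itself a sequence of fields. line is a single row (a list of strings), so writerows(line) treats each string as a row and each character of that string as a field. [line] wraps it in a list containing exactly one row.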
What you probably want to use instead is writerow.
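The three variants can be compared side by side. This is a minimal sketch, assuming Python 3 and writing to io.StringIO instead of a file; the shortened sample row and the helper name new_writer are invented for the example.

```python
import csv
import io

line = ['sms', 'SENT', 'Text']


def new_writer():
    """Return a (buffer, writer) pair that writes into memory."""
    buffer = io.StringIO()
    return buffer, csv.writer(buffer, lineterminator='\n')


# writerows(line): each string becomes a row, each character a field.
buf_a, writer_a = new_writer()
writer_a.writerows(line)
split_up = buf_a.getvalue()

# writerows([line]): a list holding one row.
buf_b, writer_b = new_writer()
writer_b.writerows([line])
wrapped = buf_b.getvalue()

# writerow(line): the direct way to write a single row.
buf_c, writer_c = new_writer()
writer_c.writerow(line)
single = buf_c.getvalue()

print(split_up, end='')
# s,m,s
# S,E,N,T
# T,e,x,t
print(wrapped == single)  # True
```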

Can't get a desired output with csv.writer in python

I spent all day trying to find a solution for this problem, and as I haven't found one yet I'm coming here for your help!
I'm writing a CSV file exporter in Python and I can't get it to output exactly what I want.
Let's say I have a list of strings containing several tutorial titles:
tutorials = ['tut1', 'tut2', 'tut3']
and I want to print in the csv file's first line like this:
First Name, Surname, StudentID, tut1, tut2, tut3
Now, this is the code I wrote to do so:
tutStr= ' '
for i in tutorials:
    tutStr += i + ', '
tutStr = tutStr[0:-3]
with open('CSVexport.csv', 'wb') as csvFile:
    writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_NONE)
    writer.writerow(['First Name'+'Surname'+'ID number'+tutStr])
This gives me the error:
"writer.writerow(['First Name'+'Surname'+'ID number'+labString])
Error: need to escape, but no escapechar set"
Now, I know that the error has to do with csv.QUOTE_NONE and that I have to set an escapechar, but I really don't understand how I am supposed to do so.
If I try without QUOTE_NONE, it automatically sets the quotechar to ' " ', which gives me this output:
"First Name, Surname, StudentID, tut1, tut2, tut3" --> don't want quotation marks!!!
Any idea?
If you are going to write something with writerow, you need to pass a list with one item per cell, not a list containing one big comma-separated string (that string ends up in a single cell, and its commas then have to be quoted or escaped). With quoting=csv.QUOTE_MINIMAL the writer only quotes fields that actually need it.
import csv

tutorials = ['tut1', 'tut2', 'tut3']
headers = ['First Name', 'Surname', 'ID number'] + tutorials
first_line = True
with open('CSVexport.csv', 'w') as csvFile:
    writer = csv.writer(csvFile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    if first_line:
        # Each header is its own field; the writer adds the commas
        writer.writerow(headers)
        first_line = False
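The error from the question goes away once every header is its own field, even with csv.QUOTE_NONE and no escapechar, because no field contains the delimiter any more. Here is a short sketch of both cases, assuming Python 3 and io.StringIO in place of CSVexport.csv.

```python
import csv
import io

tutorials = ['tut1', 'tut2', 'tut3']
headers = ['First Name', 'Surname', 'ID number'] + tutorials

# Separate fields: nothing needs quoting or escaping, so QUOTE_NONE is fine.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONE, lineterminator='\n')
writer.writerow(headers)
header_line = buffer.getvalue()
print(header_line, end='')  # First Name,Surname,ID number,tut1,tut2,tut3

# One big string: its commas are data, and QUOTE_NONE has no way to protect them.
try:
    csv.writer(io.StringIO(), quoting=csv.QUOTE_NONE).writerow([', '.join(headers)])
    error_message = None
except csv.Error as exc:
    error_message = str(exc)
print(error_message)
```

The writer does not add a space after each comma; if the spaces from the desired output are required, they would have to be part of the field values themselves.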
