Multiple lines in a single Excel cell - Python

What is the easiest method for writing multiple lines into a single cell within Excel using Python? I've been trying the csv module without success.
import csv

with open('xyz.csv', 'wb') as outfile:
    w = csv.writer(outfile)
    w.writerow(['stringa', 'string_multiline'])
Also, each of the multiline strings has a number of characters that are typically used in CSVs, i.e. commas.
Any help would be really appreciated.

To figure this out, I created a file in Excel with a single multiline cell.
Then I saved it as CSV and opened it up in a text editor:
"a^Mb"
It looks like Excel interprets Ctrl-M characters as newlines.
Let’s try that with Python:
#!/usr/bin/env python2.7
import csv

with open('xyz.csv', 'wb') as outfile:
    w = csv.writer(outfile)
    w.writerow(['stringa', 'multiline\015string'])
Yup, that worked!
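For anyone on Python 3, here is a minimal sketch of the same idea (an adaptation, not part of the original answer): the file is opened in text mode with newline='' as the csv docs recommend, and the csv module quotes the field automatically because it contains a line break.
#!/usr/bin/env python3
import csv

# newline='' lets the csv module control line endings itself
with open('xyz.csv', 'w', newline='') as outfile:
    w = csv.writer(outfile)
    # '\r' (the Ctrl-M found above) inside a quoted field shows up as a
    # line break within a single Excel cell; '\n' generally works as well
    w.writerow(['stringa', 'multiline\rstring'])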

You need to pass extra double quotes (") at the start and end of the string. Separate the different lines of the cell using the newline character (\n).
e.g "line1\nline2\nline3"
f = open("filename.csv", "w")
f.write("\"line1\nline2\nline3\"")
f.close()
When opened in Excel, the resulting CSV shows the three lines inside a single cell.
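For what it's worth, the csv module can add that quoting for you: csv.writer quotes any field that contains the delimiter, a quote character, or a newline. A small sketch (Python 3; the file name is only an example):
import csv

# Both fields are quoted automatically: the first contains newlines,
# the second contains commas
with open('filename.csv', 'w', newline='') as f:
    w = csv.writer(f)
    w.writerow(['line1\nline2\nline3', 'a value, with commas'])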


How to replace a particular character in a CSV file

I have a folder of CSV files (~100) and every file has an unknown character in it that looks like this: �. This unknown character is supposed to be a double quote ("). Because of this unknown character, I am not able to run my CSV to XLSX converter to convert my files to XLSX format.
I tried using the csv.read() function, but it does not work with the replace function, since csv.read() returns a reader object and replace does not work on that. How can I replace that character and write the replaced contents back to CSV so that I can run my CSV to XLSX converter?
Example:
Current file contents:
"hello�
Expected output after conversion:
"hello"
Try this:
import fileinput

with fileinput.FileInput("file.csv", inplace=True) as file:
    for line in file:
        print(line.replace('�', '"'), end='')
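Since the question mentions a folder of roughly 100 files, note that fileinput.FileInput also accepts a list of paths, so the same in-place replacement can be applied to every CSV in one pass. A sketch, assuming the files sit in the current directory:
import fileinput
import glob

# Rewrite every CSV in place, swapping the stray character for a double quote
with fileinput.FileInput(glob.glob('*.csv'), inplace=True) as files:
    for line in files:
        print(line.replace('�', '"'), end='')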
The sed command is designed for this kind of work. It finds and replaces characters in a file.
Use this in a terminal.
sed -i 's/old-word/new-word/g' filename.csv
Your old-word should be the unknown character and new-word the double quote.
I use this little function to deal with such problems.
The code is quite self-explanatory. It opens a file, reads it all (which may not work for files larger than your RAM), then rewrites it with a patched version.
def patch_file(file, original, patch):
    with open(file, 'r') as f:
        lines = f.readlines()
    with open(file, 'w') as f:
        for line in lines:
            f.write(line.replace(original, patch))

patch_file(file='yourCSVfile.txt', original='�', patch='"')

Saving data from Python to an Excel file in CSV UTF-8 file format

I have been trying to save data as an Excel file of type CSV UTF-8 (Comma delimited) (*.csv), which is different from the normal CSV (Comma delimited) (*.csv) file; it displays the Unicode text correctly when opened in Excel. I can save as that file type easily from Excel, but from Python I am only able to save it as a normal CSV. That does not cause loss of data, but when opened it shows text like "à¤à¤‰à¤Ÿà¤¾" instead of "एउटा".
If I copy the text from Notepad into the Excel file and then manually save the file as CSV UTF-8, the correct display is preserved. But doing so is time consuming, since all values appear on the same line in Notepad and I have to separate them in the Excel file.
So I just want to know how I can save data in Excel's CSV UTF-8 format using Python.
I have tried the following code, but it results in a normal CSV file.
import codecs
import unicodecsv as csv

input_text = codecs.open('input.txt', encoding='utf-8')
all_text = input_text.read()
text_list = all_text.split()
output_list = [['Words', 'Tags']]
for input_word in text_list:
    word_tag_list = [input_word, 'O']
    output_list.append(word_tag_list)
with codecs.open("output.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(output_list)
You need to indicate to Excel that this is a UTF-8 file. Unfortunately the only way to do this is by prepending a special byte sequence (a BOM) to the front of the file. Python will do this automatically if you use a special encoding.
with codecs.open("output.csv", "w", encoding="utf_8_sig") as f:
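On Python 3 you do not need codecs at all: the built-in open accepts the encoding directly, and the standard csv module works on the resulting text file. A minimal sketch reusing output_list from the question:
import csv

# 'utf-8-sig' prepends the BOM that tells Excel the file is UTF-8
with open('output.csv', 'w', newline='', encoding='utf-8-sig') as f:
    writer = csv.writer(f)
    writer.writerows(output_list)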
I have found the answer. The encoding='utf_8_sig' should be given to the csv.writer method to write the file as CSV UTF-8. The previous code can be written as:
with open("output.csv", "wb") as f:
    writer = csv.writer(f, dialect='excel', encoding='utf_8_sig')
    writer.writerows(output_list)
However, there was a problem when the data had a comma at the end, e.g. "भने,". In this case I didn't need the comma, so I removed it with the following code inside the for loop.
import re

if re.search(r'.,$', input_word):
    input_word = re.sub(',$', '', input_word)
Finally I was able to obtain the desired output, with the Unicode characters correctly displayed and the extra comma at the end of the data removed. So, if anyone knows how to ignore a comma at the end of data in an Excel file, you can comment here. Thanks.
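As a side note on that trailing comma: since it always sits at the very end of the word, str.rstrip can drop it without a regular expression. A tiny sketch (it removes any number of trailing commas, which is fine for this data):
# strip trailing commas from the word before writing it out
input_word = 'भने,'
print(input_word.rstrip(','))   # prints the word without the trailing comma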

Add special key as a delimiter for CSV file?

In my CSV file the data is separated by a special character. When I view it in Notepad++ it shows as 'SOH'.
ATT_AI16601A.PV01-Apr-2014 05:02:192.94752310FalseFalseFalse
ATT_AI16601A.PV[]01-Apr-2014 05:02:19[]2.947523[]1[]0[]False[]False[]False[]
It is present in the data but not visible. I have put markers in the second string where those characters are.
My point is that I need to read that data in Python delimited by these markers. How can I use these special characters as delimiters while reading data?
You can use the Python csv module by specifying your delimiter like this.
import csv
reader = csv.reader(file, delimiter='what ever is your delimiter')
In your case
reader = csv.reader(file, delimiter='\x01')
This is because SOH is an ASCII control character with a code point of 1.
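Putting it together, a minimal sketch that reads such a file (the file name data.csv is just a placeholder):
import csv

# '\x01' is the SOH (Start of Heading) control character used as the delimiter
with open('data.csv', newline='') as f:
    reader = csv.reader(f, delimiter='\x01')
    for row in reader:
        print(row)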

Read csv file containing escape characters in Python [duplicate]

This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed last month.
Hi and many thanks in advance!
I'm working on a Python script handling UTF-8 strings and replacing specific characters. Therefore I use msgText.replace(thePair[0], thePair[1]) while looping through a list which defines Unicode characters and their desired replacements, as shown below.
theList = [
    ('\U0001F601', '1f601.png'),
    ('\U0001F602', '1f602.png'), ...
]
Up to here everything works fine. But now consider a csv file which contains the characters to be replaced, as shown below.
\U0001F601;1f601.png
\U0001F602;1f602.png
...
I failed miserably at reading the CSV data into the list due to the escape characters. I read the data using the csv module like this:
import csv

with open('Data.csv', newline='', encoding='utf-8-sig') as theCSV:
    theList = [tuple(line) for line in csv.reader(theCSV, delimiter=';')]
This results in pairs like ('\\U0001F601', '1f601.png') in which the escape sequences are not interpreted (note the double backslash). I tried several methods of modifying the string and other methods of reading the CSV data, but I was not able to solve my problem.
How could I accomplish my goal to read csv data into pairs which contain escape characters?
I'm adding the solution for reading csv data containing escape characters for the sake of completeness. Consider a file Data.csv defining the replacement pattern:
\U0001F601;1f601.png
\U0001F602;1f602.png
Short version (using list comprehensions):
import csv

# define replacement list (short version)
with open('Data.csv', newline='', encoding='utf-8-sig') as csvFile:
    replList = [(line[0].encode().decode('unicode-escape'), line[1])
                for line in csv.reader(csvFile, delimiter=';') if line]
# the with statement closes the file, so no explicit close() is needed
Prolonged version (probably easier to understand):
import csv

# define replacement list (step by step)
replList = []
with open('Data.csv', newline='', encoding='utf-8-sig') as csvFile:
    for line in csv.reader(csvFile, delimiter=';'):
        if line:  # skip blank lines
            replList.append((line[0].encode().decode('unicode-escape'), line[1]))
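To see why the encode/decode round trip works: the CSV file contains the literal ten-character sequence \U0001F601 (a backslash followed by U and eight hex digits), and decoding it with the unicode-escape codec turns that sequence into the single emoji character. A quick check:
raw = '\\U0001F601'                        # what csv.reader hands back
decoded = raw.encode().decode('unicode-escape')
print(decoded == '\U0001F601')             # True: now it is the actual emoji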

CSV blank rows problem with Excel

I have a CSV file which contains rows from an SQLite3 database. I wrote the rows to the CSV file using Python.
When I open the CSV file with MS Excel, a blank row appears below every row, but the file looks fine in Notepad (without any blanks).
Does anyone know why this is happening and how I can fix it?
Edit: I used the strip() function for all the attributes before writing a row.
Thanks.
You're using open('file.csv', 'w'); try open('file.csv', 'wb') instead.
The Python csv module requires output files to be opened in binary mode.
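That advice applies to Python 2. On Python 3 the csv docs instead ask for text mode with newline='', which avoids the doubled line endings that Excel shows as blank rows. A minimal sketch:
import csv

# newline='' stops Python from turning csv's '\r\n' row endings into '\r\r\n'
with open('file.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['col1', 'col2'])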
The first thing that comes to mind (just an idea) is that you might have used "\r\n" as the row delimiter (which is shown as one line break in Notepad), but Excel expects only "\n" or only "\r", and so it interprets this as two line breaks.
