Writing Python CSV files - encoding

I open a .csv file and write another .csv file as output.
I specified encoding='utf-8' for both files.
When I read the input file into a dictionary, I have an accented character (ì) which I can see in the variables I use, but the "ì" comes out garbled when I write it to the output file.
I build the output line from some variables, like this:
output_line = [name, address, citizenship_flag]
citizenship_flag may be "sì" or "no".
In the output file the "ì" comes out garbled.
Where am I wrong?
Thanks.
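A minimal sketch of the read/write pattern described, with both files opened explicitly (the file names and column names are placeholders). The classic cause of an "ì" turning into a two-character sequence such as "Ã¬" is UTF-8 bytes being decoded with a legacy code page like cp1252 somewhere along the way, often by the program used to view the output rather than by the writing code, so it is worth checking how the output file is being inspected; writing with utf-8-sig adds a BOM so Excel recognises the encoding instead of guessing:

import csv

# read the input rows as UTF-8 (file and column names are placeholders)
with open("input.csv", encoding="utf-8", newline="") as f_in:
    rows = list(csv.DictReader(f_in))

# write the output as UTF-8 with a BOM so Excel does not guess a legacy code page
with open("output.csv", "w", encoding="utf-8-sig", newline="") as f_out:
    writer = csv.writer(f_out)
    for row in rows:
        # citizenship_flag may be "sì" or "no"
        output_line = [row["name"], row["address"], row["citizenship_flag"]]
        writer.writerow(output_line)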

How do I increase the default column width of a csv file so that when I open the file all of the text fits correctly?

I am trying to code a function where I grab data from my database, which already works correctly.
This is my code for the headers prior to adding the actual records:
import csv

with open('csv_template.csv', 'a') as template_file:
    # declares the variable template_writer ready for appending
    template_writer = csv.writer(template_file, delimiter=',')
    # appends the column names of the excel table prior to adding the actual physical data
    template_writer.writerow(['Arrangement_ID', 'Quantity', 'Cost'])
    # the with block closes the file after appending
This is my code for the records; it is contained in a while loop, which is the main reason the two pieces are kept separate.
with open('csv_template.csv', 'a') as template_file:
    # declares the variable template_writer ready for appending
    template_writer = csv.writer(template_file, delimiter=',')
    # appends the values fetched by the sql statement within the while loop to the csv file
    template_writer.writerow([transactionWordData[0], transactionWordData[1], transactionWordData[2]])
    # the with block closes the file after appending
Once I have this data ready, I open the file in Excel and would like it to be in a format I can print immediately; however, when I print, the column width of the Excel cells is too small and the text gets cut off.
I have tried altering the default column width within Excel in the hope that it would keep that format permanently, but every time I re-open the csv file in Excel it resets back to the default column width.
Here is my code for opening the csv file in Excel using Python; the commented line is the actual code I want to use once I can format the spreadsheet ready for printing.
import os

# finds the os path of the csv file depending where it is in the file directories
file_path = os.path.abspath("csv_template.csv")
# opens the csv file in excel ready to print
os.startfile(file_path)
# os.startfile(file_path, 'print')
If anyone has any solutions to this or ideas please let me know.
Unfortunately I don't think this is possible for CSV file formats, since they are just plaintext comma separated values and don't support formatting.
I have tried altering the default column width within excel but every time that I re-open the csv file in excel it seems to reset back to the default column width.
If you save the file to an excel format once you have edited it that should solve this problem.
Alternatively, instead of using the csv library you could use xlsxwriter, which does allow you to set the width of the columns in your code.
See https://xlsxwriter.readthedocs.io and https://xlsxwriter.readthedocs.io/worksheet.html#worksheet-set-column.
Hope this helps!
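A minimal sketch of that approach, assuming the same three columns as above (the output file name, the sample data row and the width of 20 characters are arbitrary); set_column takes a first column, a last column and a width in characters:

import xlsxwriter

workbook = xlsxwriter.Workbook('csv_template.xlsx')
worksheet = workbook.add_worksheet()

# make columns A to C 20 characters wide so the text is not cut off when printing
worksheet.set_column(0, 2, 20)

# header row, then one data row per record (sample values stand in for the
# transactionWordData fetched inside the while loop)
worksheet.write_row(0, 0, ['Arrangement_ID', 'Quantity', 'Cost'])
worksheet.write_row(1, 0, ['A001', 3, 12.50])

workbook.close()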
The csv format is nothing more than a text file whose lines follow a given pattern: a fixed number of fields (your data) delimited by commas. In contrast, an .xlsx file is a binary file that also contains formatting information. Therefore you may want to write to an Excel file instead, for example using the pandas library, as sketched below.
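A minimal sketch of that route, assuming pandas and xlsxwriter are installed (column names, data and the width of 20 are placeholders). pandas itself does not set column widths, so the sketch reaches through to the underlying xlsxwriter worksheet for that:

import pandas as pd

df = pd.DataFrame({'Arrangement_ID': ['A001'], 'Quantity': [3], 'Cost': [12.50]})

with pd.ExcelWriter('csv_template.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)
    # widen columns A to C so the printout is not cut off
    writer.sheets['Sheet1'].set_column(0, 2, 20)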
You can add spaces, since the values are strings, so the width adjusts automatically; do it like this:
template_writer.writerow(['Arrangement_ID ','Quantity ','Cost '])

How do I convert an Excel or similar (not just a plain text file) to bytes or binary then back again

I have written an encryption program that I want to use to encrypt Excel files and then decrypt them to output a final Excel file. I have decided to read the whole file and then encrypt it, as that would be easier than reading each cell from the Excel file.
So far I have been able to read the file and convert it to bytes but cannot figure out how to turn it back into an Excel file.
from tkinter import Tk, filedialog

root = Tk()
root.withdraw()
file = filedialog.askopenfile(initialdir="C:")  ## Creates a file dialog to pick a file
tempFile = open(file.name, encoding="Latin-1")  ## Encodes it with Latin-1 so all 256 bytes can be read
file.close()
data = tempFile.read()
tempFile.close()
newFile = open("testfile", "w", encoding="cp1252")  ## Creates a new file with cp1252 encoding as that is what Excel uses
newFile.write(data)
newFile.close()  ## Currently it just fills the Excel file with a whole bunch of random characters
Edit:
To be more concise, what I want to do is take the data from an Excel file with anything in it, encrypt it, decrypt it, and then write it back into a new Excel file with all formatting etc. intact. Is there a way to do the reading and writing of the whole file?
I have found the answer to my problem. @furas, your comment is what I needed to do:
from tkinter import filedialog

selectedFile = filedialog.askopenfile(initialdir="C:")
tempFile = open(selectedFile.name, mode="rb")
data = tempFile.read()
tempFile.close()
newFile = open("test.xlsx", mode="wb")
newFile.write(data)
newFile.close()
This creates the exact same file as the original one.
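A minimal sketch of the full round trip under the same read-bytes/write-bytes approach, using Fernet from the third-party cryptography package as a stand-in for the asker's own encryption routine (the file names are placeholders). Because the workbook is never decoded as text, its formatting survives intact:

from cryptography.fernet import Fernet

key = Fernet.generate_key()   # keep this key in order to decrypt later
fernet = Fernet(key)

# read the whole workbook as bytes and encrypt it
with open("original.xlsx", "rb") as f:
    encrypted = fernet.encrypt(f.read())
with open("encrypted.bin", "wb") as f:
    f.write(encrypted)

# decrypt and write the bytes back out as a new workbook
with open("encrypted.bin", "rb") as f:
    decrypted = fernet.decrypt(f.read())
with open("restored.xlsx", "wb") as f:
    f.write(decrypted)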

Reading from file. Input character \ becomes \\ in file output. Best approach?

I have a Python script that is reading files as input which contains \ characters, alters some of the content, and writes it to another output file. A simplified version of this script looks like this:
import sys

inputFile = open(sys.argv[1], 'r')
input = inputFile.read()
outputFile = open(sys.argv[2], 'w')
outputFile.write(input.upper())
Given, this content in input file:
My name\'s Bob
the output is:
MY NAME\\'S BOB
instead of:
MY NAME'S BOB
I suspect this is because of the input file's format, because direct string input yields the desired result (e.g. outputFile.write(('My name\'s Bob').upper())). It does not happen with all files (e.g. .txt files work, but .js files don't). Because I am reading different kinds of files as text files, the ideal solution should not require the input file to be of a certain type, so is there a better way to read files? This makes me wonder whether I should use different read/write functions.
Thanks in advance for all help
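A minimal sketch of one way to check what is actually stored in the two files, assuming the doubled backslash only shows up in how the strings are displayed (repr() escapes backslashes for display, print() does not); the file names are placeholders:

# read both files back as plain text and compare what is really stored
with open("input.js") as f:
    raw_in = f.read()
with open("output.js") as f:
    raw_out = f.read()

print(raw_in)        # shows the text as stored: My name\'s Bob
print(repr(raw_in))  # repr() doubles the backslash for display only
print(raw_out == raw_in.upper())   # True if writing did not change the backslashes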

Printing a txt file inside python shell

I have an assignment for class that has me transfer txt data from Excel and process it in Python. But every time I run it, only hex is displayed. I was wondering how to have the data displayed as ASCII in the shell. This is the code I have so far. Is it possible to print it out as ASCII in the shell?
infile = open("data.txt", 'r')
listName = [line.rstrip() for line in infile]
print (listName)
infile.close()
The reason it's not working is that you are opening an Excel file, which is in a special binary format, not a plain text file.
You can test this yourself by opening the file in a text editor like Notepad; you'll see the contents aren't plain text.
To open the file and read its contents in Python you will need to do one of these two things:
Open the file in Excel, then save it as a text file (or a comma separated file, CSV). Keep in mind that if you do this, you can only save one sheet at a time.
Use a module like pyexcel, which will allow you to read the Excel file correctly in Python (see the sketch below for a similar approach).
Just opening the file as plain text (or changing its extension) doesn't convert it.
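A minimal sketch of the second option, using openpyxl (a similar, widely used reader) in place of pyexcel; it assumes the workbook has been saved as data.xlsx:

from openpyxl import load_workbook

wb = load_workbook("data.xlsx")
ws = wb.active

# print each row of the first sheet as ordinary text in the shell
for row in ws.iter_rows(values_only=True):
    print(row)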

reading from multiple txt files - strip data and save to xls

I'm very new to Python. So far I have written the code below, which searches for text files in a folder, reads all the lines from each one, opens an Excel file and saves the read lines in it. (I'm still unsure whether this does it for all the text files one by one.)
Having run this, I only see one file's text data read and saved into the Excel file (first column). Or it could be that it is overwriting the data from multiple text files into the same column until it finishes.
Could anyone point me in the right direction on how to get it to write the stripped data to the next available column in Excel for each text file?
import os
import glob

list_of_files = glob.glob('./*.txt')
for fileName in list_of_files:
    fin = open(fileName, "r")
    data_list = fin.readlines()
    fin.close()  # closes file
    del data_list[0:17]
    del data_list[1:27]  # [*:*]
    fout = open("stripD.xls", "w")
    fout.writelines(data_list)
    fout.flush()
    fout.close()
This can be condensed into:
import glob

list_of_files = glob.glob('./*.txt')
with open("stripD.xls", "w") as fout:
    for fileName in list_of_files:
        data_list = open(fileName, "r").readlines()
        fout.write(data_list[17])
        fout.writelines(data_list[44:])
Are you aware that writelines() doesn't introduce newlines? readlines() keeps the newlines when reading, so the elements of data_list that writelines() puts in the file already contain them, but writelines() itself doesn't add any (a tiny illustration is below).
You may like to check this, and for simple needs also csv.
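A tiny illustration of that newline behaviour (the list contents and file names are made up):

lines = ["a", "b", "c"]

with open("no_newlines.txt", "w") as f:
    f.writelines(lines)   # file contains: abc

with open("with_newlines.txt", "w") as f:
    f.writelines(line + "\n" for line in lines)   # one item per line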
These lines are "interesting":
del data_list[0:17]
del data_list[1:27] # [*:*]
You are deleting as many of the first 17 lines of your input file as exist, keeping the 18th (if it exists), deleting another 26 (if they exist), and keeping any following lines. This is a very unusual procedure, and is not mentioned at all in your description of what you are trying to do.
Secondly, you are writing the output lines (if any) from each input file to the same output file. At the end of the script, the output file will contain data from only the last input file. Don't change your code to use append mode ... opening and closing the same file all the time just to append records is very wasteful, and only justified if you have a real need to make sure that the data is flushed to disk in case of a power or other failure. Open your output file once, before you start reading files, and close it once when you have finished with all the input files.
Thirdly, any old arbitrary text file doesn't become an "excel file" just because you have named it "something.xls". You should write it with the csv module and name it "something.csv". If you want more control over how Excel will interpret it, write an xls file using xlwt (a sketch follows this answer).
Fourthly, you mention "column" several times, but as you have not given any details about how your input lines are to be split into "columns", it is rather difficult to guess what you mean by "next available column". It is even possible to suspect that you are confusing columns and rows ... assuming fewer than 45 lines in each input file, the 18th ROW of the last input file will be all you will see in the output file.
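A minimal sketch tying those points together, assuming each input file's surviving lines (the 18th line and lines 45 onwards, mirroring the question's slicing) should go down its own column of a real .xls workbook written with xlwt:

import glob
import xlwt

workbook = xlwt.Workbook()
sheet = workbook.add_sheet('stripped')

# one column per input file, one row per surviving line
for col, fileName in enumerate(glob.glob('./*.txt')):
    with open(fileName, "r") as fin:
        data_list = fin.readlines()
    kept = data_list[17:18] + data_list[44:]   # the same lines the question keeps
    for row, line in enumerate(kept):
        sheet.write(row, col, line.rstrip("\n"))

workbook.save("stripD.xls")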
