Csv blank rows problem with Excel - python

I have a csv file which contains rows from a sqlite3 database. I wrote the rows to the csv file using python.
When I open the csv file with Ms Excel, a blank row appears below every row, but the file on notepad is fine(without any blanks).
Does anyone know why this is happenning and how I can fix it?
Edit: I used the strip() function for all the attributes before writing a row.
Thanks.

You're using open('file.csv', 'w')--try open('file.csv', 'wb').
The Python csv module requires output files be opened in binary mode.

the first that comes into my mind (just an idea) is that you might have used "\r\n" as row delimiter (which is shown as one linebrak in notepad) but excel expects to get only "\n" or only "\r" and so it interprets this as two line-breaks.

Related

Huge txt file with one column (text to columns in python)

I'm struggeling with one task that can save plenty of time. I'm new to Python so please don't kill me :)
I've got huge txt file with millions of records. I used to split them in MS Access, delimiter "|", filtered data so I can have about 400K records and then copied to Excel.
So basically file looks like:
What I would like to have:
I'm using Spyder so it would be great to see data in variable explorer so I can easily check and (after additional filters) export it to excel.
I use LibreOffice so I'm not 100% sure about Excel but if you change the .txt to .csv and try to open the file with Excel, it should allow to change the delimiter from a comma to '|' and then import it directly. That work with LibreOffice Calc anyway.
u have to split the file in lines then split the lines by the char l and map the data to a list o dicts.
with open ('filename') as file:
data = [{'id': line[0], 'fname':line[1]} for line in f.readlines()]
you have to fill in tve rest of the fields
Doing this with pandas will be much easier
Note: I am assuming that each entry is on a new line.
import pandas as pd
data = pd.read_csv("data.txt", delimiter='|')
# Do something here or let it be if you want to just convert text file to excel file
data.to_excel("data.xlsx")

How do I increase the default column width of a csv file so that when I open the file all of the text fits correctly?

I am trying to code a function where I grab data from my database, which already works correctly.
This is my code for the headers prior to adding the actual records:
with open('csv_template.csv', 'a') as template_file:
#declares the variable template_writer ready for appending
template_writer = csv.writer(template_file, delimiter=',')
#appends the column names of the excel table prior to adding the actual physical data
template_writer.writerow(['Arrangement_ID','Quantity','Cost'])
#closes the file after appending
template_file.close()
This is my code for the records which is contained in a while loop and is the main reason that the two scripts are kept separate.
with open('csv_template.csv', 'a') as template_file:
#declares the variable template_writer ready for appending
template_writer = csv.writer(template_file, delimiter=',')
#appends the data of the current fetched values of the sql statement within the while loop to the csv file
template_writer.writerow([transactionWordData[0],transactionWordData[1],transactionWordData[2]])
#closes the file after appending
template_file.close()
Now once I have got this data ready for excel, I run the file in excel and I would like it to be in a format where I can print immediately, however, when I do print the column width of the excel cells is too small and leads to it being cut off during printing.
I have tried altering the default column width within excel and hoping that it would keep that format permanently but that doesn't seem to be the case and every time that I re-open the csv file in excel it seems to reset completely back to the default column width.
Here is my code for opening the csv file in excel using python and the comment is the actual code I want to use when I can actually format the spreadsheet ready for printing.
#finds the os path of the csv file depending where it is in the file directories
file_path = os.path.abspath("csv_template.csv")
#opens the csv file in excel ready to print
os.startfile(file_path)
#os.startfile(file_path, 'print')
If anyone has any solutions to this or ideas please let me know.
Unfortunately I don't think this is possible for CSV file formats, since they are just plaintext comma separated values and don't support formatting.
I have tried altering the default column width within excel but every time that I re-open the csv file in excel it seems to reset back to the default column width.
If you save the file to an excel format once you have edited it that should solve this problem.
Alternatively, instead of using the csv library you could use xlsxwriter instead which does allow you to set the width of the columns in your code.
See https://xlsxwriter.readthedocs.io and https://xlsxwriter.readthedocs.io/worksheet.html#worksheet-set-column.
Hope this helps!
The csv format is nothing else than a text file, where the lines follow a given pattern, that is, a fixed number of fields (your data) delimited by comma. In contrast an .xlsx file is a binary file that contains specifications about the format. Therefore you may want write to an Excel file instead using the rich pandas library.
You can add space like as it is string so it will automatically adjust the width do it like this:
template_writer.writerow(['Arrangement_ID ','Quantity ','Cost '])

Python csv reader wrongly reading multiple rows simultaneously

I have a simple python program reading a 'coma' delimited csv file with nearly 2000 rows. The program simply reads each line and displays it. I have manually verified each line through vi editor. When reading the csv file the program in the middle of the file somehow ignores the '\n' and reads multiple lines simultaneously into a single row. Why is that happening? I am working on unix platform and have converted the file to unix format as well.
import csv
with open("fileName.csv") as f:
reader=csv.reader(f)
for row in reader:
print row
Any clues?
thanks

Excel fails to open Python-generated CSV files

I have many Python scripts that output CSV files. It is occasionally convenient to open these files in Excel. After installing OS X Mavericks, Excel no longer opens these files properly: Excel doesn't parse the files and it duplicates the rows of the file until it runs out of memory. Specifically, when Excel attempts to open the file, a prompt appears that reads: "File not loaded completely."
Example of code I'm using to generate the CSV files:
import csv
with open('csv_test.csv', 'wb') as f:
writer = csv.writer(f)
writer.writerow([1,2,3])
writer.writerow([4,5,6])
Even the simple file generated by the above code fails to load in Excel. However, if I open the CSV file in a text editor and copy/paste the text into Excel, parse it with text to columns, and then save as CSV from Excel, then I can reopen the CSV file in Excel without issue. Do I need to pass an additional parameter in my scripts to make Excel parse the CSV files the same way it used to? Or is there some setting I can change in OS X Mavericks or Excel? Thanks.
Maybe I had the similar problem, the error message "SYLK: File format is not valid" when open python autogenerated csv file. The solution is really funny. The first two characters must not be I and D in uppercase (ID). Also see "SYLK: File format is not valid" error message when you open file.
Possible solution1: use *.txt instead of *.csv. In this case Excel (at least, 2010) will show you an import data wizard where you can specify delimiters, character encoding, field types, etc.
UPD: Solution2:
The python "csv" module has a "dialect" feature. For example, the following modification of your code generates valid csv file for my environment (Python 2.7, Excel 2010, Windows7, locale with ";" list delimiters):
import csv
with open('csv_test2.csv', 'wb') as f:
csv.excel.delimiter=';'
writer = csv.writer(f, dialect=csv.excel)
writer.writerow([1,2,3])
writer.writerow([4,5,6])

Get the inputs from Excel and use those inputs in python script

How to get the inputs from excel and use those inputs in python.
Take a look at xlrd
This is the best reference I found for learning how to use it: http://www.dev-explorer.com/articles/excel-spreadsheets-and-python
Not sure if this is exactly what you're talking about, but:
If you have a very simple excel file (i.e. basically just one table filled with string-values, nothing fancy), and all you want to do is basic processing, then I'd suggest just converting it to a csv (comma-seperated value file). This can be done by "saving as..." in excel and selecting csv.
This is just a file with the same data as the excel, except represented by lines seperated with commas:
cell A:1, cell A:2, cell A:3
cell B:1, cell B:2, cell b:3
This is then very easy to parse using standard python functions (i.e., readlines to get each line of the file, then it's just a list that you can split on ",").
This if of course only helpful in some situations, like when you get a log from a program and want to quickly run a python script which handles it.
Note: As was pointed out in the comments, splitting the string on "," is actually not very good, since you run into all sorts of problems. Better to use the csv module (which another answer here teaches how to use).
import win32com
Excel=win32com.client.Dispatch("Excel.Application")
Excel.Workbooks.Open(file path)
Cells=Excel.ActiveWorkBook.ActiveSheet.Cells
Cells(row,column).Value=Input
Output=Cells(row,column).Value
If you can save as a csv file with headers:
Attrib1, Attrib2, Attrib3
value1.1, value1.2, value1.3
value2,1,...
Then I would highly recommend looking at built-in the csv module
With that you can do things like:
csvFile = csv.DictReader(open("csvFile.csv", "r"))
for row in csvFile:
print row['Attrib1'], row['Attrib2']

Categories