Save .dbf as .xls using python - python

Is there a simple way to save a .dbf file as a .xls file using python. I just want to open the .dbf file in python and then immediately do a 'save as .xls'. I'm trying to avoid looping through records in a dbf and writing to the excel table.
Something like:
fileobj = open(outFolder + "\\" + "TABLE.dbf")
fileobj.save(outFolder + "\\" + "TABLE.xls")
I've searched around the internet, but haven't found anything, so I thought I'd try a post.
Thanks,
Mike

If you were willing to accept using the Python csv module (which would involve looping through dbf records and writing csv), consider going straight to xls by using xlwt.
Alternatively, could your client could be shown how to open dbf files in Excel?

Related

Looping through an xlsx file

I have two questions regarding reading data from a file in .xlsx format.
Is it possible to convert an .xlsx file to .csv without actually opening the file in pandas or using xlrd? Because when I have to open many files this is quite slow and I was trying to speed it up.
Is it possible to use some sort of for loop to loop through decoded xlsx lines? I put an example below.
xlsx_file = 'some_file.xlsx'
with open(xlsx_file) as lines:
for line in lines:
<do something like I would do for a normal string>
I would like to know if this is possible without the well known xlrd module.

Excel fails to open Python-generated CSV files

I have many Python scripts that output CSV files. It is occasionally convenient to open these files in Excel. After installing OS X Mavericks, Excel no longer opens these files properly: Excel doesn't parse the files and it duplicates the rows of the file until it runs out of memory. Specifically, when Excel attempts to open the file, a prompt appears that reads: "File not loaded completely."
Example of code I'm using to generate the CSV files:
import csv
with open('csv_test.csv', 'wb') as f:
writer = csv.writer(f)
writer.writerow([1,2,3])
writer.writerow([4,5,6])
Even the simple file generated by the above code fails to load in Excel. However, if I open the CSV file in a text editor and copy/paste the text into Excel, parse it with text to columns, and then save as CSV from Excel, then I can reopen the CSV file in Excel without issue. Do I need to pass an additional parameter in my scripts to make Excel parse the CSV files the same way it used to? Or is there some setting I can change in OS X Mavericks or Excel? Thanks.
Maybe I had the similar problem, the error message "SYLK: File format is not valid" when open python autogenerated csv file. The solution is really funny. The first two characters must not be I and D in uppercase (ID). Also see "SYLK: File format is not valid" error message when you open file.
Possible solution1: use *.txt instead of *.csv. In this case Excel (at least, 2010) will show you an import data wizard where you can specify delimiters, character encoding, field types, etc.
UPD: Solution2:
The python "csv" module has a "dialect" feature. For example, the following modification of your code generates valid csv file for my environment (Python 2.7, Excel 2010, Windows7, locale with ";" list delimiters):
import csv
with open('csv_test2.csv', 'wb') as f:
csv.excel.delimiter=';'
writer = csv.writer(f, dialect=csv.excel)
writer.writerow([1,2,3])
writer.writerow([4,5,6])

How to rename files and change the file type as well?

I have a list with .dbf files which I want to change to .csv files. By hand I open them in excel and re-save them as .csv, but this takes too much time.
Now I made a script which changes the file name, but when I open it, it is still a .dbf file type (although it is called .csv). How can I rename the files in such a way that the file type also changes?
My script uses (the dbf and csv file name are listed in a seperate csv file):
IN = dbffile name
OUT = csvfile name
for output_line in lstRename:
shutil.copyfile(IN,OUT)
Changing the name of a file (and the extension is just part of the complete name) has absolutely no effect on the contents of the file. You need to somehow convert the contents from one format to the other.
Using my dbf module and python it is quite simple:
import dbf
IN = 'some_file.dbf'
OUT = 'new_name.csv'
dbf.Table(IN).export(filename=OUT)
This will create a .csv file that is actually in csv format.
If you have ever used VB or looked into VBA, you can write a simple excel script to open each file, save it as csv and then save it with a new name.
Use the macro recorder to record you once doing it yourself and then edit the resulting script.
I have now created a application that automates this. Its called xlsto (look for the xls.exe release file). It allows you to pick a folder and convert all xls files to csv (or any other type).
You need a converter
Search for dbf2csv in google.
It depends what you want to do. It seems like you want to convert files to other types. There are many converters out there, but a computer alone doesn't know every file type. For that you will need to download some software. If all you want to do is change the file extension,
(ex. .png, .doc, .wav) then you can set your computer to be able to change both the name and the extension. I hoped I helped in some way :)
descargar libreria dbfpy desde http://sourceforge.net/projects/dbfpy/?source=dlp
import csv,glob
from dbfpy import dbf
entrada = raw_input(" entresucarpetadbf ::")
lisDbf = glob.glob(entrada + "\\*dbf")
for db in lisDbf:
print db
try:
dbfFile = dbf.Dbf(open(db,'r'))
csvFile = csv.writer(open(db[:-3] + "csv", 'wb'))
headers = range(len(dbfFile.fieldNames))
allRows = []
for row in dbfFile:
rows = []
for num in headers:
rows.append(row[num])
allRows.append(rows)
csvFile.writerow(dbfFile.fieldNames)
for row in allRows:
print row
csvFile.writerow(row)
except Exception,e:
print e
It might be that the new file name is "xzy.csv.dbf". Usually in C# I put quotes in the filename. This forces the OS to change the filename. Try something like "Xzy.csv" in quotes.

Python - Convert CSV to DBF

I would like to convert a csv file to dbf using python (for use in geocoding which is why I need the dbf file) - I can easily do this in stat/transfer or other similar programs but I would like to do as part of my script rather than having to go to an outside program. There appears to be a lot of help questions/answers for converting DBF to CSV but I am not having any luck the other way around.
An answer using dbfpy is fine, I just haven't had luck figuring out exactly how to do it.
As an example of what I am looking for, here is some code I found online for converting dbf to csv:
import csv,arcgisscripting
from dbfpy import dbf
gp = arcgisscripting.create()
try:
inFile = gp.GetParameterAsText(0) #Input
outFile = gp.GetParameterAsText(1)#Output
dbfFile = dbf.Dbf(open(inFile,'r'))
csvFile = csv.writer(open(outFile, 'wb'))
headers = range(len(dbfFile.fieldNames))
allRows = []
for row in dbfFile:
rows = []
for num in headers:
rows.append(row[num])
allRows.append(rows)
csvFile.writerow(dbfFile.fieldNames)
for row in allRows:
print row
csvFile.writerow(row)
except:
print gp.getmessage()
It would be great to get something similar for going the other way around.
Thank you!
Duplicate question at: Convert .csv file into .dbf using Python?
Promising answer there (among others) is
Use the csv library to read your data from the csv file. The third-party dbf library can write a dbf file for you.
For example, you could try:
http://packages.python.org/dbf/
http://code.activestate.com/recipes/362715-dbf-reader-and-writer/
You could also just open the CSV file in OpenOffice or Excel and save it in dBase format.
I assume you want to create attribute files for the Esri Shapefile format or something like that. Keep in mind that DBF files usually use ancient character encodings like CP 850. This may be a problem if your geo data contains names in foreign languages. However, Esri may have specified a different encoding.
EDIT: just noted that you do not want to use external tools.

How do i extract specific lines of data from a huge Excel sheet using Python?

I need to get specific lines of data that have certain key words in them (names) and write them to another file. The starting file is a 1.5 GB Excel file. I can't just open it up and save it as a different format. How should I handle this using python?
I'm the author and maintainer of xlrd. Please edit your question to provide answers to the following questions. [Such stuff in SO comments is VERY hard to read]
How big is the file in MB? ["Huge" is not a useful answer]
What software created the file?
How much memory do you have on your computer?
Exactly what happens when you try to open the file using Excel? Please explain "I can open it partially".
Exactly what is the error message that you get when you try to open "C:\bigfile.xls" with your script using xlrd.open_workbook? Include the script that you ran, the full traceback, and the error message
What operating system, what version of Python, what version of xlrd?
Do you know how many worksheets there are in the file?
It sounds to me like you have a spreadsheet that was created using Excel 2007 and you have only Excel 2003.
Excel 2007 can create worksheets with 1,048,576 rows by 16,384 columns while Excel 2003 can only work with 65,536 rows by 256 columns. Hence the reason you can't open the entire worksheet in Excel.
If the workbook is just bigger in dimension then xlrd should work for reading the file, but if the file is actually bigger than the amount of memory you have in your computer (which I don't think is the case here since you can open the file with EditPad lite) then you would have to find an alternate method because xlrd reads the entire workbook into memory.
Assuming the first case:
import xlrd
wb_path = r'c:\bigfile.xls'
output_path = r'c:\output.txt'
wb = xlrd.open(wb_path)
ws = wb.sheets()[0] # assuming you want to work with the first sheet in the workbook
with open(output_path, 'w') as output_file:
for i in xrange(ws.nrows):
row = [cell.value for cell in ws.row(i)]
# ... replace the following if statement with your own conditions ...
if row[0] == u'interesting':
output_file.write('\t'.join(row) + '\r\n')
This will give you a tab-delimited output file that should open in Excel.
Edit:
Based on your answer to John Machin's question 5, make sure there is a file called 'bigfile.xls' located in the root of your C drive. If the file isn't there, change the wb_path to the correct location of the file you want to open.
I haven't used it, but xlrd looks like it does a good job reading Excel data.
Your problem is that you are using Excel 2003 .. You need to use a more recent version to be able to read this file. 2003 will not open files bigger than 1M rows.

Categories