Modifying, recalculating and and extracting results from Excel in Python - python

I would like to try and make a program which does the following, preferably in Python, but if it needs to be C# or other, that's OK.
Writes data to an excel spreadsheet
Makes Excel recalculate formulas etc. in the modified spreadsheet
Extracts the results back out
I have used things like openpyxl before, but this obviously can't do step 2. Is the a way of recalculating the spreadsheet without opening it in Excel? Any thoughts greatly appreciated.
Thanks,
Jack

You need some sort of UI automation with which you can control a UI application such as Excel. Excel probably exposes some COM interface that you should be able to use for what you need. Python has the PyWin32 library which you should install, after which you'll have the win32com module available.
See also:
Excel Python API
Automation Excel from Python
If you don't necessarily have to work with Excel specifically and just need to do spreadsheet using Python, you might want to look at http://manns.github.io/pyspread/.

you could you pandas for reading in the data, using python to recalculate and then write the new files.
For pandas it's something like:
#Import Excel file
xls = pd.ExcelFile('path_to_file' + '/' + 'file.xlsx')
xls.parse('nyc-condominium-dataset', index_col='property_id', na_values=['NA'])
so not difficult. Here the link to pandas.
Have fun!

Related

Excel api, but where to store the data

Morning,
I have dynamic data which is updated either daily, weekly or monthly in excel (this is the only api link). However, for use in python, is it better to keep the data stored in excel or transfer it to SQLite and access it from there?
Or is there a more efficient way of managing this process?
thanks
It depends on what you really need (see below, formulae). KISS (Keep it stupid simple) way is often the good one.
Some Python API like xlwt and xlrd can read and write Excel files :
http://www.python-excel.org/
But xlwt and xlrd can't evaluate formulae. If you need formulae, try openpyxl http://openpyxl.readthedocs.org/en/2.5/

Automatic input from text file in excel

My problem is rather simple : I have an Excel Sheet that does calculations and creates a graph based on the values of two cells in the sheet. I also have two lists of inputs in text files. I would like to loop through those text files, add the values to the excel sheet, refresh the sheet, and print the resulting graph to a pdf file or an excel file named something like 'input1 - input2.xlsx'.
My programming knowledge is limited, I am decent with Python and have looked into python libraries that work with excel such as openpyxl, however most of those don't seem to work for me for various reasons. Openpyxl deletes the graphs when opening an excel file; XlsxWriter can only write files, not read from them; and xlwings won't work for me.
Should I use python, which I'm familiar with, or would VBA work for this kind of problem? Have any of you ever done something of the sort?
Thanks in advance
As a more transitional approach to what m. wasowski wrote above, I'd suggest you do the following.
Install the pandas package, and see how easy it is to load a file using read_excel. Then, read 10 Minutes to Pandas, and manipulate the data.
You state that the Excel sheet is complex. In general, the more complex it is, this approach will eventually make it simpler. But you don't have to switch everything immediately. You can still do parts in Excel and parts in pandas.
I think you should consider win32Com for excel operation in python instead of Openpyxl,XlsxWriter.
you can read/write excel, create chart and format excel file using win32com without any limitation.
And creating chart you can consider matplotlib, in that after creating chart you can save it in pdf file also.

Recording unsaved Excel data with python or automsaving with vba

Using python, I want to continuously record some Excel calculations. The python/excel functions I have used will only read excel data from a saved spreadsheet. Is there a way to read unsaved data?
If not, is there a way to save excel and read the saved data periodically? I tried saving my sheets periodically (with a macro) but this is problematic if I am interacting with the spreadsheet during a save. Instead of saving my spreadsheet, excel saves a copy with a random name. If there is a way to remedy this (maybe some kind of vba error handling) that may solve the problem. Thanks.
Assuming you are using Excel on Windows and not OS X, you can try the win32com.client module, which you can get by installing the Python for Windows package:
http://sourceforge.net/projects/pywin32/
Unfortunately, the available documentation is pretty spotty at best... here's a start:
http://docs.activestate.com/activepython/3.3/pywin32/html/com/win32com/HTML/QuickStartClientCom.html
I should warn you that the COM API isn't very easy to use, so you are really better off sticking with VBA if your workflow requires Excel.
VBA error handling: http://www.cpearson.com/excel/errorhandling.htm

Excel Python API

Does anyone know of a way of accessing MS Excel from Python? Specifically I am looking to create new sheets and fill them with data, including formulae.
Preferably I would like to do this on Linux if possible, but can do it from in a VM if there is no other way.
xlwt and xlrd can read and write Excel files, without using Excel itself:
http://www.python-excel.org/
Long time after the original question, but last answer pushed it top of feed again. Others might benefit from my experience using python and excel.
I am using excel and python quite bit. Instead of using the xlrd, xlwt modules directly, I normally use pandas. I think pandas uses these modules as imports, but i find it much easier using the pandas provided framework to create and read the spreadsheets. Pandas's Dataframe structure is very "spreadsheet-like" and makes life a lot easier in my opinion.
The other option that I use (not in direct answer to your problem) is DataNitro. It allows you to use python directly within excel. Different use case, but you would use it where you would normally have to write VBA code in Excel.
there is Python library to read/write Excel 2007 xlsx/xlsm files http://pythonhosted.org/openpyxl/
I wrote python class that allows working with Excel via COM interface in Windows http://sourceforge.net/projects/excelcomforpython/
The class uses win32com to interact with Excel. You can use class directly or use it as example. A lot of options implemented like array formulas, conditional formatting, charts etc.
It's surely possible through the Excel object model via COM: just use win32com modules for Python. Can't remember more but I once controlled the Media Player through COM from Python. It was piece of cake.
Its actually very simple. You can actually run anything from any program. Just see a way to reach command prompt from that program. In case of Excel, create a user defined function by pressing Alt+F11 and paste the following code.
Function call_cmd()
Shell "CMD /C Notepad", vbNormalFocus
End Function
Now press ctrl+s and go back to Excel, select a cell and run the function =call_cmd(). Here I ran Notepad. In the same way, you can see where python.exe is installed and run it. If you want to pass any inputs to python, then save the cells as file in local directory as csv file and read them in python using os.system().

Getting data from an Excel sheet

How do I load data from an Excel sheet into my Django application? I'm using database PosgreSQL as the database.
I want to do this programmatically. A client wants to load two different lists onto the website weekly and they don't want to do it in the admin section, they just want the lists loaded from an Excel sheet. Please help because I'm kind of new here.
Have a look at the xlrd package, which allows you to read Excel files in Python. Once you've read the data you can do whatever you want with it, including saving it to the database.
For a basic usage example, look at http://scienceoss.com/read-excel-files-from-python/
Use django-batchimport http://code.google.com/p/django-batchimport/ It provides a very simple way to upload data in Excel sheets to your Django models. I have used it in a couple of projects. It can be integrated very easily into your existing Django project.
Read the documentation on the project page to know how to use it.
It is built on XLRD.
Have a look at the presentation "Excel & Python" that Chris Withers gave at PyCon US:
"This lightning talk explains that you don't need to use COM or be on Windows to read and write native Excel files."
http://www.simplistix.co.uk/presentations/python_excel_09/excel-lightning.pdf
Programatically or manually? If manualy then just save the excel as a CSV (with csv or txt extension) and import into Postgresql using
copy the_data from '/path/to/csv/MYFILE.txt' DELIMITERS ',' CSV;
As I remember of this.
The best way is to save this sheet as plain text ( CSV or something )
And then load with some custom SQL script.
http://www.postgresql.org/docs/8.3/static/populate.html
Or have a look at SQLAlchemy if you're going to write some kind of script to help you with that.(http://www.sqlalchemy.org/)
If you want to use COM to interface excel (i.e. you are running on a Windows machine), see "Migrating Excel data to SQLite" - http://www.saltycrane.com/blog/2007/11/migrating-excel-to-sqlite-using-python/
I built django-batchimport on top of xlrd which is AMAZING. The only issues I had were with getting data into Django. Had nothing to do with any limitations of xlrd. It rocks. John's work is incredible.
Note that I've actually done some update work to django-batchimport and just released. Take a look: http://code.google.com/p/django-batchimport/
Just started using XLRD and it looks very easy and simple to use.
Beware that it does not support Excel 2007 yet, so keep in mind to save your excel at 2003 format.

Categories