Excel api, but where to store the data - python

Morning,
I have dynamic data which is updated either daily, weekly or monthly in excel (this is the only api link). However, for use in python, is it better to keep the data stored in excel or transfer it to SQLite and access it from there?
Or is there a more efficient way of managing this process?
thanks

It depends on what you really need (see below, formulae). KISS (Keep it stupid simple) way is often the good one.
Some Python API like xlwt and xlrd can read and write Excel files :
http://www.python-excel.org/
But xlwt and xlrd can't evaluate formulae. If you need formulae, try openpyxl http://openpyxl.readthedocs.org/en/2.5/

Related

Is it possible to use Python XlsxWriter to write a stock data type to a cell?

I have an interest in utilizing the stock quote feature of excel while also creating the file using the xlsxwriter libray in Python. I am familiar with how to write and format text using xlsxwriter but I do not see any option to create a file with certain cells already set to have a stock data type. To be clear, the link from microsoft below basically summarizes the manual process I'm looking to have taken care of in the excel sheet before an actual user ever opens up the file.
https://support.microsoft.com/en-us/office/get-a-stock-quote-e5af3212-e024-4d4c-bea0-623cf07fbc54
I am open to other python based solutions to this issue if the general consensus is that xlsxwriter doesn't support this feature. I really appreciate any advice here.
I am the author of XlsxWriter. I just looked into this and these aren't regular Excel formulas. They have a lot of of additional metadata and richdata helper files associated with them and even the company names aren't standard string types. So unfortunately these aren't, and probably won't be, supported.

Modifying, recalculating and and extracting results from Excel in Python

I would like to try and make a program which does the following, preferably in Python, but if it needs to be C# or other, that's OK.
Writes data to an excel spreadsheet
Makes Excel recalculate formulas etc. in the modified spreadsheet
Extracts the results back out
I have used things like openpyxl before, but this obviously can't do step 2. Is the a way of recalculating the spreadsheet without opening it in Excel? Any thoughts greatly appreciated.
Thanks,
Jack
You need some sort of UI automation with which you can control a UI application such as Excel. Excel probably exposes some COM interface that you should be able to use for what you need. Python has the PyWin32 library which you should install, after which you'll have the win32com module available.
See also:
Excel Python API
Automation Excel from Python
If you don't necessarily have to work with Excel specifically and just need to do spreadsheet using Python, you might want to look at http://manns.github.io/pyspread/.
you could you pandas for reading in the data, using python to recalculate and then write the new files.
For pandas it's something like:
#Import Excel file
xls = pd.ExcelFile('path_to_file' + '/' + 'file.xlsx')
xls.parse('nyc-condominium-dataset', index_col='property_id', na_values=['NA'])
so not difficult. Here the link to pandas.
Have fun!

Python - Reading a spreadsheet

What I need to know is, can I get Python to read a spreadsheet (preferably Microsoft Excel), then parse the information and input it into an equation?
It's for a horse-racing program, where the information for several horses will be in one excel spreadsheet, in different rows or columns. I need to know if I can run a calculation for each of those horses separately and then calculate a score for the given horse.
My suggestion is:
Save the Excel file as a csv comma separated value file, which is a plain text format and much easier to work with.
Use Python's built-in csv module to work with the data in csv format.
You can work with Excel files directly in Python (Excel 2003 format supported via the third party modules xlwt, xlrd) but this is much harder than working with CSV.
OpenPyXL ("A Python library to read/write Excel 2007 xlsx/xlsm files") has a very nice and Pythonic API.
Use xlrd package. It's on PyPI, so you can just easy_install xlrd
You can export the spreadsheet as a .csv and read it in as a text file, then process it. I have a niggling feeling there might even a CSV parsing python library.
AFAIK there isn't a .xls parser, although I might be wrong.
EDIT: I was wrong: http://www.python-excel.org/

Merging .xlsx file with Python

Two-headed question here guys,
First, I've been trying to do some searching for a way to read .xlsx files in python. Does xlrd read .xlsx files now? If not, what's the recommended way to read/write to such a file?
Second, I have two files with similar information. One primary field with scoping subfields (like coordinates(the primary field) -> city -> state -> country). In the older file, the information is given an ID number while the newer file (with records deleted/added) does not have these ID's. In python, I'd 1) open the two files 2) check the primary field of the older file against the primary field of the newer file and merge their information to a new file if they match. Given that its not too big of a file, I don't mind the O(n^2) complexity. My question is this: is there a well-defined way to do this in VBA or excel? Everything I think of using excel's library seems too slow and I'm not excellent with VBA.
I frequently access excel files through python and xlrd, python and the Excel COM object. For this job, xlrd won't work because it does not support the xlsx format. But no matter, both approaches are overkill for what you are looking for. Simple Excel formulas will deliver what you want, specifically VLOOKUP.
VLOOKUP "looks for a value in the lefmost column of a table, and then returns a value in the same row from the column you specify".
Some advice on VLOOKUP, First, if you want to match on multiple cells, create a "key" cell which concatenates the cells you are interested in (in both workbooks). Second, make sure to set the last argument to VLOOKUP as FALSE because you will only want exact matches.
Regarding performance, excel formulas are often very fast.
Read the help file on VLOOKUP and ask further questions here.
Late edit (from Mark Baker's answer): There is now a python solution for xlsx. Openpyxl was created this year by Eric Gazoni to read and write Excel's xlsx format.
I only heard about this project this morning, so I've not had an opportunity to look at it, and have no idea what it's like; but take a look at Eric' Gazoni's openpyxl project. The code can be found on bitbucket. The driving force behind this was the ability to read/write xlsx files from Python.
Try http://www.python-excel.org/
My mistake - I missed the .xlsx detail.
I guess it's a question of what's easier: finding or writing a library that handles .xlsx format natively OR save all the Excel spreadsheets as .xls and get on with it with the libraries that merely handle the older format.
Adding on the answer of Steven Rubalski:
You might want to be able to have your lookup value in any other than the leftmost column. In those cases the Index and Match functions come in handy.
See: http://www.mrexcel.com/articles/excel-vlookup-index-match.php

Getting data from an Excel sheet

How do I load data from an Excel sheet into my Django application? I'm using database PosgreSQL as the database.
I want to do this programmatically. A client wants to load two different lists onto the website weekly and they don't want to do it in the admin section, they just want the lists loaded from an Excel sheet. Please help because I'm kind of new here.
Have a look at the xlrd package, which allows you to read Excel files in Python. Once you've read the data you can do whatever you want with it, including saving it to the database.
For a basic usage example, look at http://scienceoss.com/read-excel-files-from-python/
Use django-batchimport http://code.google.com/p/django-batchimport/ It provides a very simple way to upload data in Excel sheets to your Django models. I have used it in a couple of projects. It can be integrated very easily into your existing Django project.
Read the documentation on the project page to know how to use it.
It is built on XLRD.
Have a look at the presentation "Excel & Python" that Chris Withers gave at PyCon US:
"This lightning talk explains that you don't need to use COM or be on Windows to read and write native Excel files."
http://www.simplistix.co.uk/presentations/python_excel_09/excel-lightning.pdf
Programatically or manually? If manualy then just save the excel as a CSV (with csv or txt extension) and import into Postgresql using
copy the_data from '/path/to/csv/MYFILE.txt' DELIMITERS ',' CSV;
As I remember of this.
The best way is to save this sheet as plain text ( CSV or something )
And then load with some custom SQL script.
http://www.postgresql.org/docs/8.3/static/populate.html
Or have a look at SQLAlchemy if you're going to write some kind of script to help you with that.(http://www.sqlalchemy.org/)
If you want to use COM to interface excel (i.e. you are running on a Windows machine), see "Migrating Excel data to SQLite" - http://www.saltycrane.com/blog/2007/11/migrating-excel-to-sqlite-using-python/
I built django-batchimport on top of xlrd which is AMAZING. The only issues I had were with getting data into Django. Had nothing to do with any limitations of xlrd. It rocks. John's work is incredible.
Note that I've actually done some update work to django-batchimport and just released. Take a look: http://code.google.com/p/django-batchimport/
Just started using XLRD and it looks very easy and simple to use.
Beware that it does not support Excel 2007 yet, so keep in mind to save your excel at 2003 format.

Categories