openpyxl cell style not reporting correctly - python

Using the Python library openpyxl I am reading an XLSX file created in Excel 2007. It is empty apart from cell A1, which is coloured yellow and has the value "test" written in it. I can easily retrieve the value from that cell; however, when I attempt to determine the fill colour I get the following results:
this_sheet.cell("A1").style.fill.start_color
returns "FFFFFF"
this_sheet.cell("A1").style.fill.end_color
returns "FF0000"
Testing this on other, blank cells I get exactly the same results, and trying to retrieve the font style information keeps returning Calibri size 11 (our system default).
Am I going about this all wrong? Is there an alternative method I should be using?
Any help would be greatly appreciated.
Thanks!

Openpyxl is still in development, and styles are not yet completely implemented, thus you can encounter some issues here and there. Don't hesitate to open an issue on the project bug tracker if you want.
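For readers on a later openpyxl release, where style objects are exposed directly on the cell (cell.fill, cell.font) rather than through cell.style, the check might look like the sketch below. This is a rough illustration, not the library's prescribed method: fill_rgb and a1_fill are hypothetical helper names, and "example.xlsx" in the usage note is a placeholder.

```python
def fill_rgb(cell):
    """Return the ARGB hex string of a cell's pattern fill, or None.

    Works on any object shaped like an openpyxl cell
    (cell.fill.fill_type, cell.fill.fgColor.type, cell.fill.fgColor.rgb).
    """
    fill = cell.fill
    if fill is None or fill.fill_type is None:
        return None  # no fill applied to this cell
    color = fill.fgColor  # for a solid fill, the visible colour is fgColor
    if color.type != "rgb":
        return None  # theme or indexed colour: no literal RGB value stored
    return color.rgb


def a1_fill(path):
    """Hypothetical helper: report the fill of A1 on the active sheet."""
    from openpyxl import load_workbook  # third-party, imported lazily

    ws = load_workbook(path).active
    return fill_rgb(ws["A1"])
```

Calling a1_fill("example.xlsx") on a file like the one described should return an 8-digit ARGB string for the yellow cell. A None result means either no fill, or a colour stored as a theme/indexed reference rather than a literal RGB value.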

Related

Extract cells/Ranges information(col,row) from selected area in active Excel sheet with Python win32com

For my python script, there is one missing simple trick when I want to take data from Excel with Python win32com.
I just want to know how to get selected cells information, e.g. col/row for my python script. For example I could specify the range as shown below, but I simply cannot do the same thing to the selected/active cells.
ws.Range("B1:AM167").CopyPicture()
Can someone help me with this?
I am quite new to win32, so I still do not know how to find correct method/property etc...
Try Application.Selection.
The returned object type depends on the current selection (for
example, if a cell is selected, this property returns a Range object).
The Selection property returns Nothing if nothing is selected.
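A minimal sketch of reading row/column information from Application.Selection through win32com, assuming Excel is already running with a range of cells selected. describe_selection and current_excel_selection are hypothetical helper names; Address, Row, Column, Rows.Count and Columns.Count are properties of the Excel Range object.

```python
def describe_selection(selection):
    """Summarise an Excel Range-like object as a plain dict."""
    return {
        "address": selection.Address,         # e.g. "$B$1:$AM$167"
        "first_row": selection.Row,           # 1-based row of the top-left cell
        "first_col": selection.Column,        # 1-based column of the top-left cell
        "n_rows": selection.Rows.Count,
        "n_cols": selection.Columns.Count,
    }


def current_excel_selection():
    """Attach to a running Excel instance and describe its selection."""
    import win32com.client  # pywin32, Windows only; imported lazily

    excel = win32com.client.GetActiveObject("Excel.Application")
    # Assumes the selection is a Range; a selected chart or shape
    # would not have these properties.
    return describe_selection(excel.Selection)
```

Since the selection is itself a Range when cells are selected, excel.Selection.CopyPicture() should work the same way as the ws.Range("B1:AM167").CopyPicture() call in the question.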

Python 3 and Excel, Finding complex module to use

I've been looking for ages to find a suitable module to interact with Excel, which needs to do the following:
Check a column of cells for an "incorrect" value and change it
Check for empty cells, and if so, replace it
Check a cell value is consistent with the contents of another cell (for example, if called Datasheet, the code in another cell = DS) and if not, change it.
I've looked at openpyxl but I am running Python 3 and I can only seem to find it working for 2.
I've seen a few others but they seem to be mainly focusing on creating a new spreadsheet and simple writing/reading.
The Pandas library is amazing to work with Excel files. It can read Excel files easily and you then have access to a lot of tools. You can do all the operations you mentioned above. You can also save your result in the Excel format.
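As a rough sketch of the three checks from the question in pandas: the column names, the sample values and the name-to-code lookup below are all made up for illustration. In practice the frame would come from pd.read_excel(...) and be written back with df.to_excel(...).

```python
import pandas as pd

# Stand-in for pd.read_excel("workbook.xlsx"); the data is invented.
df = pd.DataFrame({
    "Name": ["Datasheet", "Datasheet", "Manual", "Manual"],
    "Code": ["DS", "XX", "MN", "MN"],
    "Status": ["ok", "incorect", None, "ok"],
})

# 1. Replace a known "incorrect" value in a column.
df["Status"] = df["Status"].replace("incorect", "ok")

# 2. Fill empty cells with a default value.
df["Status"] = df["Status"].fillna("unknown")

# 3. Make Code consistent with Name using a lookup table.
#    Rows whose Name is not in the table keep their existing Code.
expected = df["Name"].map({"Datasheet": "DS", "Manual": "MN"})
df["Code"] = expected.where(expected.notna(), df["Code"])

print(df)
# To save: df.to_excel("cleaned.xlsx", index=False)
```

Note that a round trip through pandas keeps the cell values but not the workbook's formatting, so this fits best when only the data matters.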

Can't find the active or selected cell in excel using Openpyxl

I want to use Python to find the address or coordinates of the currently active or selected cell in an Excel spreadsheet's currently active sheet.
So far all I've been able to do is the latter. Perhaps I'm just using the wrong words to search. However, this is the first time in two years of writing first VBA and now Python that I haven't been able to just search and find the answer. Even if it took me half a day.
I've crawled through the code at readthedocs (http://openpyxl.readthedocs.org/en/latest/_modules/index.html)
and looked through the openpyxl.cell.cell, openpyxl.worksheet.worksheet, openpyxl.worksheet.views code. The last seemed to have some promise and led me to writing the code below. Still, no joy, and I don't seem to be able to phrase my online searches to be able to pinpoint results that talk about finding the actual active/selected cell. Perhaps this is because openpyxl is really looking at the saved spreadsheet which might not include any data on the last cell to be selected.
I've tried it both in Python 3.4.3 and 2.7.11. Using openpyxl 2.4.0.
Here's the code that got me the closest to my goal. I was running it in Python3.
from openpyxl.worksheet.views import Selection
import openpyxl
wb = openpyxl.load_workbook('example.xlsx')
ws = wb.active
print(wb.get_sheet_names())
print(ws)
print(Selection.activeCell)
Which gives me the below.
['Sheet1', 'Sheet2', 'Sheet3']
<Worksheet "Sheet3">
Values must be of type <class 'str'>
I put in the first two prints just to prove to myself that I'm actually accessing the workbook/sheet.
If I change the last line to:
print(Selection.activeCellId)
I get:
Values must be of type <class 'int'>
I assume this is because these are only for writing not querying. I've toyed with the idea of writing a VBA macro and just running it from python. However, this code will be used with spreadsheets I don't control. By people who aren't necessarily capable of fixing any problems. I don't think I'm capable of writing something good enough to handle any problems that might crop up either.
Any help will be greatly appreciated.
It's difficult to see the purpose of an active cell for a library like openpyxl as it is effectively a GUI artefact. Nevertheless, because openpyxl works hard to implement the OOXML specification it should be possible to read the value stored by the previous application, or write it.
ws.views.sheetView[0].selection[0].activeCell
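A short sketch of how that lookup might be wrapped, assuming an openpyxl version that exposes ws.sheet_view (the first sheet view). active_cell and active_cell_of_file are hypothetical helper names. The value is whatever the last application saved into the file, not the live selection in a running Excel window.

```python
def active_cell(ws):
    """Return the stored active-cell address (e.g. "B2") or None.

    Takes the first selection entry; sheets with frozen or split panes
    can carry one entry per pane, which this sketch ignores.
    """
    selections = ws.sheet_view.selection
    return selections[0].activeCell if selections else None


def active_cell_of_file(path):
    """Hypothetical helper: (sheet title, active cell) of the active sheet."""
    from openpyxl import load_workbook  # third-party, imported lazily

    ws = load_workbook(path).active
    return ws.title, active_cell(ws)
```

The error text in the question comes from reading Selection.activeCell on the class itself; the attribute has to be read from a Selection instance that belongs to a loaded worksheet, as above.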
Consider the win32com library to replicate the Excel VBA property, ActiveCell. Openpyxl might have a limited method for this property, while win32com allows Python to fully utilize the COM libraries of Windows programs including the MS Office Suite (Excel, Word, Access, etc.). You can even manipulate files as a child process as if you were directly writing VBA.
import os
import win32com.client
# Open the Excel app and the spreadsheet. An absolute path is used because
# Excel does not resolve relative paths against the script's working directory.
xlApp = win32com.client.Dispatch("Excel.Application")
xlApp.Workbooks.Open(os.path.abspath('example.xlsx'))
xlApp.ActiveWorkbook.Worksheets('Sheet1').Activate()
# Print the address of the active cell, e.g. $A$1
print(xlApp.ActiveCell.Address)
# Close the workbook without saving and shut Excel down
xlApp.ActiveWorkbook.Close(False)
xlApp.Quit()
xlApp = None

openpyxl and stdev.p name error

I have a script to format a bunch of data and then push it into excel, where I can easily scrub the broken data, and do a bit more analysis.
As part of this I'm pushing quite a lot of data to excel, and want excel to do some of the legwork, so I'm putting a certain number of formulae into the sheet.
Most of these ("=AVERAGE(...)", "=A1+3", etc.) work absolutely fine, but when I add the standard deviation ("=STDEV.P(...)") I get a name error when I open the file in Excel 2013.
If I click in the cell within Excel and press Enter (i.e. don't change anything within the cell), the cell re-calculates without the name error, so I'm a bit confused.
Is there anything extra that needs to be done to get this to work?
Has anyone else had any experience of this?
Thanks,
Will
--
I've investigated further and this is the issue:
When saving the formula "STDEV.P" openpyxl saves it as:
"=_xludf.STDEV.P(...)"
which is correct for many formulae, but not this one.
The result should be:
"=_xlfn.STDEV.P(...)"
When I explicitly change the function to the latter, it works as expected.
I'll file a bug report, so hopefully this is done automatically in the future.
I suspect that there might be a subtle difference in what you think you need to write as the formula and what is actually required. openpyxl itself does nothing with the formula, not even check it. You can investigate this by comparing two files (one from openpyxl, one from Excel) with ostensibly the same formula. The difference might be simple – using "." for decimals and "," as a separator between values even if English isn't the language – or it could be that an additional feature is required: Microsoft has continued to extend the specification over the years.
Once you have some pointers please submit a bug report on the openpyxl issue tracker.
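Until that is resolved, the workaround described in the question (writing the "_xlfn." prefix explicitly) could be automated with a small helper like the one below. This is only a sketch: add_xlfn_prefix is a hypothetical name, and NEWER_FUNCTIONS is a deliberately short, assumed list of functions that Excel stores with the prefix, not a complete one.

```python
import re

# Assumed, incomplete set of function names that need the "_xlfn." prefix.
NEWER_FUNCTIONS = {"STDEV.P", "STDEV.S", "VAR.P", "VAR.S"}

# A function name directly followed by "(" and not already part of a
# longer or prefixed name (no word character or "." just before it).
_FUNC_CALL = re.compile(r"(?<![\w.])([A-Z][A-Z0-9.]*)(?=\()")


def add_xlfn_prefix(formula):
    """Prefix the listed functions in a formula string with "_xlfn."."""
    def repl(match):
        name = match.group(1)
        return "_xlfn." + name if name in NEWER_FUNCTIONS else name

    return _FUNC_CALL.sub(repl, formula)


print(add_xlfn_prefix("=STDEV.P(A1:A10)"))
```

With openpyxl this would be used as ws["B1"] = add_xlfn_prefix("=STDEV.P(A1:A10)"). The helper leaves already-prefixed formulae alone, but it does not parse string literals, so text inside quotes that looks like a function call would be rewritten too.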

Editing workbooks with rich text in openpyxl

I was wondering if openpyxl can read and/or write rich text in Excel. I am aware that this question was asked once before in 2012 (linked below), but I am not sure if this has changed.
As it stands load_workbook() seems to throw away rich text formatting.
As for a specific problem, I need to open, edit, and save a workbook where some cells have both superscripted and normal text in one cell. When I save the workbook, the format of the first character of the cell is applied to the rest of the cell.
Here is the link to the 2012 question:
How do I find the formatting for a subset of text in an Excel document cell
After looking around, it seems like rich text was implemented in openpyxl (based on the issues list on openpyxl's bitbucket):
https://bitbucket.org/openpyxl/openpyxl/issues?q=rich+text
But I am still unclear on how to use it (if I interpreted the issues list correctly at all). If it helps at all, I am not actually editing the contents of these cells; I simply need them not to lose their formatting on save.
Any thoughts would be greatly appreciated.
Thanks!
Best
Formatting below the level of the cell is not supported by openpyxl. To use it you'd have to implement your own code when writing as openpyxl just stores whatever strings it receives. Full read/write support would add a great deal of complexity.
