getting sheet names from openpyxl - python

I have a moderately large xlsx file (around 14 MB) and OpenOffice hangs trying to open it. I was trying to use openpyxl to read the content, following this tutorial. The code snippet is as follows:
from openpyxl import load_workbook
wb = load_workbook(filename = 'large_file.xlsx', use_iterators = True)
ws = wb.get_sheet_by_name(name = 'big_data')
The problem is, I don't know the sheet name, and Sheet1, Sheet2, etc. didn't work (they returned a NoneType object). I could not find any documentation explaining how to get the sheet names of an xlsx file using openpyxl. Can anyone help me?

Use the sheetnames property:
sheetnames
Returns the list of the names of worksheets in this workbook.
Names are returned in the worksheets order.
Type: list of strings
print (wb.sheetnames)
You can also get worksheet objects from wb.worksheets:
ws = wb.worksheets[0]
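A minimal sketch of the lookup described above, assuming a recent openpyxl where the old use_iterators flag is spelled read_only. The file and the sheet names "big_data" and "notes" are made up so the snippet can run on its own:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Build a small stand-in file (sheet names are invented for this demo).
src = Workbook()
src.active.title = "big_data"
src.create_sheet("notes")
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
src.save(path)

# read_only=True streams the file instead of loading it fully,
# which is what you want for a large workbook.
wb = load_workbook(filename=path, read_only=True)
names = wb.sheetnames      # list of sheet names, in worksheet order
first = wb.worksheets[0]   # worksheet object by position
by_name = wb[names[1]]     # worksheet object by name
print(names, first.title, by_name.title)
wb.close()                 # release the underlying file handle
```

With the names in hand you no longer need to guess at Sheet1, Sheet2 and so on.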

As a complement to the other answers: for a particular worksheet, you can also read its name from the title attribute (cf. the constructor parameters in the documentation):
ws.title

Python 3.x
To get the sheet names, use the sheetnames attribute:
g_sheet = wb.sheetnames
It returns a list:
for i in g_sheet:
    print(i)
Choose any name:
ws = wb[g_sheet[0]]
or use ws = wb[name] with any sheet name. For example, if the sheet is named paster:
ws = wb["paster"]

As mentioned in the earlier answer, you can get the list of sheet names with wb.sheetnames.
But if you know the sheet name, you can get that worksheet object with the following (deprecated in newer versions of openpyxl):
wb.get_sheet_by_name("YOUR_SHEET_NAME")
Another way of doing this, as mentioned in the earlier answer, is
wb['YOUR_SHEET_NAME']

for worksheet in workbook:
    print(worksheet.title)

Related

How to parse only specific sheets in a workbook using openpyxl - or how to ignore empty sheets?

Well, this is actually a workaround for my main problem which is to "ignore the empty sheets in my workbook". I have found a way to print only those sheet names that are not empty. So, now I want to pass these names to my workbook and access only those sheets instead of every single sheet in wb. (I need to use openpyxl for this.)
I'm trying the below but it doesn't work:
wb = openpyxl.load_workbook("source_file.xlsx", data_only=True)
for ws in wb.get_sheet_by_name(['Sheet1', 'Sheet2', 'Sheet4', 'Sheet5']):
    for row in ws:
        <do the necessary parsing operations here>
But this throws the below error:
"Worksheet ['Sheet1', 'Sheet2', 'Sheet4', 'Sheet5'] does not exist."
And if I pass the names separately, then it says:
TypeError: get_sheet_by_name() takes 2 positional arguments but 5 were given
Is there a way that I can tell it to access only specific sheets instead of every sheet in wb? Or better, is it possible to ignore all the empty sheets while parsing a .xlsx workbook?
You can store the sheet names in a list, and then iterate over that list to open each sheet:
import openpyxl
wb = openpyxl.load_workbook("source_file.xlsx", data_only=True)
sheets = ['Sheet1', 'Sheet2', 'Sheet4', 'Sheet5']
for sheet in sheets:
    for row in wb[sheet]:
        # <do the necessary parsing operations here>
        pass
Note that you can simply access a sheet from the workbook wb with wb[sheetname]. get_sheet_by_name() is deprecated. See the official documentation.
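For the second half of the question (skipping empty sheets), here is one possible sketch. The has_data helper is a hypothetical name, not part of openpyxl, and "empty" is assumed to mean "no cell holds a value". The workbook is built in memory so the snippet is self-contained:

```python
from openpyxl import Workbook

# In-memory stand-in for source_file.xlsx (contents invented for the demo).
wb = Workbook()
wb.active.title = "Sheet1"
wb["Sheet1"].append(["id", "value"])
wb.create_sheet("Sheet2")                    # deliberately left empty
wb.create_sheet("Sheet4").append([1, 2, 3])


def has_data(ws):
    """Hypothetical helper: True if any cell in the sheet holds a value."""
    return any(
        value is not None
        for row in ws.iter_rows(values_only=True)
        for value in row
    )


# Keep only the names of sheets that contain something, then parse those.
non_empty = [ws.title for ws in wb.worksheets if has_data(ws)]
for name in non_empty:
    for row in wb[name].iter_rows(values_only=True):
        print(name, row)
```

Scanning every cell is simple but not free on large sheets. It stops at the first value found, so populated sheets are cheap to detect.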

How to get an Excel sheet by its code name property with Python

I want to get an Excel sheet with Python. I can do this with the sheet's name, but I want to get it by its Code Name property. The following code uses the sheet's name:
from openpyxl import load_workbook
wb_donnees = load_workbook("Données.xlsm", read_only = True)
name_ws_1 = wb_donnees.sheetnames[0]
ws_1 = wb_donnees[name_ws_1]
But I want to get the sheet by its Code Name property. Is that possible?
Charlie Clark's answer works for me in read mode.
I'm not sure whether OP needed this, but when writing a new workbook, you cannot get the codename this way. Instead, you will need to specify it yourself, otherwise the function returns None, and sheets will only be codenamed 'Sheet1' etc at workbook creation.
from openpyxl import load_workbook
wb = load_workbook('input.xlsm')
wsx = wb.create_sheet('New Worksheet')
wsx.sheet_properties.codeName = 'wsx'
wb.save('output.xlsm')
The following should work, but only if the file is not opened in read-only mode:
from openpyxl import load_workbook
wb = load_workbook("Données.xlsm")
for n in wb.sheetnames:
    ws = wb[n]
    print(n, ws.sheet_properties.codeName)
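Building on the loop above, a lookup by code name could be sketched like this. sheet_by_code_name is a hypothetical helper, and the titles and code names are invented. The workbook is created in memory, so the code names are set by hand first:

```python
from openpyxl import Workbook

# In-memory workbook with hand-assigned code names (all names are made up).
wb = Workbook()
ws_a = wb.active
ws_a.title = "Data"
ws_a.sheet_properties.codeName = "wsData"
ws_b = wb.create_sheet("Summary")
ws_b.sheet_properties.codeName = "wsSummary"


def sheet_by_code_name(workbook, code_name):
    """Hypothetical helper: return the first sheet whose codeName matches."""
    for ws in workbook.worksheets:
        if ws.sheet_properties.codeName == code_name:
            return ws
    return None  # no sheet carries that code name


found = sheet_by_code_name(wb, "wsSummary")
print(found.title)
```

Returning None for an unknown code name is a design choice for the sketch. Raising KeyError would mirror wb[name] more closely.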

Using openpyxl to edit a spreadsheet. Cannot write to cells, "cell is read-only error"

I'm trying to modify an excel worksheet in Python with Openpyxl, but I keep getting an error that says the cells are read only. I know for a fact that they are not because I've been editing these spreadsheets manually for the past two months and have had no issues. Does anyone have an idea of what might be happening? I'm just trying to get my bearings on editing sheets with openpyxl so it is basic code.
rpt = file
workbook = openpyxl.load_workbook(filename = os.path.join('./Desktop/',rpt), use_iterators = True) # Tells which wb to open
wb=workbook
#worksheets = wb.get_sheet_names()
ws = wb.active
ws['A1'] = 42
Any help will be greatly appreciated. Thanks!
Thanks for the responses, to clarify, I'm not getting a workbook is read only error, it is specifically referring to the cells. I'm not sure what's causing this since I know that the workbook is not a read only workbook. Should I be using a different excel library for python? Is there a more robust excel library?
Thanks!
You are opening the workbook in read-only mode which is why the cells are read-only.
In case any other desperate soul is searching for a solution:
As stated in an answer here if you pass use_iterators = True the returned workbook will be read-only.
In newer versions of openpyxl the use_iterators was renamed to read_only, e.g.:
import os
import openpyxl
rpt = file
workbook = openpyxl.load_workbook(filename = os.path.join('./Desktop/',rpt), read_only = True) # Tells which wb to open
wb=workbook
#worksheets = wb.get_sheet_names()
ws = wb.active
ws['A1'] = 42
Will yield:
TypeError: 'ReadOnlyWorksheet' object does not support item assignment
So in order to do the modification you should use read_only = False.
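The difference between the two modes can be reproduced end to end with a throwaway file. This is only a sketch: the file name is invented, and the exact error text may differ between openpyxl versions, so only the exception type is relied on:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Create an empty workbook on disk to experiment with.
path = os.path.join(tempfile.mkdtemp(), "report.xlsx")
Workbook().save(path)

# 1) Read-only mode: assigning to a cell is rejected.
ro = load_workbook(path, read_only=True)
try:
    ro.active["A1"] = 42
    read_only_error = None
except TypeError as exc:
    read_only_error = str(exc)
ro.close()

# 2) Default mode (read_only=False): the same assignment works.
wb = load_workbook(path)
wb.active["A1"] = 42
wb.save(path)

# Reload to confirm the value was actually written to the file.
written = load_workbook(path).active["A1"].value
print(read_only_error)
print(written)
```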

How to access the real value of a cell using the openpyxl module for python

I am having real trouble with this, since the cell.value function returns the formula used for the cell, and I need to extract the result Excel provides after operating.
Thank you.
OK, I think I have found a way around it: apparently, to access cell.internal_value you first have to call iter_rows() on your worksheet, which gives you rows of "RawCell" objects.
for row in ws.iter_rows():
    for cell in row:
        print(cell.internal_value)
As Charlie Clark already suggested, you can set data_only to True when you load your workbook:
from openpyxl import load_workbook
wb = load_workbook("file.xlsx", data_only=True)
sh = wb["Sheet_name"]
print(sh["x10"].value)
From the code it looks like you're using the optimised reader: read_only=True. You can switch between extracting the formula and its result by using the data_only=True flag when opening the workbook.
internal_value was a private attribute that used to refer only to the (untyped) value that Excel uses, ie. numbers with an epoch in 1900 for dates as opposed to the Python form. It has been removed from the library since this question was first asked.
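One caveat is worth seeing in code: data_only=True returns the result that was cached in the file the last time a spreadsheet application calculated it, and openpyxl itself never evaluates formulas. The sketch below (file name invented) writes a formula with openpyxl and reads it back both ways. Because no application has calculated the sheet, the cached value is empty:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "formula_demo.xlsx")

# Write a formula with openpyxl only; nothing ever calculates it.
wb = Workbook()
ws = wb.active
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=A1+A2"
wb.save(path)

# Default load: the cell value is the formula text.
formula = load_workbook(path).active["A3"].value

# data_only=True: the cell value is the cached result, if one is stored.
cached = load_workbook(path, data_only=True).active["A3"].value

print(formula)
print(cached)
```

If the file is opened and saved in Excel first, the second read should return the computed number instead of None.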
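You can try the following code. Just provide the Excel file path and the location of the cell whose value you need, as a row number and a column number:
from openpyxl import load_workbook
dest_filename = 'excel_file_path'
row_number, column_number = 1, 1  # placeholders: position of the cell to read
# data_only=True returns the cached result instead of the formula
wb = load_workbook(dest_filename, data_only=True)
ws = wb.active
print(ws.cell(row=row_number, column=column_number).value)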
Try to use cell.internal_value instead.
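You can also iterate over the values directly with openpyxl. Note that values_only=True only makes iter_rows() yield values instead of cell objects; to get computed results rather than formulas, the workbook still has to be loaded with data_only=True:
for row in ws.iter_rows(values_only=True):
    for cell in row:
        print(cell)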

copy a worksheet in openpyxl

I'm using openpyxl (unfortunately I don't know how to find out my version number; I installed it about a month ago) on Windows with Python 2.7 and want to copy a worksheet that I generated using a template.xlsx file to a new workbook. The template has a single worksheet that I alter. I want to load it n times and copy each version as a new worksheet to another workbook. It could also be the same workbook if need be.
I found some hints here which took me here. The example doesn't work as it seems the add_sheet() method has been removed.
primary.add_sheet(copy.deepcopy(ws),ido+1)
AttributeError: 'Workbook' object has no attribute 'add_sheet'
Also couldn't find anything helpful in the API.
I'm afraid copying worksheets is not supported because it is far from easy to do.
I was struggling with this just as you are, but I found a way to solve it.
I think the best way to copy Excel worksheets using openpyxl and Python is:
from openpyxl import Workbook, load_workbook
# source workbook = wb1, destination workbook = wb2
wb1 = load_workbook('file.xlsx')
wb2 = Workbook()
ws1 = wb1.active
ws2 = wb2.active
for r in ws1.iter_rows():
    for c in r:
        ws2[c.coordinate] = c.value
wb2.save('file2.xlsx')
The outer for loop over iter_rows() walks the rows of the source sheet, and the inner loop walks the cells in each row ('A1', 'B1', 'A2', etc.). The .coordinate attribute of a cell (c) gives its column and row as a string such as 'A1'. Using that string as an index on the destination worksheet and assigning the source cell's value copies the cell. Note that this copies values only, not styles or other formatting.
You can also transform the data inside the loop before saving the file.
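For the same-workbook case the question allows, recent openpyxl releases also offer Workbook.copy_worksheet(), which was not available when the question was asked. As far as I know it only copies within a single workbook, so treat the following as a sketch for that case only. The sheet titles are invented:

```python
from openpyxl import Workbook

# In-memory stand-in for the template workbook.
wb = Workbook()
template = wb.active
template.title = "template"
template["A1"] = "header"
template["B2"] = 42

# Copy the template inside the same workbook, then alter the copy.
copy = wb.copy_worksheet(template)
copy.title = "version_1"
copy["B2"] = 43

print(wb.sheetnames)
print(template["B2"].value, copy["B2"].value)
```

Calling copy_worksheet() in a loop would give the n versions the question asks for, each under its own title.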
