Python - Preserve excel pivot tables - python

I am copying an existing file into a new workbook and then hiding some unnecessary tabs. One of the tabs that must stay visible contains a pivot table, but after the script finishes it appears as plain values instead of an actual pivot table. I need to "preserve" the pivot table.
Edit: Excel 2013 version
This is my code:
import xlsxwriter
import openpyxl as xl
import shutil
shutil.copy('C:/Prueba/GOOG.xlsm', 'C:/Prueba/GOOG-copia.xlsm')
workbook = xl.load_workbook('C:/Prueba/GOOG-copia.xlsm', keep_vba=True)
keep = ['Cacaca', 'Sheet1']  # Cacaca contains a pivot table that needs to be preserved
for i in workbook.sheetnames:
    if i in keep:
        pivot = workbook[i]._pivots[0]
        pivot.cache.refreshOnLoad = True
        workbook[i].sheet_state = 'visible'
    else:
        workbook[i].sheet_state = 'hidden'
workbook.save('C:/Prueba/GOOG-copia.xlsm')
workbook.close()
Error:
AttributeError: 'Worksheet' object has no attribute '_pivots'

According to the documentation, to preserve a pivot table you have to set at least one of its booleans, pivot.cache.refreshOnLoad, to True.
Because _pivots does not exist on a sheet unless it contains an actual pivot table, we can check for one and set the cache flag only when it is found:
for i in workbook.sheetnames:
    if i in keep:
        ws = workbook[i]
        if hasattr(ws, "_pivots"):
            pivot = ws._pivots[0]
            pivot.cache.refreshOnLoad = True
        workbook[i].sheet_state = 'visible'
    else:
        workbook[i].sheet_state = 'hidden'
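A slightly more defensive variant of the loop above, as a sketch only: it falls back to an empty list when a sheet has no _pivots attribute, and it flags every pivot table on a kept sheet rather than only the first one. The helper name show_only is made up for illustration and is not part of openpyxl; _pivots is a private attribute, so its behaviour may differ between openpyxl versions.

```python
def show_only(workbook, keep):
    """Hide every sheet not named in `keep`; flag pivots on kept sheets for refresh.

    `show_only` is a hypothetical helper, not an openpyxl function.
    Returns the number of pivot tables that were flagged.
    """
    flagged = 0
    for name in workbook.sheetnames:
        ws = workbook[name]
        if name in keep:
            # `_pivots` is private and may be missing on some sheet types,
            # so use an empty default instead of raising AttributeError.
            for pivot in getattr(ws, "_pivots", []):
                pivot.cache.refreshOnLoad = True
                flagged += 1
            ws.sheet_state = "visible"
        else:
            ws.sheet_state = "hidden"
    return flagged
```

With the workbook from the question this would be called as show_only(workbook, keep) right before workbook.save(...).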

Related

Append/Copy a dataframe with multiple columns headers to existing excel file

I'm trying to copy/append a dataframe with multiple column headers (similar to the one below) to an existing Excel sheet, starting from a particular cell, AA2.
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'sub1': [np.nan, 'E', np.nan, 'S'],
                    'sub2': [np.nan, 'D', np.nan, 'A']})
df2 = pd.DataFrame({'sub1': [np.nan, 'D', np.nan, 'S'],
                    'sub2': [np.nan, 'C', np.nan, 'S']})
df = pd.concat({'Af': df1, 'Dp': df2}, axis=1)
df
I'm thinking of exporting this dataframe to an Excel file starting at that particular cell and then using openpyxl to copy the data from one file to the other, column by column, but I'm not sure that is the correct approach. Any ideas?
(The Excel sheet I'm working with has formatting, so I can't load it into a dataframe and use merge.)
I've had success manipulating Excel files in the past with xlsxwriter (you will need to pip install this as a dependency first - although it does not need to be explicitly imported).
import io
import pandas as pd
# Load your file here instead
file_bytes = io.BytesIO()
with pd.ExcelWriter(file_bytes, engine = 'xlsxwriter') as writer:
    # Write a DataFrame to Excel into specific cells
    pd.DataFrame().to_excel(
        writer,
        sheet_name = 'test_sheet',
        startrow = 10, startcol = 5,
        index = False
    )
    # Note: You can repeat any of these operations within the context manager
    # and keep adding stuff...
    # Add some text to cells as well:
    writer.sheets['test_sheet'].write('A1', 'Your text goes here')
file_bytes.seek(0)
# Then write your bytes to a file...
# Overwriting it in your case?
Bonus:
You can add plots too: write them to a BytesIO object, call <your_image_bytes>.seek(0), and then pass that object to the insert_image() function.
    ... # still inside ExcelWriter context manager
    # (assumes `import matplotlib.pyplot as plt` at the top of the script)
    plot_bytes = io.BytesIO()
    # Create plot in matplotlib here
    plt.savefig(plot_bytes, format='png') # Instead of plt.show()
    plot_bytes.seek(0)
    writer.sheets['test_sheet'].insert_image(
        5, # Row start
        5, # Col start
        'some_image_name.png',
        options = {'image_data': plot_bytes}
    )
The full documentation is really helpful too:
https://xlsxwriter.readthedocs.io/working_with_pandas.html
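Since the question asks for a start at cell AA2 while to_excel() takes zero-based startrow and startcol, a small conversion helper may be handy. This is a minimal sketch in plain Python; the function name a1_to_zero_based is invented for illustration and does not come from pandas or xlsxwriter.

```python
import re


def a1_to_zero_based(ref):
    """Convert an A1-style reference such as 'AA2' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters are a base-26 number where A=1 ... Z=26
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


startrow, startcol = a1_to_zero_based("AA2")
print(startrow, startcol)  # 1 26
```

The two values can then be passed as startrow=startrow, startcol=startcol in the to_excel() call shown in the answer.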

Modify an Excel file with Pandas, with minimal change of the layout

I've already read Can Pandas read and modify a single Excel file worksheet (tab) without modifying the rest of the file? but here my question is specific to the layout mentioned hereafter.
How to open an Excel file with Pandas, do some modifications, and save it back:
(1) without removing that there is a Filter on the first row
(2) without modifying the "displayed column width" of the columns as displayed in Excel
(3) without removing the formulas which might be present on some cells
?
Here is what I tried, it's a short example (in reality I do more processing with Pandas):
import pandas as pd
df = pd.read_excel('in.xlsx')
df['AB'] = df['A'].astype(str) + ' ' + df['B'].astype(str) # create a new column from 2 others
del df['Date'] # delete columns
del df['Time']
df.to_excel('out.xlsx', index=False)
With this code, the Filter on the first row is removed and the displayed column widths are reset to a default, which is not very handy (because we would have to manually set the correct width for all columns again).
If you are using a machine that has Excel installed on it, then I highly recommend using the flexible xlwings API. This answers all your questions.
Let's assume I have an Excel file called demo.xlsx in the same directory as my program.
app.py
import xlwings as xw # pip install xlwings
import pandas as pd
wb = xw.Book('demo.xlsx')
This will create an xlwings workbook instance and open the file in Excel, so that you can drive it with Python commands.
Let's assume we have the following dataframe that we want to use to replace the ID and Name column:
new_name
A John_new
B Adams_new
C Mo_new
D Safia_new
wb.sheets['Sheet1']['A1:B1'].value = df
Finally, you can save and close.
wb.save()
wb.close()
I would recommend xlwings, as it talks to Excel through its COM interface (like the built-in VBA), so it is more powerful. I have never tested the "preservation of filtering or formulas"; the official docs may provide ways to do it.
For my own use, I just build everything in Python (filtering, formulas), so I don't even touch the Excel sheet by hand.
Demo:
# [step 0] boilerplate stuff
import numpy as np
import pandas as pd
df = pd.DataFrame(
    index=pd.date_range("2020-01-01 11:11:11", periods=100, freq="min"),
    columns=list('abc'))
df['a'] = np.random.randn(100)  # 1-D array, one value per row
df['b'] = df['a'] * 2 + 10
# [step 1] google xlwings, and pip/conda install xlwings
# [step 2] open a new excel sheet, no need to save
# (basically this code will indiscriminately wipe whatever sheet is active on your desktop)
# [step 3] magic, ...and things you can do
import xlwings as xw
wb = xw.books.active
ws = wb.sheets.active
ws.range('A1').current_region.options(index=1).value = df
# I believe this preserves existing formatting, HOWEVER, it will destroy filtering
if 1:
    # showcasing some formatting you can do
    active_window = wb.app.api.ActiveWindow
    active_window.FreezePanes = False
    active_window.SplitColumn = 2 # const_splitcolumn
    active_window.SplitRow = 1
    active_window.FreezePanes = True
    ws.cells.api.Font.Name = 'consolas'
    ws.api.Rows(1).Orientation = 60
    ws.api.Columns(1).Font.Bold = True
    ws.api.Columns(1).Font.ColorIndex = 26
    ws.api.Rows(1).Font.Bold = True
    ws.api.Rows(1).Borders.Weight = 4
    ws.autofit('c') # 'c' means columns, autofitting columns
    ws.range(1,1).api.AutoFilter(1)
This is a solution for (1), (2), but not (3) from my original question. (If you have an idea for (3), a comment and/or another answer is welcome).
In this solution, we open the input Excel file two times:
once with openpyxl: this is useful to keep the original layout (which seems totally discarded when reading as a pandas dataframe!)
once as a pandas dataframe df to benefit from pandas' great API to manipulate/modify the data itself. Note: data modification is handier with pandas than with openpyxl because we have vectorization, filtering df[df['foo'] == 'bar'], direct access to the columns by name df['foo'], etc.
The following code modifies the input file and keeps the layout: the "Filter" on the first row is not removed and the width of each column is not modified.
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import load_workbook
wb = load_workbook('test.xlsx') # load as openpyxl workbook; useful to keep the original layout
                                # which is discarded in the following dataframe
df = pd.read_excel('test.xlsx') # load as dataframe (modifications will be easier with pandas API!)
ws = wb.active
df.iloc[1, 1] = 'hello world' # modify a few things
rows = dataframe_to_rows(df, index=False)
for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
        ws.cell(row=r_idx, column=c_idx, value=value)
wb.save('test2.xlsx')
I think this is outside the scope of pandas; you must use openpyxl in order to take care of all the formatting, blocked rows, named ranges and so on. The main difference is that you cannot use vectorized computation as in pandas, so you need to introduce some loops.
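For point (3) of the question, one possible approach (a sketch, not tested against a real workbook with formulas) is to reuse the openpyxl write-back loop shown earlier but skip any cell that currently holds a formula. It assumes the workbook was loaded without data_only=True, in which case openpyxl exposes a formula as a string starting with "=". The function name write_rows_keeping_formulas is made up for illustration.

```python
def write_rows_keeping_formulas(ws, rows, first_row=1, first_col=1):
    """Write `rows` into worksheet `ws`, leaving existing formula cells untouched.

    Hypothetical helper. Returns the (row, column) positions that were skipped.
    """
    skipped = []
    for r_idx, row in enumerate(rows, first_row):
        for c_idx, value in enumerate(row, first_col):
            cell = ws.cell(row=r_idx, column=c_idx)
            existing = cell.value
            # Assumption: formulas show up as strings beginning with "="
            if isinstance(existing, str) and existing.startswith("="):
                skipped.append((r_idx, c_idx))
                continue
            cell.value = value
    return skipped
```

It would be called as write_rows_keeping_formulas(ws, dataframe_to_rows(df, index=False)). Note that this only works when the dataframe keeps the same row and column positions as the sheet; if columns are deleted or reordered in pandas, the formulas would no longer line up.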

Import Excel Tables into pandas dataframe

I would like to import excel tables (made by using the Excel 2007 and above tabulating feature) in a workbook into separate dataframes. Apologies if this has been asked before but from my searches I couldn't find what I wanted. I know you can easily do this using the read_excel function however this requires the specification of a Sheetname or returns a dict of dataframes for each sheet.
Instead of specifying sheetname, I was wondering whether there was a way of specifying tablename or better yet return a dict of dataframes for each table in the workbook.
I know this can be done by combining xlwings with pandas but was wondering whether this was built-into any of the pandas functions already (maybe ExcelFile).
Something like this:-
import pandas as pd
xls = pd.ExcelFile('excel_file_path.xls')
# to read all tables to a map
# (table_names is the wished-for attribute, not something pandas provides)
tables_to_df_map = {}
for table_name in xls.table_names:
    tables_to_df_map[table_name] = xls.parse(table_name)
Although not exactly what I was after, I have found a way to get table names with the caveat that it's restricted to sheet name.
Here's an excerpt from the code that I'm currently using:
import pandas as pd
import openpyxl as op
wb = op.load_workbook(file_location)
# Connecting to the specified worksheet
ws = wb[sheetname]
# Initialising an empty list where the Excel tables will be imported
# into
var_tables = []
# Importing table details from Excel: Table_Name and Sheet_Range
for table in ws._tables:
    sht_range = ws[table.ref]
    data_rows = []
    i = 0
    j = 0
    for row in sht_range:
        j += 1
        data_cols = []
        for cell in row:
            i += 1
            data_cols.append(cell.value)
            # After the last cell of a row, add an extra column:
            # its header on the first row, the table name on the others
            if i == len(row) and j == 1:
                data_cols.append('Table_Name')
            elif i == len(row):
                data_cols.append(table.name)
        data_rows.append(data_cols)
        i = 0
    var_tables.append(data_rows)
# Creating an empty list where all the dfs will be appended
# into
var_df = []
# Appending each table extracted from Excel into the list
for tb in var_tables:
    df = pd.DataFrame(tb[1:], columns=tb[0])
    var_df.append(df)
# Merging all in one big df
df = pd.concat(var_df, axis=1) # This merges on columns
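To get closer to the "dict of dataframes per table" asked for in the question, the same idea can be reduced to a small helper. This is a sketch under assumptions: it relies on an openpyxl version in which ws.tables is a dict-like collection of Table objects (on older versions, where _tables is a plain list, iterate that list instead), and the name tables_as_rows is invented here.

```python
def tables_as_rows(ws):
    """Map each Excel table name on worksheet `ws` to (header, data_rows).

    Hypothetical helper; assumes `ws.tables` is dict-like (newer openpyxl).
    """
    result = {}
    for table in ws.tables.values():
        # ws["A1:C10"] yields a tuple of rows, each a tuple of cells
        rows = [[cell.value for cell in row] for row in ws[table.ref]]
        result[table.name] = (rows[0], rows[1:])
    return result
```

Each entry can then be turned into a dataframe, for example {name: pd.DataFrame(data, columns=header) for name, (header, data) in tables_as_rows(ws).items()}. Like the code above, this still works per worksheet, so covering a whole workbook means looping over wb.worksheets.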

How to iterate over a particular column in excel using pyxl(python)

I am new to Python and need your help. I am trying to write code that iterates through a particular column in Excel using openpyxl.
from io import StringIO
import pandas as pd
import pyodbc
from openpyxl import load_workbook
d=pd.read_excel('workbook.xlsx',header=None)
wb = load_workbook('workbook.xlsx')
So in the above example I have to go to column J and display all the values in that column.
Please help me solve this.
Also, I have the same column name repeated in my Excel sheet. For example, the "Sample" column name appears in B2 and also in J2, but I want to get all the column information for J2.
Please let me know how to solve this.
Thank you.
Since you're new to python, you should learn to read the documentation. There are tons of modules available and it will be quicker for you and easier for the rest of us if you make the effort first.
import openpyxl
from openpyxl.utils import cell as cellutils
## My example book simply has "=Address(Row(),Column())" in A1:J20
## Because my example uses formulae, I am loading my workbook with
## "data_only = True" in order to get the values; if your cells do not
## contain formulae, you can omit data_only
workbook = openpyxl.load_workbook("workbook.xlsx", data_only = True)
worksheet = workbook.active
## Alternatively: worksheet = workbook["sheetname"]
## A container for gathering the cell values
output = []
## Current Row = 2 assumes that Cell 1 (in this case, J1) contains your column header
## Adjust as necessary
column = cellutils.column_index_from_string("J")
currentrow = 2
## Get the first cell
cell = worksheet.cell(column = column, row = currentrow)
## The purpose of "while cell.value" is that I'm assuming the column
## is complete when the cell does not contain a value
## If you know the exact range you need, you can either use a for-loop,
## or look at openpyxl.utils.cell.rows_from_range
while cell.value:
    ## Add Cell value to our list of values for this column
    output.append(cell.value)
    ## Move to the next row
    currentrow += 1
    ## Get that cell
    cell = worksheet.cell(column = column, row = currentrow)
print(output)
""" output: ['$J$2', '$J$3', '$J$4', '$J$5', '$J$6', '$J$7',
'$J$8', '$J$9', '$J$10', '$J$11', '$J$12', '$J$13', '$J$14',
'$J$15', '$J$16', '$J$17', '$J$18', '$J$19', '$J$20']
"""
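A shorter alternative, sketched here under the assumption that the whole column is wanted rather than only the cells up to the first empty one: indexing an openpyxl worksheet with a column letter returns the cells of that column. Addressing the column by letter also sidesteps the duplicate "Sample" header mentioned in the question. The helper name column_values is made up for illustration.

```python
def column_values(ws, letter="J", first_row=2):
    """Return the values of one worksheet column, starting at `first_row`.

    Hypothetical helper; rows above `first_row` (e.g. headers) are dropped.
    """
    cells = ws[letter]  # tuple of cells for that column, top to bottom
    return [cell.value for cell in cells[first_row - 1:]]
```

Unlike the while loop above, this does not stop at the first blank cell, so empty cells inside the used range come back as None.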

How to copy worksheet from one workbook to another one using openpyxl?

I have a large amount of EXCEL files (i.e. 200) I would like to copy one specific worksheet from one workbook to another one. I have done some investigations and I couldn't find a way of doing it with Openpyxl
This is the code I have developed so far
import os
import openpyxl
def copy_sheet_to_different_EXCEL(path_EXCEL_read, Sheet_name_to_copy, path_EXCEL_Save, Sheet_new_name):
    ''' Function used to copy one EXCEL sheet into another file.
    def path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name
    Input data:
    1.) path_EXCEL_read: the location of the EXCEL file along with the name where the information is going to be saved
    2.) Sheet_name_to_copy= The name of the EXCEL sheet to copy
    3.) path_EXCEL_Save: The path of the EXCEL file where the sheet is going to be copied
    4.) Sheet_new_name: The name of the new EXCEL sheet
    Output data:
    1.) Status= If 0, everything went OK. If 1, one error occurred.
    Version History:
    1.0 (2017-02-20): Initial version.
    '''
    status = 0
    if path_EXCEL_read.endswith('.xls'):
        print('ERROR - EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
        status = 1
        return status
    try:
        wb = openpyxl.load_workbook(path_EXCEL_read, read_only=True)
    except Exception:
        print('ERROR - EXCEL file does not exist in the following location:\n {0}'.format(path_EXCEL_read))
        status = 1
        return status
    Sheet_names = wb.sheetnames # We compare against the sheet name we would like to copy
    if Sheet_name_to_copy not in Sheet_names:
        print('ERROR - EXCEL sheet "{0}" does not exist'.format(Sheet_name_to_copy))
        status = 1
        return status
    # We check whether the destination file exists
    if os.path.exists(path_EXCEL_Save):
        # If true, the file exists so we open it
        if path_EXCEL_Save.endswith('.xls'):
            print('ERROR - Destination EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
            status = 1
            return status
        try:
            wdestiny = openpyxl.load_workbook(path_EXCEL_Save)
        except Exception:
            print('ERROR - Destination EXCEL file does not exist in the following location:\n {0}'.format(path_EXCEL_Save))
            status = 1
            return status
        # We check if the destination sheet exists. If so, we will delete it
        destination_list_sheets = wdestiny.sheetnames
        if Sheet_new_name in destination_list_sheets:
            print('WARNING - Sheet "{0}" exists in: {1}. It will be deleted!'.format(Sheet_new_name, path_EXCEL_Save))
            wdestiny.remove(wdestiny[Sheet_new_name])
    else:
        wdestiny = openpyxl.Workbook()
    # We copy the Excel sheet
    try:
        sheet_to_copy = wb[Sheet_name_to_copy]
        target = wdestiny.copy_worksheet(sheet_to_copy)
        target.title = Sheet_new_name
    except Exception:
        print('ERROR - Could not copy the EXCEL sheet. Check the file')
        status = 1
        return status
    try:
        wdestiny.save(path_EXCEL_Save)
    except Exception:
        print('ERROR - Could not save the EXCEL sheet. Check the file permissions')
        status = 1
        return status
    # Program finishes
    return status
I had the same problem. For me style, format, and layout were very important. Moreover, I did not want to copy formulas but only their values. After a lot of trial, error, and Stack Overflow I came up with the following functions. It may look a bit intimidating, but the code copies a sheet from one Excel file to another (possibly existing) file while preserving:
font and color of text
filled color of cells
merged cells
comment and hyperlinks
format of the cell value
the width of every row and column
whether or not row and column are hidden
frozen rows
It is useful when you want to gather sheets from many workbooks and bind them into one workbook. I copied most attributes but there might be a few more. In that case you can use this script as a jumping off point to add more.
###############
## Copy a sheet with style, format, layout, etc. from one Excel file to another Excel file
## Please add the ..path\\+\\file.. and ..sheet_name.. according to your desire.
import openpyxl
from copy import copy
def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet) # copy all the cell values and styles
    copy_sheet_attributes(source_sheet, target_sheet)
def copy_sheet_attributes(source_sheet, target_sheet):
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)
    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])
    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column width')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)
    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min) # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max) # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not only the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)
def copy_cells(source_sheet, target_sheet):
    for (row, col), source_cell in source_sheet._cells.items():
        target_cell = target_sheet.cell(column=col, row=row)
        target_cell._value = source_cell._value
        target_cell.data_type = source_cell.data_type
        if source_cell.has_style:
            target_cell.font = copy(source_cell.font)
            target_cell.border = copy(source_cell.border)
            target_cell.fill = copy(source_cell.fill)
            target_cell.number_format = copy(source_cell.number_format)
            target_cell.protection = copy(source_cell.protection)
            target_cell.alignment = copy(source_cell.alignment)
        if source_cell.hyperlink:
            target_cell._hyperlink = copy(source_cell.hyperlink)
        if source_cell.comment:
            target_cell.comment = copy(source_cell.comment)
wb_target = openpyxl.Workbook()
target_sheet = wb_target.create_sheet(..sheet_name..)
wb_source = openpyxl.load_workbook(..path\\+\\file_name.., data_only=True)
source_sheet = wb_source[..sheet_name..]
copy_sheet(source_sheet, target_sheet)
if 'Sheet' in wb_target.sheetnames: # remove default sheet
    wb_target.remove(wb_target['Sheet'])
wb_target.save('out.xlsx')
I found a way by playing around with it:
import openpyxl
xl1 = openpyxl.load_workbook('workbook1.xlsx')
# sheet you want to copy
s = openpyxl.load_workbook('workbook2.xlsx').active
s._parent = xl1
xl1._add_sheet(s)
xl1.save('some_path/name.xlsx')
You cannot use copy_worksheet() to copy between workbooks because it depends on global constants that may vary between workbooks. The only safe and reliable way to proceed is to go row-by-row and cell-by-cell.
You might want to read the discussions about this feature
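As a rough illustration of the row-by-row approach described above, here is a values-only sketch. It deliberately ignores styles, merged cells, column widths and everything else, and it assumes an openpyxl version whose iter_rows() accepts values_only=True. The function name copy_values is invented for this example.

```python
def copy_values(source_sheet, target_sheet):
    """Append every row of `source_sheet` to `target_sheet`, values only.

    Hypothetical helper; returns the number of rows copied.
    """
    count = 0
    for row in source_sheet.iter_rows(values_only=True):
        # `row` is a tuple of plain Python values, one per column
        target_sheet.append(list(row))
        count += 1
    return count
```

The target would typically be a fresh sheet created with create_sheet() in the destination workbook, since append() writes below whatever rows the sheet already contains.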
For speed I am using data_only and read_only attributes when opening my workbooks. Also iter_rows() is really fast, too.
#Oscar's excellent answer needs some changes to support ReadOnlyWorksheet and EmptyCell
# Copy a sheet with style, format, layout, etc. from one Excel file to another Excel file
# Please add the ..path\\+\\file.. and ..sheet_name.. according to your desire.
import openpyxl
from copy import copy
def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet) # copy all the cell values and styles
    copy_sheet_attributes(source_sheet, target_sheet)
def copy_sheet_attributes(source_sheet, target_sheet):
    if isinstance(source_sheet, openpyxl.worksheet._read_only.ReadOnlyWorksheet):
        return
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)
    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])
    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column width')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)
    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min) # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max) # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not only the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)
def copy_cells(source_sheet, target_sheet):
    for r, row in enumerate(source_sheet.iter_rows()):
        for c, cell in enumerate(row):
            source_cell = cell
            if isinstance(source_cell, openpyxl.cell.read_only.EmptyCell):
                continue
            target_cell = target_sheet.cell(column=c+1, row=r+1)
            target_cell._value = source_cell._value
            target_cell.data_type = source_cell.data_type
            if source_cell.has_style:
                target_cell.font = copy(source_cell.font)
                target_cell.border = copy(source_cell.border)
                target_cell.fill = copy(source_cell.fill)
                target_cell.number_format = copy(source_cell.number_format)
                target_cell.protection = copy(source_cell.protection)
                target_cell.alignment = copy(source_cell.alignment)
            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.hyperlink:
                target_cell._hyperlink = copy(source_cell.hyperlink)
            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.comment:
                target_cell.comment = copy(source_cell.comment)
With a usage something like
from openpyxl import Workbook, load_workbook
wb = Workbook()
wb_source = load_workbook(filename, data_only=True, read_only=True)
for sheetname in wb_source.sheetnames:
    source_sheet = wb_source[sheetname]
    ws = wb.create_sheet("Orig_" + sheetname)
    copy_sheet(source_sheet, ws)
wb.save(new_filename)
I had a similar requirement to collate data from multiple workbooks into one workbook. As there are no inbuilt methods available in openpyxl, I created the script below to do the job for me.
Note: in my use case all workbooks contain data in the same format.
from openpyxl import load_workbook
import os
# The below method is used to read data from an active worksheet and store it in memory.
def reader(file):
    global path
    abs_file = os.path.join(path, file)
    wb_sheet = load_workbook(abs_file).active
    rows = []
    # min_row is set to 2, to ignore the first row which contains the headers
    for row in wb_sheet.iter_rows(min_row=2):
        row_data = []
        for cell in row:
            row_data.append(cell.value)
        # custom column data I am adding, not needed for typical use cases
        row_data.append(file[17:-6])
        # Creating a list of lists, where each list contains a typical row's data
        rows.append(row_data)
    return rows
if __name__ == '__main__':
    # Folder in which my source excel sheets are present
    path = r'C:\Users\tom\Desktop\Qt'
    # To get the list of excel files
    files = os.listdir(path)
    for file in files:
        rows = reader(file)
        # below mentioned file name should be already created
        book = load_workbook('new.xlsx')
        sheet = book.active
        for row in rows:
            sheet.append(row)
        book.save('new.xlsx')
My workaround goes like this:
You have a template file let's say it's "template.xlsx".
You open it, make changes to it as needed, save it as a new file, close the file.
Repeat as needed. Just make sure to keep a copy of the original template while testing/messing around.
I've just found this question. A good workaround, as mentioned here, could consist of modifying the original wb in memory and then saving it under another name. For example:
import openpyxl
# your starting wb with 2 Sheets: Sheet1 and Sheet2
wb = openpyxl.load_workbook('old.xlsx')
sheets = wb.sheetnames # ['Sheet1', 'Sheet2']
for s in sheets:
    if s != 'Sheet2':
        # remove every sheet except Sheet2
        wb.remove(wb[s])
# your final wb with just Sheet2
wb.save('new.xlsx')
A workaround I use is saving the current sheet as a pandas data frame and loading it to the excel workbook you need
It can actually be done in a very simple way!
It just needs 3 steps:
Open a file using load_workbook
wb = load_workbook('File_1.xlsx')
Select the sheet you want to copy
ws = wb.active
Use the name of the new file to save the file
wb.save('New_file.xlsx')
This code saves the workbook loaded from the first file (File_1.xlsx), with all of its sheets, to the second file (New_file.xlsx).
