I have Python code that, at the end of its run, creates an Excel file with several worksheets. I'm trying to copy a sheet from another file, keeping its exact formatting (cell background colors, different fonts and font sizes, etc.), and paste it as-is into the main file without affecting the previously created sheets. The method I'm currently using doesn't allow this, because it saves the loaded file over the previously created one. Does anyone have a suggestion or a way of doing this?
The method I'm currently using, taken from "Read an excel file with Python and modify it without changing the style":
from openpyxl import Workbook, load_workbook
workbook2 = load_workbook("readme tab.xlsx") # Your Excel file
worksheet2 = workbook2.active # gets first sheet
for row in range(1, 10):
    # Writes a new value PRESERVING cell styles.
    worksheet2.cell(row=row, column=1, value=f'NEW VALUE {row}')
workbook2.save(path)
Reference of the code I'm using, in order:
import xlsxwriter
import pandas as pd
path = r"Archivo.xlsx"
writer = pd.ExcelWriter(path)
df1.to_excel(writer, sheet_name='Data')
workbook = writer.book
worksheet = writer.sheets['Data']
ws = workbook.add_worksheet('Graph')
worksheet.set_column(1, 29, 30)
writer.save()
from openpyxl import Workbook, load_workbook
workbook2 = load_workbook("readme tab.xlsx") # Your Excel file
worksheet2 = workbook2.active # gets first sheet
for row in range(1, 10):
    # Writes a new value PRESERVING cell styles.
    worksheet2.cell(row=row, column=1, value=f'NEW VALUE {row}')
workbook2.save(path)
You can copy a sheet (including its formatting) with the method _add_sheet, which is a private openpyxl method.
Assuming that workbook is the workbook to which you're adding the sheet:
Replace:
workbook2 = load_workbook("readme tab.xlsx") # Your Excel file
worksheet2 = workbook2.active # gets first sheet
for row in range(1, 10):
    # Writes a new value PRESERVING cell styles.
    worksheet2.cell(row=row, column=1, value=f'NEW VALUE {row}')
workbook2.save(path)
With:
ws2 = load_workbook('readme tab.xlsx').active
ws2._parent = workbook
workbook._add_sheet(ws2)
workbook.save(path)
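Because `_add_sheet` is private and may behave differently across openpyxl versions, a more conservative alternative is to copy values and styles cell by cell. The following is only a minimal sketch under stated assumptions: both workbooks are openpyxl workbooks (a file written with xlsxwriter would first have to be saved and reopened with load_workbook), and the helper name `copy_sheet_with_styles` and the sample data are hypothetical. It does not cover merged cells, images, or conditional formatting.

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill


def copy_sheet_with_styles(src_ws, dst_wb, title):
    """Copy values and basic cell styles from src_ws into a new sheet of dst_wb.

    Hypothetical helper for illustration; not part of openpyxl.
    """
    dst_ws = dst_wb.create_sheet(title)
    for row in src_ws.iter_rows():
        for cell in row:
            new_cell = dst_ws.cell(row=cell.row, column=cell.column,
                                   value=cell.value)
            if cell.has_style:
                # Copy each style object so the new cell gets its own style.
                new_cell.font = copy(cell.font)
                new_cell.fill = copy(cell.fill)
                new_cell.border = copy(cell.border)
                new_cell.alignment = copy(cell.alignment)
                new_cell.protection = copy(cell.protection)
                new_cell.number_format = cell.number_format
    # Column widths and row heights are stored on the sheet, not on cells.
    for key, dim in src_ws.column_dimensions.items():
        dst_ws.column_dimensions[key].width = dim.width
    for idx, dim in src_ws.row_dimensions.items():
        dst_ws.row_dimensions[idx].height = dim.height
    return dst_ws


# Stand-in for the styled "readme tab.xlsx" sheet (assumed sample data).
src_wb = Workbook()
src_ws = src_wb.active
src_ws.title = "readme"
src_ws["A1"] = "Title"
src_ws["A1"].font = Font(bold=True, size=14)
src_ws["A1"].fill = PatternFill(fill_type="solid", start_color="FFFF00")
src_ws.column_dimensions["A"].width = 25

# Stand-in for the main workbook that already contains other sheets.
main_wb = Workbook()
main_wb.active.title = "Data"
main_wb.active["A1"] = "existing data"

copied = copy_sheet_with_styles(src_ws, main_wb, "readme")
```

With real files, the two stand-in workbooks would be replaced by load_workbook calls, and only the main workbook would be saved afterwards, so the existing sheets stay untouched.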
Related
Dear community,
I was struggling with a piece of Python code that reads data from an Excel worksheet and then creates a new sheet with that data.
It's not just a copy of the file, because it lets you do something with the data along the way before saving it to a new file.
I was reading a file, storing the data in an intermediate list, and then trying to save it to the new xls file.
It didn't work because the data types weren't compatible with each other, and I got stuck.
I found the code below, from Python Engineering by Michael Zippo, which helped me.
# importing openpyxl module
import openpyxl as xl
# opening the source excel file
filename ="C:\\Users\\Admin\\Desktop\\trading.xlsx"
wb1 = xl.load_workbook(filename)
ws1 = wb1.worksheets[0]
# opening the destination excel file
filename1 ="C:\\Users\\Admin\\Desktop\\test.xlsx"
wb2 = xl.load_workbook(filename1)
ws2 = wb2.active
# calculate total number of rows and
# columns in source excel file
mr = ws1.max_row
mc = ws1.max_column
# copying the cell values from source
# excel file to destination excel file
for i in range(1, mr + 1):
    for j in range(1, mc + 1):
        # reading cell value from source excel file
        c = ws1.cell(row=i, column=j)
        # writing the read value to destination excel file
        ws2.cell(row=i, column=j).value = c.value
# saving the destination excel file
wb2.save(str(filename1))
After reading more of Michael Zippo's article (https://python.engineering/python-how-to-copy-data-from-one-excel-sheet-to-another/),
I found a way to improve the read-write for loop above:
from openpyxl import Workbook, load_workbook
wb1 = load_workbook('bank_statement.xlsx')
wb2 = Workbook()
sh1 = wb1.active
sh2 = wb2.active
for r in sh1.iter_rows():
    for c in r:
        sh2[c.coordinate] = c.value
wb2.save('bank_stat_improved.xlsx')
In the middle of the loop you can transform the data before writing it, which makes this a very useful pattern.
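As a small illustration of doing something with the data inside the loop, the sketch below trims and upper-cases text cells while copying and leaves other values unchanged. The sample rows and the choice of transformation are assumptions made for the example; in-memory workbooks are used so it runs without any input file.

```python
from openpyxl import Workbook

# Stand-in for the source sheet (assumed sample data).
src_wb = Workbook()
src = src_wb.active
src.append(["date", "description", "amount"])
src.append(["2024-01-05", " coffee ", 3.5])
src.append(["2024-01-06", "books", 42])

dst_wb = Workbook()
dst = dst_wb.active

for row in src.iter_rows():
    for cell in row:
        value = cell.value
        # Transform the value before writing it to the destination sheet.
        if isinstance(value, str):
            value = value.strip().upper()
        dst[cell.coordinate] = value
```

Any other per-cell logic (type conversion, filtering, lookups) could take the place of the `isinstance` branch.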
I can successfully copy all columns from one existing Excel file to another in Python, but I cannot copy one specific column from an existing Excel file and write it into another.
Here is my code:
wb = load_workbook('Ambulance2Centroids_16622.xlsx')
wb2 = load_workbook('test.xlsx')
sheet1 = wb.get_sheet_by_name('Sheet1')
sheet2 = wb2.get_sheet_by_name('Details')
for i in range(1, 10):
    for j in range(1, sheet1.max_column + 1):
        sheet2.cell(row=i, column=j).value = sheet1.cell(row=i, column=j).value
wb.save('Ambulance2Centroids_16622.xlsx')
wb2.save('test.xlsx')
Here, I am trying to get the FROM_ID column only.
A couple of things to note:
The get_sheet_by_name method is deprecated; you should just use wb[<sheetname>] as shown below.
There is no need to save a workbook (wb) that you have not changed. Since you are only reading data from 'Ambulance2Centroids_16622.xlsx' to copy to 'test.xlsx' there are no changes to that wb and no need to save it.
The example below shows how to find the column in the original wb, in this case 'FROM_ID' and then copy the column to the destination wb 'test.xlsx'.
from openpyxl import load_workbook
wb = load_workbook('Ambulance2Centroids_16622.xlsx')
wb2 = load_workbook('test.xlsx')
# Use wb[<sheetname>] to assign sheets to variable
sheet1 = wb['Sheet1']
sheet2 = wb2['Details']
search_text = 'FROM_ID'
for header_row in sheet1[1]: # sheet1[1] means iterate row 1, header row
    if header_row.value == search_text:
        # Use the column letter of the found column to iterate the originating column and copy the cell value
        for dest_row, orig_col_c in enumerate(sheet1[header_row.column_letter], 1):
            # Copying the originating cell value to column A (1) in destination wb
            sheet2.cell(row=dest_row, column=1).value = orig_col_c.value
# Save test.xlsx only
# wb.save('Ambulance2Centroids_16622.xlsx')
wb2.save('test.xlsx')
I'm still fairly new to coding, so I'm sure that there are easier or prettier ways to write the following script. The script runs and the new sheets are created within the workbook, however, the data is not copying from 'sheet1' to the second sheet
I've tried googling and reading other threads on Stack Overflow, but none seem to answer the question
import os, csv, glob, shutil, pandas as pd, numpy as np, openpyxl as opyx
path_to_combined_file = 'c:\\somefilepath here\\'
filepath = path_to_combined_file + 'NSW.xlsx'
unaided_brand_awareness = pd.read_excel(filepath)
from openpyxl import load_workbook
wb = load_workbook(filepath)
wb.create_sheet('unaided_brand_awareness')
wb.create_sheet('aided_brand_awareness')
wb.create_sheet('favourite_stations')
worksheet1 = wb['Sheet1']
worksheet2 = wb['unaided_brand_awareness']
for i in range(1, 2000):
    for j in range(1, worksheet1.max_column + 1):
        worksheet2.cell(row=i, column=j).value = worksheet1.cell(row=i, column=j).value
wb.save(filepath)
The code SHOULD create the following sheets 'unaided brand awareness', 'aided brand awareness' and 'favourite stations' and then copy the data from the sheet titled 'Sheet1' to the sheet titled 'unaided brand awareness'
Ideally, it would be great to have the data from 'Sheet1' copied to all the sheets within the workbook.
Also, I should probably note that the number of cells contained within 'Sheet1' will differ from case to case.
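One possible way to handle a varying number of rows is to iterate over whatever rows 'Sheet1' actually contains instead of a fixed range(1, 2000). This is only a sketch with assumed sample data in an in-memory workbook; with a real file, the workbook would come from load_workbook(filepath) and be saved afterwards.

```python
from openpyxl import Workbook

# Stand-in for the workbook loaded from NSW.xlsx (assumed sample data).
wb = Workbook()
source = wb.active
source.title = "Sheet1"
source.append(["station", "score"])
source.append(["A", 1])
source.append(["B", 2])

sheet_names = ["unaided_brand_awareness", "aided_brand_awareness",
               "favourite_stations"]
targets = [wb.create_sheet(name) for name in sheet_names]

for target in targets:
    # iter_rows stops at the last used row, so no hard-coded limit is needed.
    for row in source.iter_rows(values_only=True):
        target.append(row)
```

Note that this copies values only; cell formatting is not carried over.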
I have a pandas dataframe, and I want to open an existing Excel workbook containing formulas, copy the dataframe into a specific set of columns (let's say from column A to column H), and save the result as a new file with a different name.
The idea is to update an existing template, populate it with the dataframe in a specified set of column and then save a copy of the Excel file with a different name.
Any idea?
What I have is:
import pandas
from openpyxl import load_workbook
book = load_workbook('Template.xlsx')
writer = pandas.ExcelWriter('Template.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer)
writer.save()
The below should work, assuming that you are happy to copy into column A. I don't see a way to write into the sheet starting in a different column (without overwriting anything).
The below incorporates @MaxU's suggestion of copying the template sheet before writing to it (having just lost a few hours' work on my own template workbook to pd.to_excel).
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from shutil import copyfile
template_file = 'Template.xlsx' # Has a header in row 1 already
output_file = 'Result.xlsx' # What we are saving the template as
# Copy Template.xlsx as Result.xlsx
copyfile(template_file, output_file)
# Read in the data to be pasted into the template
df = pd.read_csv('my_data.csv')
# Load the workbook and access the sheet we'll paste into
wb = load_workbook(output_file)
ws = wb['Existing Result Sheet']
# Selecting a cell in the header row before writing makes append()
# start writing to the following line i.e. row 2
ws['A1']
# Write each row of the DataFrame
# In this case, I don't want to write the index (useless) or the header (already in the template)
for r in dataframe_to_rows(df, index=False, header=False):
    ws.append(r)
wb.save(output_file)
Try this:
df.to_excel(writer, startrow=10, startcol=1, index=False, engine='openpyxl')
Pay attention to the startrow and startcol parameters, which set the zero-based upper-left cell where the dataframe is written.
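To make the effect of startrow and startcol concrete, here is a hedged sketch that writes a dataframe into a copy of a template while keeping the template's existing cells. It assumes pandas 1.4 or newer (for if_sheet_exists="overlay") with openpyxl installed; the file names, the sheet name "Data", and the sample values are hypothetical.

```python
import os
import shutil
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

# Build a hypothetical template: a header cell plus a formula to preserve.
tmp_dir = tempfile.mkdtemp()
template_path = os.path.join(tmp_dir, "Template.xlsx")
output_path = os.path.join(tmp_dir, "Result.xlsx")
template = Workbook()
template_ws = template.active
template_ws.title = "Data"
template_ws["A1"] = "Template header"
template_ws["D1"] = "=SUM(B2:B3)"
template.save(template_path)

# Work on a copy so the template itself is never modified.
shutil.copyfile(template_path, output_path)

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# mode="a" opens the existing file; if_sheet_exists="overlay" writes into
# the existing sheet instead of replacing it.
with pd.ExcelWriter(output_path, engine="openpyxl", mode="a",
                    if_sheet_exists="overlay") as writer:
    # startrow/startcol are zero-based, so (1, 1) is cell B2.
    df.to_excel(writer, sheet_name="Data", startrow=1, startcol=1,
                index=False, header=False)

result_ws = load_workbook(output_path)["Data"]
```

After this runs, the dataframe values occupy B2:C3 of Result.xlsx, while the header in A1 and the formula in D1 come through from the template.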
I'm trying to write data into a cell, which has multiple line breaks (I believe \n), the resulting .xlsx has line breaks removed.
Is there a way to keep these line breaks?
The API for styles changed for openpyxl >= 2. The following code demonstrates the modern API.
from openpyxl import Workbook
from openpyxl.styles import Alignment
wb = Workbook()
ws = wb.active # wb.active returns a Worksheet object
ws['A1'] = "Line 1\nLine 2\nLine 3"
ws['A1'].alignment = Alignment(wrapText=True)
wb.save("wrap.xlsx")
Disclaimer: This won't work in recent versions of Openpyxl. See other answers.
In openpyxl you can set the wrap_text alignment property to wrap multi-line strings:
from openpyxl import Workbook
workbook = Workbook()
worksheet = workbook.worksheets[0]
worksheet.title = "Sheet1"
worksheet.cell('A1').style.alignment.wrap_text = True
worksheet.cell('A1').value = "Line 1\nLine 2\nLine 3"
workbook.save('wrap_text1.xlsx')
This is also possible with the XlsxWriter module.
Here is a small working example:
from xlsxwriter.workbook import Workbook
# Create an new Excel file and add a worksheet.
workbook = Workbook('wrap_text2.xlsx')
worksheet = workbook.add_worksheet()
# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)
# Add a cell format with text wrap on.
cell_format = workbook.add_format({'text_wrap': True})
# Write a wrapped string to a cell.
worksheet.write('A1', "Line 1\nLine 2\nLine 3", cell_format)
workbook.close()
As an additional option, you can use a triple-quoted string (""" my cell info here """) together with the wrap-text Boolean in Alignment and get the desired result as well.
from openpyxl import Workbook
from openpyxl.styles import Alignment
wb= Workbook()
sheet= wb.active
sheet.title = "Sheet1"
sheet['A1'] = """Line 1
Line 2
Line 3"""
sheet['A1'].alignment = Alignment(wrapText=True)
wb.save('wrap_text1.xlsx')
Just in case anyone is looking for an example where we iterate over all cells to apply wrapping:
Small working example:
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Alignment
from openpyxl.utils.dataframe import dataframe_to_rows
# create a toy dataframe. Our goal is to replace commas (',') with line breaks and have Excel rendering \n as line breaks.
df = pd.DataFrame(data=[["Mark", "Student,26 y.o"],
                        ["Simon", "Student,31 y.o"]],
                  columns=['Name', 'Description'])
# replace comma "," with '\n' in all cells
df = df.applymap(lambda v: v.replace(',', '\n') if isinstance(v, str) else v)
# Create an empty openpyxl Workbook. We will populate it by iteratively adding the dataframe's rows.
wb = Workbook()
ws = wb.active # to get the actual Worksheet object
# dataframe_to_rows allows to iterate over a dataframe with an interface
# compatible with openpyxl. Each df row will be added to the worksheet.
for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)
# iterate over each row and row's cells and apply text wrapping.
for row in ws:
    for cell in row:
        cell.alignment = Alignment(wrapText=True)
# export the workbook as an excel file.
wb.save("wrap.xlsx")