I'm trying to write data containing multiple line breaks (I believe \n) into a cell, but the resulting .xlsx has the line breaks removed.
Is there a way to keep these line breaks?
The API for styles changed for openpyxl >= 2. The following code demonstrates the modern API.
from openpyxl import Workbook
from openpyxl.styles import Alignment
wb = Workbook()
ws = wb.active # wb.active returns a Worksheet object
ws['A1'] = "Line 1\nLine 2\nLine 3"
ws['A1'].alignment = Alignment(wrapText=True)
wb.save("wrap.xlsx")
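One way to check that the line breaks survive is to round-trip the workbook and inspect the cell again. The following is a minimal sketch, assuming openpyxl is installed; saving to an in-memory io.BytesIO buffer instead of a file on disk is only an illustrative choice and is not part of the original answer.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Alignment

wb = Workbook()
ws = wb.active
ws["A1"] = "Line 1\nLine 2\nLine 3"
ws["A1"].alignment = Alignment(wrap_text=True)

# Save to an in-memory buffer instead of a file (illustrative choice).
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Reload the workbook and look at the same cell again.
cell = load_workbook(buffer).active["A1"]
print(repr(cell.value))          # the stored text, including the \n characters
print(cell.alignment.wrap_text)  # whether wrapping is switched on
```

The \n characters are stored either way; the wrap setting only controls whether Excel displays them as separate lines.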
Disclaimer: This won't work in recent versions of Openpyxl. See other answers.
In openpyxl you can set the wrap_text alignment property to wrap multi-line strings:
from openpyxl import Workbook
workbook = Workbook()
worksheet = workbook.worksheets[0]
worksheet.title = "Sheet1"
worksheet.cell('A1').style.alignment.wrap_text = True
worksheet.cell('A1').value = "Line 1\nLine 2\nLine 3"
workbook.save('wrap_text1.xlsx')
This is also possible with the XlsxWriter module.
Here is a small working example:
from xlsxwriter.workbook import Workbook
# Create a new Excel file and add a worksheet.
workbook = Workbook('wrap_text2.xlsx')
worksheet = workbook.add_worksheet()
# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)
# Add a cell format with text wrap on.
cell_format = workbook.add_format({'text_wrap': True})
# Write a wrapped string to a cell.
worksheet.write('A1', "Line 1\nLine 2\nLine 3", cell_format)
workbook.close()
As an additional option, you can use a triple-quoted string (""" my cell info here """) together with the wrap-text Boolean in the alignment and get the same result.
from openpyxl import Workbook
from openpyxl.styles import Alignment
wb = Workbook()
sheet = wb.active
sheet.title = "Sheet1"
sheet['A1'] = """Line 1
Line 2
Line 3"""
sheet['A1'].alignment = Alignment(wrapText=True)
wb.save('wrap_text1.xlsx')
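The triple-quoted form works because it produces exactly the same string as the version with explicit \n escapes. A small sketch in plain Python (the variable names are only illustrative):

```python
escaped = "Line 1\nLine 2\nLine 3"
triple_quoted = """Line 1
Line 2
Line 3"""

# Both literals contain the same characters, including two newlines.
print(escaped == triple_quoted)
print(triple_quoted.count("\n"))
```

Note that any indentation placed before the continuation lines of a triple-quoted string becomes part of the text written to the cell.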
Just in case anyone is looking for an example where we iterate over all cells to apply wrapping:
Small working example:
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Alignment
from openpyxl.utils.dataframe import dataframe_to_rows
# Create a toy dataframe. Our goal is to replace commas (',') with line breaks and have Excel render \n as line breaks.
df = pd.DataFrame(data=[["Mark", "Student,26 y.o"],
                        ["Simon", "Student,31 y.o"]],
                  columns=['Name', 'Description'])
# replace comma "," with '\n' in all cells
df = df.applymap(lambda v: v.replace(',', '\n') if isinstance(v, str) else v)
# Create an empty openpyxl Workbook. We will populate it by iteratively adding the dataframe's rows.
wb = Workbook()
ws = wb.active # to get the actual Worksheet object
# dataframe_to_rows lets us iterate over a dataframe with an interface
# compatible with openpyxl. Each df row will be added to the worksheet.
for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)
# Iterate over each row and each row's cells and apply text wrapping.
for row in ws:
    for cell in row:
        cell.alignment = Alignment(wrapText=True)
# export the workbook as an excel file.
wb.save("wrap.xlsx")
I am trying to change the name of a sheet according to the value of a cell.
Here is the code I am using:
from openpyxl import load_workbook
wb = load_workbook('file_name.xlsx')
ws = wb['Sheet 1']
sheet_name = ws['B2']
ws.title = f'Marketing {sheet_name}'
This code works, but I only need the first 3 characters from the cell ws['B2'].
How can I do that?
Use slicing:
sheet_name = ws['B2'].value[:3]
You can slice the string stored in the cell.
Adjust the slice bounds to get the desired result.
from openpyxl import load_workbook
wb = load_workbook('file_name.xlsx')
ws = wb['Sheet 1']
sheet_name = ws['B2'].value
ws.title = f'Marketing {sheet_name[0:3]}' # First character to third
Start reading the tabs by number so you don't have to change the tab name in the future.
Final code:
from openpyxl import load_workbook
wb = load_workbook('file_name.xlsx')
ws = wb.worksheets[0] # First tab
sheet_name = ws['B2'].value
ws.title = f'Marketing {sheet_name[0:3]}' # First character to third
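Slicing only works when the cell actually holds a string; an empty cell gives None and a numeric cell gives a number. The sketch below guards against that. The helper name title_prefix is hypothetical, and the workbook is built in memory purely for illustration instead of loading file_name.xlsx.

```python
from openpyxl import Workbook


def title_prefix(value, length=3):
    """Return the first `length` characters of a cell value as text (hypothetical helper)."""
    if value is None:
        return ""
    return str(value)[:length]


# Build a small workbook in memory instead of loading a file.
wb = Workbook()
ws = wb.active
ws["B2"] = "Europe"

ws.title = f"Marketing {title_prefix(ws['B2'].value)}"
print(ws.title)
```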
I'm using openpyxl to append formatted dataframe rows to an existing Excel file, or to create a new one, with the following code:
if os.path.isfile(transformed_file):  # if the file exists, load and append
    workbook = openpyxl.load_workbook(transformed_file)
    sheet = workbook['Sheet1']
    for row in dataframe_to_rows(df, header=False, index=False):
        sheet.append(row)
    workbook.save(transformed_file)
    workbook.close()
else:  # create the Excel file if it doesn't already exist
    with pd.ExcelWriter(path=transformed_file, engine='openpyxl') as writer:
        df.to_excel(writer, index=False, sheet_name='Sheet1')
I need to format column 'G' as a plain number ('0'). At the moment, when I open the Excel file, the values are displayed as '1.23E+10'.
How could this be achieved for the sample above? Thank you!
Try the following code and see if it works for you:
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws['A1'] = 123455656565464563302589013
ws['B1'] = 123455656565464563302589013
ws['A1'].number_format = '0'  # Number formatting
ws['B1'].number_format = '0.00E+00'  # Scientific formatting
wb.save("formating_test.xlsx")
I found a solution that worked for me. I realized from the documentation that one has to iterate through each cell:
for cell in sheet['D']:
    cell.number_format = '0'
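Applied to the column from the question, the same idea might look like the sketch below. It assumes the data sits in column G with a header in row 1 that should keep its default format; the sample rows are made up for illustration.

```python
from openpyxl import Workbook
from openpyxl.utils import column_index_from_string

wb = Workbook()
sheet = wb.active
sheet.append(["a", "b", "c", "d", "e", "f", "id"])  # header row (made-up data)
sheet.append([1, 2, 3, 4, 5, 6, 12345678901])
sheet.append([1, 2, 3, 4, 5, 6, 98765432109])

col = column_index_from_string("G")  # convert the column letter to its index

# Walk column G from row 2 downwards, skipping the header.
for (cell,) in sheet.iter_rows(min_row=2, min_col=col, max_col=col):
    cell.number_format = "0"

print(sheet["G1"].number_format, sheet["G2"].number_format)
```

Rows appended later do not inherit the format, so in the append-to-existing-file case the loop would have to run again after the new rows are added.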
I have Python code that, at the end of the process, creates an Excel file with several worksheets. What I'm trying to do is copy a sheet from another file, read with its exact format (cells with background color, different fonts and letter sizes, etc.), and paste it as it is into the main file without affecting the other previously created sheets. The method I'm currently using doesn't allow me to do that, because it overwrites the previously created file with the new one. Does someone have a suggestion or a way of doing this?
The method I'm currently using, which is obtained from: Read an excel file with Python and modify it without changing the style:
from openpyxl import Workbook, load_workbook
workbook2 = load_workbook("readme tab.xlsx") # Your Excel file
worksheet2 = workbook2.active # gets first sheet
for row in range(1, 10):
    # Writes a new value PRESERVING cell styles.
    worksheet2.cell(row=row, column=1, value=f'NEW VALUE {row}')
workbook2.save(path)
Reference of the code I'm using, in order:
import xlsxwriter
import pandas as pd
path = r"Archivo.xlsx"
writer = pd.ExcelWriter(path)
df1.to_excel(writer, sheet_name='Data')
workbook = writer.book
worksheet = writer.sheets['Data']
ws = workbook.add_worksheet('Graph')
worksheet.set_column(1, 29, 30)
writer.save()
from openpyxl import Workbook, load_workbook
workbook2 = load_workbook("readme tab.xlsx") # Your Excel file
worksheet2 = workbook2.active # gets first sheet
for row in range(1, 10):
    # Writes a new value PRESERVING cell styles.
    worksheet2.cell(row=row, column=1, value=f'NEW VALUE {row}')
workbook2.save(path)
You can copy a sheet (including its formatting) with the private method _add_sheet. Because the method is private (note the leading underscore), it may change between openpyxl versions.
Assuming that workbook is the openpyxl workbook to which you're adding the sheet, replace:
workbook2 = load_workbook("readme tab.xlsx") # Your Excel file
worksheet2 = workbook2.active # gets first sheet
for row in range(1, 10):
    # Writes a new value PRESERVING cell styles.
    worksheet2.cell(row=row, column=1, value=f'NEW VALUE {row}')
workbook2.save(path)
with:
ws2 = load_workbook('readme tab.xlsx').active
ws2._parent = workbook
workbook._add_sheet(ws2)
workbook.save(path)
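If you would rather avoid private attributes, an alternative is to copy the cells one by one together with their styles. This is a different technique from the _add_sheet approach above, shown only as a sketch: the helper copy_sheet_into is hypothetical, it copies values, basic cell styles, and column widths only (not merged ranges, images, or conditional formatting), and it assumes an openpyxl version in which cell.column is an integer. Both workbooks are built in memory here; in practice they would come from load_workbook.

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill


def copy_sheet_into(source_ws, target_wb, title):
    """Copy values and basic styles into a new sheet of target_wb (hypothetical helper)."""
    target_ws = target_wb.create_sheet(title=title)
    for row in source_ws.iter_rows():
        for cell in row:
            new_cell = target_ws.cell(row=cell.row, column=cell.column, value=cell.value)
            if cell.has_style:
                # Style objects are copied so the two workbooks do not share them.
                new_cell.font = copy(cell.font)
                new_cell.fill = copy(cell.fill)
                new_cell.border = copy(cell.border)
                new_cell.alignment = copy(cell.alignment)
                new_cell.protection = copy(cell.protection)
                new_cell.number_format = cell.number_format
    # Carry over explicit column widths as well.
    for key, dim in source_ws.column_dimensions.items():
        target_ws.column_dimensions[key].width = dim.width
    return target_ws


# Stand-in for the formatted "readme" sheet.
source_wb = Workbook()
source_ws = source_wb.active
source_ws["A1"] = "Read me"
source_ws["A1"].font = Font(bold=True, size=14)
source_ws["A1"].fill = PatternFill(fill_type="solid", start_color="FFFF00")
source_ws.column_dimensions["A"].width = 30

# Stand-in for the main workbook with previously created sheets.
target_wb = Workbook()
target_wb.active.title = "Data"

copied = copy_sheet_into(source_ws, target_wb, "Readme")
print(target_wb.sheetnames)
```

The existing sheets of the target workbook are left untouched because the helper only adds a new sheet.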
I'd like to read the values from column B in every worksheet within my workbook.
After a fair amount of reading and playing around I can return the cell names of the cells I want the values from, but I can't figure out how to get the values.
from openpyxl import load_workbook
wb = load_workbook(r"C:/Users/username/Documents/test.xlsx")
for sheet in wb.worksheets:
    for row in range(2, sheet.max_row + 1):
        for column in "B":
            cell_name = "{}{}".format(column, row)
            print(cell_name)
This is returning the cell names (i.e. B2, B3) that have values in column B in every worksheet.
According to the documentation https://openpyxl.readthedocs.io/en/stable/usage.html you can access cell values as:
sheet['B5'].value
Replace B5 with the cell(s) you need.
import xlrd
# Note: xlrd 2.0 and later only read .xls files, so opening an .xlsx this way needs xlrd < 2.0.
loc = "foo.xlsx"  # Excel file name
wb = xlrd.open_workbook(loc)
# sheet = wb.sheet_by_index(0)
for sheet in wb.sheets():
    for i in range(sheet.nrows):
        print(sheet.cell_value(i, 1))
Edit: I edited my answer to read all sheets in excel file.
Adjust the range to cover the rows you need:
from openpyxl import load_workbook
wb = load_workbook('')
for sheet in wb:
    for i in range(1, 50):
        if sheet['B' + str(i)].value:
            print(sheet['B' + str(i)].value)
A better version, which iterates over the cells of column B directly:
from openpyxl import load_workbook
wb = load_workbook('')
for sheet in wb:
    for row in sheet['B']:
        print(row.value)
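To collect the values instead of printing them, skipping the header row and empty cells, something like the following sketch could be used. It assumes an openpyxl version that supports values_only in iter_rows; the workbook and its contents are made up in memory for illustration instead of loading a file.

```python
from openpyxl import Workbook

# Build a two-sheet workbook in memory (made-up data).
wb = Workbook()
first = wb.active
first.append(["name", "score"])
first.append(["Ann", 10])
first.append(["Bob", None])
second = wb.create_sheet("Second")
second.append(["name", "score"])
second.append(["Cy", 7])

values_by_sheet = {}
for sheet in wb.worksheets:
    # Restrict the iteration to column B (index 2), starting below the header.
    column_b = [
        value
        for (value,) in sheet.iter_rows(min_row=2, min_col=2, max_col=2, values_only=True)
        if value is not None
    ]
    values_by_sheet[sheet.title] = column_b

print(values_by_sheet)
```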
I am trying to split only the merged cells in Excel file (with multiple sheets) that are like:
Please note that there are partially/fully empty rows. These rows are not merged.
Using openpyxl, I found the merged cell ranges in each sheet with this code:
wb2 = load_workbook('Example.xlsx')
sheets = wb2.sheetnames ##['Sheet1', 'Sheet2']
for i, sheet in enumerate(sheets):
    ws = wb2[sheets[i]]
    print(ws.merged_cell_ranges)
The print output:
['B3:B9', 'B13:B14', 'A3:A9', 'A13:A14', 'B20:B22', 'A20:A22']
['B5:B9', 'A12:A14', 'B12:B14', 'A17:A18', 'B17:B18', 'A27:A28', 'B27:B28', 'A20:A22', 'B20:B22', 'A3:A4', 'B3:B4', 'A5:A9']
Since I found the merged cell ranges, I need to split the ranges and fill in the corresponding rows like this:
How can I split like this using openpyxl? I am new to using this module. Any feedback is greatly appreciated!
You need to use the unmerge function. Example:
ws.unmerge_cells(start_row=2,start_column=1,end_row=2,end_column=4)
When you call unmerge_cells, sheet.merged_cells.ranges is modified, so don't iterate over sheet.merged_cells.ranges directly in the for loop. Take a copy of the range coordinates first:
from openpyxl.workbook import Workbook
from openpyxl import load_workbook
from openpyxl.utils.cell import range_boundaries
wb = load_workbook(filename = 'tmp.xlsx')
for st_name in wb.sheetnames:
    st = wb[st_name]
    mcr_coord_list = [mcr.coord for mcr in st.merged_cells.ranges]
    for mcr in mcr_coord_list:
        min_col, min_row, max_col, max_row = range_boundaries(mcr)
        top_left_cell_value = st.cell(row=min_row, column=min_col).value
        st.unmerge_cells(mcr)
        for row in st.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
            for cell in row:
                cell.value = top_left_cell_value
wb.save('merged_tmp.xlsx')