How to format a range of cells in Python using xlsxwriter

How do I apply a format to a range of cells?
I am looping over a list of column names from Oracle and formatting a column as a date when its name starts with "DT". I also want the entire data range to have borders.
I would like to apply the date format to the columns and then separately apply the borders, but the last format applied wins, so applying the borders overwrites the date formatting on the columns.
Ideally I want to blast the data range with borders and then apply date formats to the date columns while retaining the borders.
Can you select a range and then apply formatting, or do range intersections, as you can in VBA?
import datetime
import xlsxwriter
# col_names and results are assumed to come from the Oracle query (not shown).
# Generate EXCEL File
xl_filename = "DQ_Valid_Status_Check.xlsx"
workbook = xlsxwriter.Workbook(xl_filename)
# Add a bold format to use to highlight cells.
bold = workbook.add_format({'bold': True})
date_format = workbook.add_format(
    {'num_format': 'dd-mmm-yyyy hh:mm:ss'})
border = workbook.add_format()
border.set_bottom()
border.set_top()
border.set_left()
border.set_right()
worksheet_info = workbook.add_worksheet()
worksheet_info.name = "Information"
worksheet_info.write('A1', 'Report Description:', bold)
worksheet_info.write('B1', 'ARIEL Data Quality Report for Checking Authorisation Status of Marketing Applications')
worksheet_info.write('A2', 'Report Date:', bold)
worksheet_info.write('B2', datetime.datetime.now(), date_format)
worksheet_data = workbook.add_worksheet()
worksheet_data.name = "DQ Report"
worksheet_data.write_row('A1', col_names)
for i in range(len(results)):
    print("result " + str(i) + ' of ' + str(len(results)))
    print(results[i])
    worksheet_data.write_row('A' + str(i + 2), results[i])
    #worksheet_data.set_row(i + 2, None, border)
# add borders
for i in range(len(results)):
    worksheet_data.set_row(i + 2, None, border)
# format date columns
for i in range(len(col_names)):
    col_name = col_names[i]
    if col_name.startswith("DT"):
        print(col_name)
        worksheet_data.set_column(i, i, None, date_format)
workbook.close()

According to the FAQ, it is not currently possible to format a range of cells at once, but a future feature might allow this.
You could create Format objects containing multiple format properties and apply your custom format to each cell as you write to it. See "Creating and using a Format Object".

To apply borders to all columns at once you can do something like:
border = workbook.add_format({'border':2})
worksheet_info.set_column(first_col=0, last_col=10, cell_format=border)
And to retain the border format you can modify your date_format to:
date_format = workbook.add_format(
    {'num_format': 'dd-mmm-yyyy hh:mm:ss',
     'border': 2})
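Because the last format applied to a cell wins, one way to combine the two answers above is to build every format from shared property dictionaries and pick the format per column while writing. The following is only a minimal sketch under that assumption: the helper names format_props_for and write_table are hypothetical and not part of XlsxWriter, and the workbook and worksheet arguments are expected to be ordinary XlsxWriter objects.

```python
# Shared border properties; 'border': 1 is a thin border on all four sides.
BORDER_PROPS = {'border': 1}
DATE_PROPS = {'num_format': 'dd-mmm-yyyy hh:mm:ss'}


def format_props_for(col_name):
    """Return the format properties for a column (hypothetical helper)."""
    props = dict(BORDER_PROPS)          # every data cell gets a border
    if col_name.startswith("DT"):
        props.update(DATE_PROPS)        # date columns also get the date format
    return props


def write_table(workbook, worksheet, col_names, rows):
    """Write a header row and data rows, formatting each cell as it is written.

    `workbook` and `worksheet` are expected to be XlsxWriter objects.
    """
    # Create one Format object per column up front and reuse it for every row.
    formats = [workbook.add_format(format_props_for(name)) for name in col_names]
    worksheet.write_row(0, 0, col_names)
    for row_idx, row in enumerate(rows, start=1):
        for col_idx, value in enumerate(row):
            worksheet.write(row_idx, col_idx, value, formats[col_idx])
```

With the question's variables this might be called as write_table(workbook, worksheet_data, col_names, results) before workbook.close(). Creating the formats once per column, rather than once per cell, keeps the number of Format objects small.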

Related

xlsxwriter conditional format works only after manually applying it

Let me describe my issue below.
I have two Excel worksheets, one containing past data and the other current data. They both have the following structure:
Col_1  Col_2  KEY    Col_3  Etc.
abc    xyz    key_1  foo    ---
def    zyx    key_2  bar    ---
Now, the goal is to check whether the value for a given key changed between the past and current iteration and, if so, to color the cell's background in the current data worksheet. This check has to be done for all the columns.
As the KEY column is not the very first one, I decided to use the XLOOKUP function and apply the formatting within a for loop. The full loop looks like this (in this example the KEY column is column C):
dark_blue = writer.book.add_format({'bg_color': '#3A67B8'})
old_sheet = "\'" + "old_" + "sheet_name" + "\'"
for col in range(last_col):
    col_name = xl_col_to_name(col)
    if col_name in unformatted_cols:  # Do not apply the formatting to certain columns
        continue
    else:
        apply_range = '{0}1:{0}1048576'.format(col_name)
        formula = "XLOOKUP(C1, {1}!C1:C1048576, {1}!{0}1:{0}1048576) <> XLOOKUP(C1, C1:C1048576, {0}1:{0}1048576)".format(col_name, old_sheet)
        active_sheet.conditional_format(apply_range, {'type': 'formula',
                                                      'criteria': formula,
                                                      'format': dark_blue})
Now, my problem is that when I open the output file, this conditional formatting doesn't work. However, if I go to Conditional Formatting -> Manage Rules -> Edit Rule, press OK without editing anything, and then click Apply, it starts working correctly.
Does anyone know how to make this rule work properly without this manual intervention?
All my other conditional formatting rules, though simpler, work exactly as intended.
# This is the formula that I see in Python for the first loop iteration
=XLOOKUP(C1, 'old_sheet_name'!C1:C1048576, 'old_sheet_name'!A1:A1048576) <> XLOOKUP(C1, C1:C1048576, A1:A1048576)
# This formula I see in Excel for the same first column
=XLOOKUP(C1, 'old_sheet_name'!C:C, 'old_sheet_name'!A:A) <> XLOOKUP(C1, C:C, A:A)
The reason that XLOOKUP doesn't work in your formula is that it is classified by Excel as a "Future Function", i.e., a function added after the original file format. In order to use it you need to prefix it with _xlfn. (including the trailing dot), so that it is written as _xlfn.XLOOKUP.
This is explained in the XlsxWriter docs on Formulas added in Excel 2010 and later.
Here is a working example:
import xlsxwriter
workbook = xlsxwriter.Workbook('conditional_format.xlsx')
worksheet1 = workbook.add_worksheet('old_sheet_name')
worksheet2 = workbook.add_worksheet('new_sheet_name')
worksheet1.write(0, 0, 'Foo')
format1 = workbook.add_format({'bg_color': '#C6EFCE',
                               'font_color': '#006100'})
xlookup_formula = '=_xlfn.XLOOKUP(C1, old_sheet_name!C:C, old_sheet_name!A:A) <> _xlfn.XLOOKUP(C1, C:C, A:A)'
worksheet2.conditional_format('D1:D10',
                              {'type': 'formula',
                               'criteria': xlookup_formula,
                               'format': format1})
workbook.close()
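If formulas are generated in a loop, as in the question, the prefix could be added by a small helper instead of by hand. This is a rough sketch, not something XlsxWriter provides: the name add_xlfn_prefix and the short FUTURE_FUNCTIONS tuple are hypothetical, and the tuple is only an illustrative subset of the functions listed in the XlsxWriter docs. It does a plain text substitution, so a function name followed by a parenthesis inside a quoted string in the formula would be prefixed too.

```python
import re

# Illustrative subset of Excel "future functions"; not the complete list.
FUTURE_FUNCTIONS = ("XLOOKUP", "XMATCH", "IFS", "TEXTJOIN", "CONCAT")


def add_xlfn_prefix(formula, names=FUTURE_FUNCTIONS):
    """Prefix each listed function call with '_xlfn.' unless already prefixed."""
    pattern = re.compile(
        # The lookbehind skips names that are part of a longer identifier
        # (COUNTIFS) or already prefixed (_xlfn.XLOOKUP); the lookahead
        # requires an opening parenthesis, so CONCATENATE is left alone.
        r"(?<![A-Za-z0-9_.])(" + "|".join(map(re.escape, names)) + r")(?=\()"
    )
    return pattern.sub(r"_xlfn.\1", formula)


formula = "=XLOOKUP(C1, old_sheet_name!C:C, old_sheet_name!A:A) <> XLOOKUP(C1, C:C, A:A)"
print(add_xlfn_prefix(formula))
```

The returned string can then be passed as the 'criteria' value of conditional_format().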

Using gspread, trying to add a column at the end of Google Sheet that already exists

Here is the code I am working with.
dfs=dfs[['Reserved']] #the column that I need to insert
dfs=dfs.applymap(str) #json did not accept the nan so needed to convert
sh=gc.open_by_key('KEY') #would open the google sheet
sh_dfs=sh.get_worksheet(0) #getting the worksheet
sh_dfs.insert_rows(dfs.values.tolist()) #inserts the dfs into the new worksheet
Running this code inserts the rows starting at the first column of the worksheet, but what I am trying to accomplish is adding the column at the very end, in column P.
In your situation, how about the following modification? It first retrieves the maximum column count, converts the number of the next column to a column letter, and then puts the values into that column, directly after the last used column.
From:
sh_dfs.insert_rows(dfs.values.tolist())
To:
# Ref: https://stackoverflow.com/a/23862195
def colnum_string(n):
    string = ""
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        string = chr(65 + remainder) + string
    return string
values = sh_dfs.get_all_values()
col = colnum_string(max([len(r) for r in values]) + 1)
sh_dfs.update(col + '1', dfs.values.tolist(), value_input_option='USER_ENTERED')
Note:
If an error such as "exceeds grid limits" occurs, please insert a blank column at the end of the sheet first.
Reference: the update method of the gspread Worksheet class
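The column arithmetic in this answer can be checked without a spreadsheet. The sketch below reuses colnum_string from the answer and adds two hypothetical helpers, next_free_column and as_column, that are not part of gspread; `values` stands for the list of rows that get_all_values() returns.

```python
def colnum_string(n):
    """Convert a 1-based column number to its letter (1 -> 'A', 27 -> 'AA')."""
    string = ""
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        string = chr(65 + remainder) + string
    return string


def next_free_column(values):
    """Return the letter of the first column after the widest row."""
    widest = max((len(row) for row in values), default=0)
    return colnum_string(widest + 1)


def as_column(items):
    """Shape a flat list as one-cell rows so it fills a column top to bottom."""
    return [[str(item)] for item in items]


rows = [["a"] * 15, ["b"] * 3]      # a sheet whose widest row uses columns A to O
print(next_free_column(rows))       # the column the new data would start in
print(as_column([1, None, "x"]))
```

A single-column DataFrame converted with dfs.values.tolist() already has the one-cell-per-row shape that as_column produces, which is why the answer can pass it to update directly.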

Number format appears to be affecting border format in xlsxwriter

I'm new to Python, and attempting to automate a report at my workplace to save time, space, and trouble. The report runs just fine, and almost all of my code to write the results into an Excel document works as expected as well. However, these two formats:
percent = wb.add_format({'num_format': '0.0%','border':1,'border_color':'white'})
integer = wb.add_format({'num_format': '#,##0','border':1,'border_color':'white'})
are behaving oddly. When I run this:
i = 10
for lob in report.index.get_level_values(1).unique():
    if report.loc[(program,lob)].sum().sum()==0:
        pass
    else:
        place=report.loc[(program,lob)]
        r=0
        for year in place.index:
            for item in range(8):
                ws.write(i+r,item+2,place.loc[year][item],integer)
            for item in range(9):
                ws.write(i+r,item+10,place.loc[year][item+8],percent)
            r+=1
        for col_num, value in enumerate(report.columns.values):
            ws.write(i-1, col_num + 2, value, headers)
        ws.write(i-1,1,lob,lobtitle)
        for row_num, year in enumerate(report.index.get_level_values(2).unique()):
            ws.write(i+row_num,1,year,bold)
        ws.set_row(i-1,40)
        ws.set_row(i+7,None,bold)
        i+=10
The first eight stats are written in my "integer" format with white borders, but the next nine in the row are written with the percent number format and no border formatting at all (leaving the default Excel lines). In fact, throughout the report, anything I write with the "integer" format works out, and anything written with the "percent" format gets the correct number format without the border format.
The apparent simplicity of this issue is driving me crazy. Thanks for any help you can provide.
For reference, here's the full code. 'report' is a multi index dataframe with company programs as level 0, line of business (lob) as level 1, and the years 2015-2020 as level 2.
#Establish common formats
wb=xl.Workbook('Report.xlsx')
title=wb.add_format({'font_size':16,'font_name':'Calibri','align':'center','border':1,'border_color':'white'})
subtitle=wb.add_format({'font_size':14,'font_name':'Calibri','align':'center','border':1,'border_color':'white'})
blank=wb.add_format({'bg_color':'white'})
black=wb.add_format({'bg_color':'black'})
bold=wb.add_format({'bold':True,'border_color':'white'})
lobtitle=wb.add_format({'bold':True,'italic':True,'font_size':14})
wrap=wb.add_format({'text_wrap':True})
headers=wb.add_format({'bold':True,'text_wrap':True,'bg_color':'#DCDCDC','align':'center'})
percent = wb.add_format({'num_format': '0.0%','border':1,'border_color':'white'})
integer = wb.add_format({'num_format': '#,##0','border':1,'border_color':'white'})
shadepercent = wb.add_format({'num_format': '0.0%','border':1,'border_color':'white','bg_color':'#DCDCDC'})
shadeinteger = wb.add_format({'num_format': '#,##0','border':1,'border_color':'white','bg_color':'#DCDCDC'})
shadebold=wb.add_format({'bold':True,'border_color':'white','bg_color':'#DCDCDC'})
gridinteger=wb.add_format({'num_format': '#,##0','border':1,'border_color':'gray'})
gridpercent=percent = wb.add_format({'num_format': '0.0%','border':1,'border_color':'gray'})
#For every program, blank out all cells and add company title.
for program in report.index.get_level_values(0).unique():
    ws=wb.add_worksheet(program)
    for j in range(100):
        ws.set_row(j,None,blank)
    ws.set_column('B:S',15)
    ws.write(0,10,'Company Title',title)
    ws.write(1,10,'Report Name',subtitle)
    ws.write(2,10,'as of {}'.format(effective),subtitle)
    ws.write(3,10,'Detail',subtitle)
    #Check each lob within a program for nonzero values. For each nonzero lob, write the lob's stats.
    #Write the nonzero lob and its policy years from the index, and drop ten rows for the next entry.
    i = 10
    for lob in report.index.get_level_values(1).unique():
        if report.loc[(program,lob)].sum().sum()==0:
            pass
        else:
            place=report.loc[(program,lob)]
            r=0
            for year in place.index:
                for item in range(8):
                    ws.write(i+r,item+2,place.loc[year][item],integer)
                for item in range(9):
                    ws.write(i+r,item+10,place.loc[year][item+8],percent)
                r+=1
            for col_num, value in enumerate(report.columns.values):
                ws.write(i-1, col_num + 2, value, headers)
            ws.write(i-1,1,lob,lobtitle)
            for row_num, year in enumerate(report.index.get_level_values(2).unique()):
                ws.write(i+row_num,1,year,bold)
            ws.set_row(i-1,40)
            ws.set_row(i+7,None,bold)
            i+=10
The sample code isn't complete enough to say what the issue is but there shouldn't be any issue with the formats as this example shows:
import xlsxwriter
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
data1 = [1000, 1001, 1002]
data2 = [.35, .50, .75]
percent = workbook.add_format({'num_format': '0.0%', 'border': 1, 'border_color': 'white'})
integer = workbook.add_format({'num_format': '#,##0', 'border': 1, 'border_color': 'white'})
worksheet.write_column(2, 2, data1, integer)
worksheet.write_column(2, 4, data2, percent)
workbook.close()
As a guess the program may be overwriting the percent cells with another format but it isn't possible to tell without a complete working example.
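Update, based on the updated code from the OP:
From your updated code it looks like you are overwriting the percent format here:
gridpercent = percent = wb.add_format(...)
This chained assignment rebinds percent to the same new gray-bordered format as gridpercent, replacing the white-bordered percent format defined earlier.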
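The effect of the chained assignment can be reproduced without XlsxWriter. In this small sketch make_format is a hypothetical stand-in for wb.add_format() that simply returns a new dict on each call.

```python
def make_format(props):
    """Stand-in for wb.add_format(); returns a new object on each call."""
    return dict(props)


percent = make_format({'num_format': '0.0%', 'border': 1, 'border_color': 'white'})
original_percent = percent

# Chained assignment binds BOTH names to the single newly created object.
gridpercent = percent = make_format({'num_format': '0.0%', 'border': 1, 'border_color': 'gray'})

print(percent is gridpercent)        # True: one object, two names
print(percent is original_percent)   # False: the white-bordered format is no longer `percent`
print(percent['border_color'])       # gray
```

Writing the line as gridpercent = wb.add_format(...) on its own leaves percent pointing at the white-bordered format.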

Xlsxwriter - Dynamically change the formatting based on column label

I am trying to define the formatting that needs to be applied to each column of an excel spreadsheet based on the column name.
For example, if column name is 'count' then 'number_format' needs to be used. If column name is 'sale_date' then 'date_format' needs to be used.
number_format = workbook.add_format({'num_format': '0', 'font_size': 12})
date_format = workbook.add_format({'num_format': 'dd/mm/yyyy hh:mm:ss', 'font_size': 12})
Using the above two formats in the respective columns as shown below:
worksheet1.write('A1', 'count', number_format)
worksheet1.write('B1', 'sale_date', date_format)
Could I make this dynamic based on the column name instead of defining the format by column label? Thanks.
Update:
Loop that displays the header column in the excel spreadsheet
for data in title:
    worksheet.write(row, col, data, number_format)
    col += 1
Comment: date_format = workbook.add_format({'num_format': 'dd/mm/yy'}), shows the date column as unix number rather than a proper date.
Sample value shown is : 42668 instead of displaying "24-10-16".
This is the default behavior of Excel for Windows, which stores dates as serial numbers (the number of days counted from the start of 1900). A cell displays a date only when a date number format is applied to it.
Documentation: XlsxWriter Working with Dates and Time
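To see how a serial number relates to a calendar date, the conversion can be sketched in plain Python. This assumes Excel's default 1900 date system and is only reliable for dates from 1 March 1900 onward; the function names are hypothetical. XlsxWriter performs this conversion itself when a datetime object is written, so the sketch is for understanding the stored numbers rather than something to call before writing.

```python
from datetime import datetime, timedelta

# Day zero commonly used for the 1900 date system; the two-day offset from
# 1 January 1900 absorbs Excel's fictitious 29 February 1900.
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel serial number (1900 date system) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


def datetime_to_serial(value):
    """Convert a datetime to an Excel serial number (1900 date system)."""
    delta = value - EXCEL_EPOCH
    return delta.days + delta.seconds / 86400


print(serial_to_datetime(42370))                       # 2016-01-01 00:00:00
print(datetime_to_serial(datetime(2016, 1, 1, 12)))    # 42370.5
```

The fractional part of the serial number is the time of day, which is why noon on that date maps to a value ending in .5.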
Comment: ...that I could use the appropriate format based on column name (namely count, sale_date)
You can use worksheet.set_column() to set a Style for a whole Column.
Documentation: XlsxWriter worksheet.set_column()
Precondition: The Order of the Columns Name/Style must be in sync with your table.
E.g. count == 'A', sale_date == 'B' and so on...
from collections import OrderedDict
_styles = OrderedDict([('count', number_format), ('sale_date', date_format), ('total', number_format), ('text', string_format)])
for col, key in enumerate(_styles):
    A1_notation = '{c}:{c}'.format(c=chr(col + 65))
    worksheet.set_column(A1_notation, None, _styles[key])
    print("worksheet.set_column('{}', None, {})".format(A1_notation, _styles[key]))
Output:
worksheet.set_column('A:A', None, number_format)
worksheet.set_column('B:B', None, date_format)
worksheet.set_column('C:C', None, number_format)
worksheet.set_column('D:D', None, string_format)
For subsequent writes you don't need to assign a style, e.g. use
worksheet.write('A1', 123)
will default to A:A number_format
Question: Could I make this dynamic based on the column name
You are not using "column name", it's called Cell A1 Notation.
Setup a mapping dict, for example:
style_map = {'A': number_format, 'B':date_format}
Usage:
Note: This will only work with single letter, from A to Z
def write(A1_notation, value):
    worksheet1.write(A1_notation, value, style_map[A1_notation[0]])
For Row-column notation (0, 0), key the map by integer column index:
style_map = {0: number_format, 1: date_format}
Usage:
def write(row, col, value):
    worksheet1.write(row, col, value, style_map[col])
To keep A1 notation with the integer-keyed map, convert the cell reference to a column index first:
from xlsxwriter.utility import xl_cell_to_rowcol
def write(A1_notation, value):
    worksheet1.write(A1_notation, value, style_map[xl_cell_to_rowcol(A1_notation)[1]])
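If the header names themselves should drive the formatting, as the question asks, the map can be keyed by name and resolved to column indices while iterating over the header row. This is a sketch only: format_for_header and apply_column_formats are hypothetical helpers, and the worksheet argument is expected to offer XlsxWriter's set_column(first_col, last_col, width, cell_format).

```python
def format_for_header(header, formats_by_name, default=None):
    """Look up the format registered for a header name."""
    return formats_by_name.get(header, default)


def apply_column_formats(worksheet, headers, formats_by_name):
    """Set a default column format from each header name; return what was applied."""
    applied = {}
    for col, header in enumerate(headers):
        fmt = format_for_header(header, formats_by_name)
        if fmt is not None:
            # Numeric indices avoid the single-letter limit of chr(col + 65).
            worksheet.set_column(col, col, None, fmt)
            applied[col] = fmt
    return applied
```

With the formats from the question this might be called as apply_column_formats(worksheet1, title, {'count': number_format, 'sale_date': date_format}). A format passed explicitly to write() still takes precedence over the column default for that cell.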

python openpyxl time insert without date

I have a value that I want to insert into excel as time formatted like HH:MM
If I use
cellFillTime = Style(fill = PatternFill(start_color=shiftColor,end_color=shiftColor,fill_type='solid'),
    border = Border(left=Side(style='thin'),right=Side(style='thin'),top=Side(style='thin'),bottom=Side(style='thin')),
    alignment=Alignment(wrap_text=True),
    number_format='HH:MM')
valM = 8
cellData = ws5.cell(column= a + 3, row= i+2, value=valM)
_cellStlyed = cellData.style = cellFillTime
I always get 1.1.1900 8:00:00 in the Excel worksheet.
The problem is that the SUM functions I use later do not work on these cells.
How can I remove the date from my cell when formatting it, so that I get only time values?
thank you
best regards
This worked for inserting only the hour.
UPDATED CODE
cellFillTime = Style(fill = PatternFill(start_color=shiftColor,end_color=shiftColor,fill_type='solid'),
    border = Border(left=Side(style='thin'),right=Side(style='thin'),top=Side(style='thin'),bottom=Side(style='thin')),
    alignment=Alignment(wrap_text=True)
    )
if rrr["rw_seqrec"] == 1 or rrr["rw_seqrec"] == 1001:
    val_ = abs((rrr['rw_end'] - rrr['rw_start'])) / 60
    #print "val_ ", val_
    valM = datetime.datetime.strptime(str(val_), '%H').time()
    cellData = ws5.cell(column= a + 3, row= i+2, value=valM)
    cellData.style = cellFillTime
    cellData.number_format='HH:MM'
The problem I have now is that Excel still does not want to sum the time fields. It has something to do with the field format.
Any suggestions?
So the final catch was to also set the right time format, both on the cells containing hours and on the cell that contains the SUM formula:
_cell.number_format='[h]:mm;#'
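The reason the bracketed format matters can be shown with plain arithmetic: Excel stores a time as a fraction of a day, and a plain hh:mm format wraps at 24 hours, whereas [h]:mm displays the total elapsed hours. The sketch below uses hypothetical helper names and made-up shift lengths, and format_elapsed only approximates how Excel would render the value.

```python
def hours_to_day_fraction(hours):
    """Excel stores time as a fraction of a day, so 8 hours is 8/24."""
    return hours / 24


def format_elapsed(day_fraction):
    """Render a day fraction roughly the way the '[h]:mm' format would."""
    total_minutes = round(day_fraction * 24 * 60)
    hours, minutes = divmod(total_minutes, 60)
    return "{}:{:02d}".format(hours, minutes)


shifts = [8, 7.5, 9]                                    # hypothetical shift lengths in hours
total = sum(hours_to_day_fraction(h) for h in shifts)   # what SUM() would add up
print(format_elapsed(total))                            # 24:30
```

With a plain hh:mm format the same total would be displayed as 00:30, because the whole day is dropped from the display.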
