Using xlsxwriter to Align Left a Row - python

xlsxwriter has been pretty powerful and almost everything I want is working, but the following attempt to align left a single row doesn't seem to work.
stats = DataFrame(...)
xl_writer = ExcelWriter(r'U:\temp\test.xlsx')
stats.to_excel(xl_writer, 'Stats')
workbook = xl_writer.book
format_header = workbook.add_format({'align': 'left'})
stats_sheet = xl_writer.sheets['Stats']
stats_sheet.set_row(0, None, format_header)

See the XlsxWriter docs for Formatting of the Dataframe headers:
Pandas writes the dataframe header with a default cell format. Since it is a cell format it cannot be overridden using set_row(). If you wish to use your own format for the headings then the best approach is to turn off the automatic header from Pandas and write your own. For example...
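The approach the docs describe could be sketched as follows. This is a minimal sketch, not the docs' exact example: the sample data and the temporary output path are hypothetical stand-ins for the `stats` frame and file in the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the `stats` frame in the question.
stats = pd.DataFrame({'mean': [1.5, 2.5], 'count': [10, 20]})

# Hypothetical output path; substitute your own.
path = os.path.join(tempfile.mkdtemp(), 'test.xlsx')

with pd.ExcelWriter(path, engine='xlsxwriter') as xl_writer:
    # Turn off the pandas header and start the data on row 1,
    # leaving row 0 free for a hand-written header.
    stats.to_excel(xl_writer, sheet_name='Stats', startrow=1, header=False)

    workbook = xl_writer.book
    stats_sheet = xl_writer.sheets['Stats']
    format_header = workbook.add_format({'align': 'left'})

    # Column 0 holds the index, so the column names start at column 1.
    for col_num, name in enumerate(stats.columns):
        stats_sheet.write(0, col_num + 1, name, format_header)
```

Because the header cells are now written with an explicit format, there is no pandas cell format left to override.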

When reading excel files with pandas, what determines the datatype of the cells being read?

I am reading an excel sheet and plucking data from rows containing the given PO.
import pandas as pd
xlsx = pd.ExcelFile('Book2.xlsx')
df = pd.read_excel(xlsx)
PO_arr = ['121121','212121']
for i in PO_arr:
    PO = i
    PO_DATA = df.loc[df['PONUM'] == PO]
    for i in range(1, max(PO_DATA['POLINENUM'].values) +1):
When I take this Excel sheet straight from its source, my code works fine. But when I cut out only the rows I want and paste them to a new spreadsheet with the exact same formatting and read this new spreadsheet, I have to change PO_DATA to look for an integer instead of a string as such:
PO_DATA = df.loc[df['PONUM'] == int(PO)]
If not, I get an error, and calling PO_DATA returns an empty dataframe.
C:\...\pandas\core\ops\array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
res_values = method(rvalues)
I checked the cell formatting in Excel and in both cases, they are formatted as 'General' cells.
What is going on that makes it so when I chop up my spreadsheet, I have to look for an integer and not a string? What do I have to do to make it work for sheets I've created and pasted relevant data into instead of only sheets from the source?
Excel can change how values are stored when you copy and paste with Ctrl+C and Ctrl+V.
I am sure you tried these, but:
A) Copy with Ctrl+C, then paste into the new sheet or file with Ctrl+Alt+V, V, Enter (Paste Special, values only).
B) Use the Format Painter (the paintbrush on the Home tab): select the properly formatted cells first, double-click Format Painter, switch to your new file or sheet, and select the cells whose format should match.
C) Select the table you pasted into the new file, click the eraser icon (Clear) on the Home tab, and choose Clear Formats.
Update: I found an old related thread that didn't directly answer the question but did solve the problem.
You can force pandas to import values as a specific datatype when reading from Excel by passing the converters argument to read_excel.
df = pd.read_excel(xlsx, converters={'POLINENUM':int,'PONUM':int})
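To see the effect of converters in isolation, here is a small sketch. It assumes openpyxl is installed, and it builds a hypothetical in-memory workbook in place of Book2.xlsx, with PONUM stored as numbers. Mapping the column to str instead of int is a variation on the line above that keeps the string comparison from the question working.

```python
import io

import pandas as pd

# Hypothetical stand-in for Book2.xlsx: PONUM is stored as numbers.
buf = io.BytesIO()
pd.DataFrame({'PONUM': [121121, 212121], 'POLINENUM': [1, 2]}).to_excel(
    buf, index=False, engine='openpyxl')

# Default read: pandas infers an integer column from the stored values.
buf.seek(0)
as_read = pd.read_excel(buf)

# Forced read: every PONUM value is passed through str().
buf.seek(0)
as_text = pd.read_excel(buf, converters={'PONUM': str})

# The string lookup from the question now matches one row.
PO_DATA = as_text.loc[as_text['PONUM'] == '121121']
```

Whichever type you pick, the point is to make it explicit so that the comparison no longer depends on how Excel happened to store the pasted cells.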

openpyxl how to set cell format as Date instead of Custom

I am using openpyxl and pandas to generate an Excel file, and need to have dates formatted as Date in Excel. The dates in the exported file are displayed correctly in dd/mm/yyyy format, but when I right-click a cell and go to 'Format Cells' it shows Custom. Is there a way to change it to Date? Here is the code where I specify the date format.
writer = pd.ExcelWriter(dstfile, engine='openpyxl', date_format='dd/mm/yyyy')
I have also tried setting cell.number_format = 'dd/mm/yyyy', but I still get the Custom format in Excel.
The answer can be found in the comments of Converting Data to Date Type When Writing with Openpyxl.
Ensure you are writing a datetime.datetime object to the cell, then:
.number_format = 'mm/dd/yyyy;#' # notice the ';#'
e.g.,
import datetime
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws['A1'] = datetime.datetime(2021, 12, 25)
ws['A1'].number_format = 'yyyy-mm-dd;#'
wb.save(r'c:\data\test.xlsx')
n.b. these dates are still a bit 'funny' as they are not auto-magically grouped into months and years in pivot tables (if you like that sort of thing). In the pivot table, you can manually click on them and set the grouping though: https://support.microsoft.com/en-us/office/group-or-ungroup-data-in-a-pivottable-c9d1ddd0-6580-47d1-82bc-c84a5a340725
You might have to convert them to datetime objects in Python if they are stored as strings in the data frame. One approach is to iterate over the cells after writing with ExcelWriter and convert each one (here cell is an openpyxl cell such as ws['A1']):
from datetime import datetime
cell.value = datetime.strptime('30/12/1999', '%d/%m/%Y')
cell.number_format = 'dd/mm/yyyy'
A better approach is to convert that column in the data frame before writing. You can use the to_datetime function in pandas for that.
See this answer for converting the whole column in the dataframe.
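A minimal sketch of that column conversion, assuming the dates are stored as dd/mm/yyyy strings; the frame and the column name 'date' are hypothetical:

```python
import pandas as pd

# Hypothetical frame with dates stored as dd/mm/yyyy strings.
df = pd.DataFrame({'date': ['30/12/1999', '25/12/2021']})

# Parse the whole column at once; an explicit format avoids
# day/month ambiguity.
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')
```

Once the column holds real datetimes, the writer can store them as dates rather than as text.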

How do I ensure 'read_excel' on Pandas reads the correct sheet?

The following piece of code is getting the data from Excel in the 5th row and the 14th row:
import pandas as pd
import pymssql
df=[]
fp = "G:\\Data\\Hotels\\ABZPD - Daily Strategy Tool.xlsm"
data = pd.read_excel(fp,sheet_name ="CRM View" )
row_date = data.loc[2, :]
row_sita = "ABZPD"
row_event = data.iloc[11, :]
df = pd.DataFrame({'date': row_date,
                   'sita': row_sita,
                   'event': row_event
                   })
print(df)
However, it is not actually using the worksheet I need it to. Instead of using "CRM View" (like I told it to!) it is using the worksheet "Previous CRM View". I assume this is because both worksheets have similar names.
So the question is, how do I get it to use the one that is called "CRM View"?
I was able to reproduce your problem. It did not seem to be caused by the similar sheet names: pandas simply read the first sheet in the file, whatever value was passed as sheet_name.
It looked like a bug, so I checked which version of pandas I was running: 0.20.3. After updating to 0.22.0 the problem was gone and the correct sheet was selected.
Edit: this was apparently a known bug in 0.20.3.
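A quick way to check which pandas version is installed and which sheet is actually being read could look like the sketch below. It assumes openpyxl is installed and uses a hypothetical in-memory workbook with the two sheet names from the question instead of the real .xlsm file.

```python
import io

import pandas as pd

# Shows which pandas release is installed.
print(pd.__version__)

# Hypothetical workbook with the two sheet names from the question.
buf = io.BytesIO()
with pd.ExcelWriter(buf, engine='openpyxl') as writer:
    pd.DataFrame({'a': [1]}).to_excel(writer, sheet_name='Previous CRM View', index=False)
    pd.DataFrame({'a': [2]}).to_excel(writer, sheet_name='CRM View', index=False)
buf.seek(0)

xlsx = pd.ExcelFile(buf)
sheet_names = xlsx.sheet_names          # sheets in workbook order
data = pd.read_excel(xlsx, sheet_name='CRM View')
```

If data holds the contents of the first sheet rather than the one you named, the installed version is likely affected.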

How to apply conditional formatting in openpyxl?

I am using openpyxl to manipulate a Microsoft Excel Worksheet.
What I want to do is to add a Conditional Formatting Rule that fills the rows with a given colour if the row number is even, leaves the row blank if not.
In Excel this can be done by selecting all the worksheet, creating a new formatting rule with the text =MOD(ROW();2)=0 or =EVEN(ROW()) = ROW().
I tried to implement this behaviour with the following lines of code (considering for example the first 10 rows):
redFill = PatternFill(start_color='EE1111', end_color='EE1111', fill_type='solid')
ws2.conditional_formatting.add('A1:A10', FormulaRule(formula=['MOD(ROW();2) = 0'], stopIfTrue=False, fill=redFill))
My program runs correctly but when I try to open the output Excel file, it tells me that the file contains unreadable content and it asks me if I want to recover the worksheet content. By clicking yes, the worksheet is what I expect but there is no formatting.
What is the correct way to apply such a formatting in openpyxl (possibly to the entire worksheet)?
Unfortunately, the way formulae are handled in conditional formatting is particularly opaque. The best thing to do is to create a file with the relevant conditional format and inspect the relevant file by unzipping it. The rules are stored in the relevant worksheet files and the formats in the styles file.
However, I suspect that the problem may simply be that you are using ";" to separate the parameters of the function: you must always use commas for this. The semicolon is only how some locales display formulas in the Excel interface.
A sample formula from one of my projects:
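from openpyxl.formatting.rule import Rule
from openpyxl.styles import Font, PatternFill
from openpyxl.styles.differential import DifferentialStyle
green_text = Font(color="006100")
green_fill = PatternFill(bgColor="C6EFCE")
dxf2 = DifferentialStyle(font=green_text, fill=green_fill)
r3 = Rule(type="expression", dxf=dxf2)
r3.formula = ["AND(ISNUMBER(C2), C2>=400)"]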
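Applied to the even-row case from the question, the comma-separated formula might look like this. It is a sketch under assumptions: the range 'A1:J10' is arbitrary and chosen only for illustration, and the workbook is saved to an in-memory buffer rather than a real file.

```python
import io

from openpyxl import Workbook
from openpyxl.formatting.rule import FormulaRule
from openpyxl.styles import PatternFill

wb = Workbook()
ws2 = wb.active

red_fill = PatternFill(start_color='EE1111', end_color='EE1111', fill_type='solid')

# Commas, not semicolons, separate the function arguments,
# and the formula has no leading "=".
even_row_rule = FormulaRule(formula=['MOD(ROW(),2)=0'], stopIfTrue=False, fill=red_fill)

# 'A1:J10' is an arbitrary range for illustration; widen it as needed.
ws2.conditional_formatting.add('A1:J10', even_row_rule)

# Save to memory here; pass a file path to wb.save() to write to disk.
buf = io.BytesIO()
wb.save(buf)
```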

how to combine merge_range and write_formula with xlsxwriter python

This is the code I'm trying, but it's not working.
import xlsxwriter
....
sheet.merge_range.write_formula('F16:H16', """IF('Original data'!B4<>"",'Original data'!B4,"")""", center)
Is there a way to combine the two into one call? I've already done some research and haven't found anything. Thanks in advance.
From the docs on merge_range():
The merge_range() method writes its data argument using write(). Therefore it will handle numbers, strings and formulas as usual. If this doesn’t handle your data correctly then you can overwrite the first cell with a call to one of the other write_*() methods using the same Format as in the merged cells. See Example: Merging Cells with a Rich String.
Here is a small working example based on yours:
import xlsxwriter
workbook = xlsxwriter.Workbook('example.xlsx')
worksheet1 = workbook.add_worksheet()
worksheet2 = workbook.add_worksheet('Original data')
center = workbook.add_format({'align': 'center', 'fg_color': 'yellow'})
worksheet1.merge_range('F16:H16',
                       """=IF('Original data'!B4<>"",'Original data'!B4,"")""",
                       center)
worksheet2.write('B4', 'Hello')
workbook.close()