Can Pandas to_excel support hyperlink style now? - python

I can't find an answer (or one I know how to implement) when it comes to using the excel "hyperlink" style for a column when exporting using pd.to_excel.
I can find plenty of (OLD) answers on using xlsxwriter or openpyxl. But none using the current pandas functionality.
I think it might be possible now with the updates to the .style function? But I don't know how to implement the CSS2.2 rules to emulate the hyperlink style.
import pandas as pd
df = pd.DataFrame({'ID':1, 'link':['=HYPERLINK("http://www.someurl.com", "some website")']})
df.to_excel('test.xlsx')
The desired output is for the link column to be the standard blue underlined text that turns purple once you have clicked the link.
Is there a way to use the built-in Excel styling? Or would you have to pass various CSS properties through a dictionary using .style?

Here is one way to do it using xlsxwriter as the Excel engine:
import pandas as pd
df = pd.DataFrame({'ID': [1, 2],
'link':['=HYPERLINK("http://www.python.org", "some website")',
'=HYPERLINK("http://www.python.org", "some website")']})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter objects from the dataframe writer object.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Get the default URL format.
url_format = workbook.get_default_url_format()
# Apply it to the appropriate column, and widen the column.
worksheet.set_column(2, 2, 40, url_format)
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Output, note that the second link has been clicked and is a different color:
Note: it would be preferable to use the XlsxWriter worksheet.write_url() method, since the result looks like a native Excel URL to the end user and doesn't need the trick above of getting and applying the URL format. However, unlike the formula, that method can't be used directly from a pandas dataframe, so you would need to iterate through the link column of the dataframe and programmatically overwrite the formulas with actual links.
Something like this:
import pandas as pd
df = pd.DataFrame({'ID': [1, 2],
'link':['=HYPERLINK("http://www.python.org", "some website")',
'=HYPERLINK("http://www.python.org", "some website")']})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('test2.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the worksheet handle.
worksheet = writer.sheets['Sheet1']
# Widen the column for clarity.
worksheet.set_column(2, 2, 40)
# Overwrite the urls
worksheet.write_url(1, 2, "http://www.python.org", None, "some website")
worksheet.write_url(2, 2, "http://www.python.org", None, "some website")
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Output:
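The iteration described above might look like the following minimal sketch. It assumes the link column holds formulas of exactly the form =HYPERLINK("url", "text"); parse_hyperlink and write_links are hypothetical helper names, not part of pandas or XlsxWriter.

```python
import re

import pandas as pd

# Matches formulas of the exact form =HYPERLINK("url", "text") (an assumption).
HYPERLINK_RE = re.compile(r'^=HYPERLINK\("([^"]*)",\s*"([^"]*)"\)$')


def parse_hyperlink(formula):
    """Return (url, text) from a HYPERLINK formula, or None if it does not match."""
    match = HYPERLINK_RE.match(formula)
    return match.groups() if match else None


def write_links(df, path, link_column='link'):
    """Write df to path, then overwrite the formulas in link_column with real links."""
    with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1')
        worksheet = writer.sheets['Sheet1']
        # +1 because the dataframe index occupies the first worksheet column.
        col = df.columns.get_loc(link_column) + 1
        worksheet.set_column(col, col, 40)
        # Data rows start at worksheet row 1, below the header row.
        for row, formula in enumerate(df[link_column], start=1):
            parsed = parse_hyperlink(formula)
            if parsed is not None:
                url, text = parsed
                worksheet.write_url(row, col, url, None, text)
```

With the dataframe from the example above, calling write_links(df, 'test3.xlsx') (an arbitrary file name) should replace each formula with a native link. Cells whose content does not match the assumed formula shape are left as pandas wrote them.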

Related

Write a pandas dataframe into an existing excel file [duplicate]

I am having trouble updating an Excel Sheet using pandas by writing new values in it. I already have an existing frame df1 that reads the values from MySheet1.xlsx. so this needs to either be a new dataframe or somehow to copy and overwrite the existing one.
The spreadsheet is in this format:
I have a python list: values_list = [12.34, 17.56, 12.45]. My goal is to insert the list values under Col_C header vertically. It is currently overwriting the entire dataframe horizontally, without preserving the current values.
df2 = pd.DataFrame({'Col_C': values_list})
writer = pd.ExcelWriter('excelfile.xlsx', engine='xlsxwriter')
df2.to_excel(writer, sheet_name='MySheet1')
workbook = writer.book
worksheet = writer.sheets['MySheet1']
How to get this end result? Thank you!
Below is a fully reproducible example of how you can modify an existing .xlsx workbook using pandas and the openpyxl module.
First, for demonstration purposes, I create a workbook called test.xlsx:
import pandas as pd
df = pd.DataFrame({'Col_A': [1,2,3,4],
'Col_B': [5,6,7,8],
'Col_C': [0,0,0,0],
'Col_D': [13,14,15,16]})
# Write the dataframe to a new workbook; pandas names the sheet 'Sheet1' by default.
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, index=False)
This is the Expected output at this point:
In this second part, we load the existing workbook ('test.xlsx') and modify the third column with different data.
from openpyxl import load_workbook
import pandas as pd
df_new = pd.DataFrame({'Col_C': [9, 10, 11, 12]})
wb = load_workbook('test.xlsx')
ws = wb['Sheet1']
for index, row in df_new.iterrows():
    # Row 1 holds the header, so dataframe row 0 goes to worksheet row 2.
    cell = 'C%d' % (index + 2)
    ws[cell] = row['Col_C']
wb.save('test.xlsx')
This is the Expected output at the end:
In my opinion, the easiest solution is to read the Excel file into a pandas DataFrame, modify it, and write it back out as an Excel file. For example:
Comments:
Import pandas as pd.
Read the Excel sheet into a pandas DataFrame.
Take your data, which could be in a list, and assign it to the column you want (just make sure the lengths are the same).
Save your DataFrame as an Excel file, either overwriting the old file or creating a new one.
Code:
import pandas as pd
ExcelDataInPandasDataFrame = pd.read_excel("./YourExcel.xlsx")
YourDataInAList = [12.34,17.56,12.45]
ExcelDataInPandasDataFrame["Col_C"] = YourDataInAList
ExcelDataInPandasDataFrame.to_excel("./YourNewExcel.xlsx", index=False)
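If the goal is only to drop new values under one header while leaving the rest of the sheet alone, newer pandas versions can write into an existing sheet directly. The sketch below assumes pandas 1.4 or later (for if_sheet_exists='overlay'), openpyxl installed, and a header in row 1 of the sheet; write_column is a hypothetical helper name.

```python
import pandas as pd


def write_column(path, sheet_name, values, startcol, startrow=1):
    """Overlay values onto one column of an existing sheet, keeping other cells."""
    column = pd.DataFrame({'values': values})
    with pd.ExcelWriter(path, engine='openpyxl', mode='a',
                        if_sheet_exists='overlay') as writer:
        # header=False and startrow=1 leave the existing header row untouched.
        column.to_excel(writer, sheet_name=sheet_name, startrow=startrow,
                        startcol=startcol, index=False, header=False)
```

For the question above, write_column('excelfile.xlsx', 'MySheet1', values_list, startcol=2) would target the third column, assuming that is where Col_C sits.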

Use data from a different sheet for table using xlsxwriter

I am using xlsxwriter to generate a file with quite a few formulas. From there, I want to create a table on another sheet. Everything is pretty straightforward until I want to use data from a different sheet for the table.
The documentation only shows examples of already having the data you need, and then passing that to the .add_table as the 'data' parameter.
What I am trying to do is this: (Which is structured how the rest of xlsxwriter's formulas are.)
df = pd.DataFrame(stuff)
writer = pd.ExcelWriter('File.xlsx', engine = 'xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
worksheet1 = writer.sheets['Sheet1']
worksheet2 = workbook.add_worksheet('Summary Page')
data = f"'Sheet1'!$A$1:$D${len(df)}"
worksheet2.add_table(f'A1:D{len(df)}', {'data':data})
workbook.close()
This approach adds the new sheet, and creates a table the correct size. But then fills in the "data" with 'data' as a string down the first column with one character in each cell.
Is there a way to create a table referencing data from another sheet using xlsxwriter?
ExcelWriter is (obviously) for writing Excel files.
If you want to read data back from Excel after writing and saving it (did I understand you correctly?), use ExcelFile.parse or read_excel to load the data into a dataframe, then write it out again with ExcelWriter.
Unfortunately xlsxwriter does not support appending, so with that engine you have to load and write all sheets again. Alternatively, use openpyxl as the engine, which supports append mode. The engine is given explicitly in this minimal working example:
import pandas as pd
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
df.to_excel(writer, sheet_name='Sheet1')
writer.close()
data = pd.read_excel('test.xlsx', usecols='A:B', sheet_name='Sheet1', index_col=0)
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a')
# shape our data here
data.to_excel(writer, sheet_name='Sheet2')
writer.close()
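As far as I can tell, add_table() expects its 'data' option to be a list of row lists rather than a range string, which would explain the one-character-per-cell result in the question. A minimal sketch under that assumption copies the values from the dataframe into a table on the second sheet instead of referencing Sheet1; table_options and write_summary_table are hypothetical helper names.

```python
import pandas as pd


def table_options(df):
    """Build the add_table() options dict from a dataframe's values and headers."""
    return {
        'data': df.values.tolist(),
        'columns': [{'header': str(name)} for name in df.columns],
    }


def write_summary_table(df, path):
    """Write df to Sheet1 and repeat its values as an Excel table on a second sheet."""
    with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1', index=False)
        summary = writer.book.add_worksheet('Summary Page')
        # Table range: one header row plus len(df) data rows (zero-indexed, inclusive).
        summary.add_table(0, 0, len(df), len(df.columns) - 1, table_options(df))
```

This does not create a live reference to Sheet1. The table holds a copy of the values, so it would not follow later edits made to Sheet1 in Excel.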

pandas ExcelWriter write sheet direction right to left

I'm creating in python Django dynamic Excel file that is downloaded (not stored locally), and I need to make the sheet display from right to left and not from left to right.
Here is the code that I used:
import pandas
from io import BytesIO, StringIO
sio = BytesIO()
PandasDataFrame = pandas.DataFrame([['t1', 't2'], ['t3', 't4']], index = ['t1', 't2'], columns = ['t11', 't1'] )
PandasWriter = pandas.ExcelWriter(sio, engine='xlsxwriter')
PandasDataFrame.to_excel(PandasWriter, sheet_name='Sheet1')
PandasWriter.book.add_format({'reading_order': 2})
PandasWriter.save()
PandasWriter.close()
sio.seek(0)
workbook = sio.read()
response = HttpResponse(workbook,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=wrong_data.xlsx'
return (response)
Here is a simple Pandas example that demonstrates how to change the text, and also the worksheet direction. You can convert it to a Django example yourself.
# _*_ coding: utf-8
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [u'نص عربي / English text'] * 6})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Add the cell formats.
format_right_to_left = workbook.add_format({'reading_order': 2})
# Change the direction for the worksheet.
worksheet.right_to_left()
# Make the column wider for visibility and add the reading order format.
worksheet.set_column('B:B', 30, format_right_to_left)
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Output:
See the XlsxWriter docs on the worksheet right_to_left() method for more details.
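For the Django case in the question, the same calls can target an in-memory buffer instead of a file. The sketch below is one guess at how that conversion might look; build_rtl_workbook is a hypothetical helper name, and the Django response is left as a comment so the example runs without Django.

```python
from io import BytesIO

import pandas as pd


def build_rtl_workbook(df):
    """Return the bytes of an .xlsx file whose sheet is laid out right to left."""
    buffer = BytesIO()
    with pd.ExcelWriter(buffer, engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1')
        worksheet = writer.sheets['Sheet1']
        cell_format = writer.book.add_format({'reading_order': 2})
        # Flip the sheet direction and apply the reading order to column B.
        worksheet.right_to_left()
        worksheet.set_column('B:B', 30, cell_format)
    # The writer is closed at this point, so the buffer holds the complete file.
    return buffer.getvalue()


# In a Django view the bytes could be returned along these lines:
#   response = HttpResponse(
#       build_rtl_workbook(df),
#       content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
#   response['Content-Disposition'] = 'attachment; filename=data.xlsx'
```

Note that add_format() on its own only creates a format. It has an effect once it is applied to cells or columns, which is what set_column() does here.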

Pandas and xlsxwriter: how to create a new sheet without exporting a dataframe?

If I call the xlsxwriter module directly, it is very easy to create a new sheet in a new file, and write to its cells, e.g.:
import xlsxwriter
workbook = xlsxwriter.Workbook('test 1.xlsx')
wks1=workbook.add_worksheet('Test sheet')
wks1.write(0,0,'some random text')
workbook.close()
However, my question is: how can I create a new sheet using a pandas.ExcelWriter object? The object can create a new sheet when exporting a dataframe, but what if I don't have any dataframes to export?
E.g. say I have exported 4 dataframes to 4 separate sheets, and now I just want to write some text to a new sheet. The only solution I have found is to create an empty dataframe, export that (which creates the new sheet), then write to the sheet:
import pandas as pd
writer = pd.ExcelWriter('test 2 .xlsx', engine='xlsxwriter')
df=pd.DataFrame()
df.to_excel(writer, 'new sheet', index=False, startrow=0,startcol=0)
writer.sheets['new sheet'].write(0,0,'some random text')
writer.close()
Is there another way? add_worksheet() seems to be a method of the workbook class only, not of ExcelWriter
I don't see anything wrong with the way you are doing it but you could also use the XlsxWriter workbook object from the ExcelWriter as follows:
import pandas as pd
writer = pd.ExcelWriter('test 2 .xlsx', engine='xlsxwriter')
workbook = writer.book
worksheet = workbook.add_worksheet('new sheet')
worksheet.write(0, 0, 'some random text')
# Close the writer to save the file.
writer.close()

Insert pandas DataFrame into existing excel worksheet with styling

I've seen answers as to how to add a pandas DataFrame into an existing worksheet using openpyxl as shown below:
from openpyxl import load_workbook, Workbook
import pandas as pd
df = pd.DataFrame(data=[["20-01-2018",4,9,16,25,36]],columns=["Date","A","B","C","D","E"])
path = 'filepath.xlsx'
writer = pd.ExcelWriter(path, engine='openpyxl')
writer.book = load_workbook(path)
writer.sheets = dict((ws.title,ws) for ws in writer.book.worksheets)
df.to_excel(writer,sheet_name="Sheet1", startrow=2,index=False, header=False)
writer.save()
However, I need to set a highlight color to the background data. Is there a way to do this without changing the dataframe into a list - trying to maintain the date format too.
Thanks
You can create a function that does the highlighting for the cells you want. The function receives each column and must return one style string per cell:
def highlight_style(column):
    # provide your criteria for highlighting the cells here
    return ['background-color: red'] * len(column)
And then apply your highlighting function to your dataframe:
styled = df.style.apply(highlight_style)
After this, when you write the styled result to Excel with styled.to_excel(...), it should work as you want =)
I sorted it thanks to help from Andre. You can export the results as such:
df.style.set_properties(**{'background-color':'red'}).to_excel(writer,sheet_name="Sheet1", startrow=2,index=False, header=False)
writer.close()
Thanks!
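Where only some cells should be highlighted, the function passed to style.apply can decide per cell. A minimal sketch, assuming a made-up threshold rule; highlight_above is a hypothetical name, not a pandas function.

```python
import numbers

import pandas as pd


def highlight_above(column, threshold=10):
    """Return one CSS string per cell: yellow background for numbers above threshold."""
    return [
        'background-color: yellow'
        if isinstance(value, numbers.Number) and value > threshold else ''
        for value in column
    ]


# Example data (invented for illustration).
df = pd.DataFrame({'A': [4, 16], 'B': [9, 25]})
# styled = df.style.apply(highlight_above, threshold=10)
# styled.to_excel('highlighted.xlsx', index=False)
```

The last two lines are left commented out because the Styler needs jinja2 installed. Non-numeric cells such as date strings get an empty style, so they are written unchanged.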
