pandas to_html using the .style options or custom CSS? - python

I was following the style guide for pandas and it worked pretty well.
How can I keep these styles using the to_html command through Outlook? The documentation seems a bit lacking for me.
(df.style
.format(percent)
.applymap(color_negative_red, subset=['col1', 'col2'])
.set_properties(**{'font-size': '9pt', 'font-family': 'Calibri'})
.bar(subset=['col4', 'col5'], color='lightblue'))
import win32com.client as win32
outlook = win32.Dispatch('outlook.application')
mail = outlook.CreateItem(0)
mail.Subject = subject_name
mail.HTMLBody = ('<html><body style="font-size:11pt; font-family:Calibri">'
                 + '<p>Hello,</p>' + '<p>Title of Data</p>'
                 + df.to_html(index=False, classes=????????)
                 + '</body></html>')
mail.Send()
The to_html documentation shows that there is a classes command that I can put inside of the to_html method, but I can't figure it out. It also seems like my dataframe does not carry the style that I specified up top.
If I try:
df = (df.style
.format(percent)
.applymap(color_negative_red, subset=['col1', 'col2'])
.set_properties(**{'font-size': '9pt', 'font-family': 'Calibri'})
.bar(subset=['col4', 'col5'], color='lightblue'))
Then df is now a Styler object rather than a DataFrame, and you can't use DataFrame.to_html on it.
Edit - this is what I am currently doing to modify my tables. This works, but I can't keep the cool features of the .style method that pandas offers.
email_paragraph = """
<body style= "font-size:11pt; font-family:Calibri; text-align:left; margin: 0px auto" >
"""
email_caption = """
<body style= "font-size:10pt; font-family:Century Gothic; text-align:center; margin: 0px auto" >
"""
email_style = '''<style type="text/css" media="screen" style="width:100%">
table, th, td {border: 0px solid black; background-color: #eee; padding: 10px;}
th {background-color: #C6E2FF; color:black; font-family: Tahoma;font-size : 13; text-align: center;}
td {background-color: #fff; padding: 10px; font-family: Calibri; font-size : 12; text-align: center;}
</style>'''

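Once you add .style to the chain you are operating on a Styler object, not a DataFrame. That object has a render method that returns the HTML as a string (render was deprecated in pandas 1.4 in favour of Styler.to_html and removed in pandas 2.0). So in your example, you could do something like this: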
html = (
df.style
.format(percent)
.applymap(color_negative_red, subset=['col1', 'col2'])
.set_properties(**{'font-size': '9pt', 'font-family': 'Calibri'})
.bar(subset=['col4', 'col5'], color='lightblue')
.render()
)
Then include the html in your email instead of a df.to_html().

It's not an elegant or especially Pythonic solution, but it worked well for me: I put a link to an external CSS file in front of the HTML produced by the to_html() method, then saved the whole string as an HTML file.
dphtml = r'<link rel="stylesheet" type="text/css" media="screen" href="css-table.css" />' + '\n'
dphtml += dp.to_html()
with open('datatable.html','w') as f:
    f.write(dphtml)
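On the classes argument the question asks about: it only adds names to the class attribute of the generated table tag, and carries no styling by itself. A minimal sketch, assuming a class name mystyle and a few made-up CSS rules, that embeds the rules in a style block instead of linking an external file:

```python
import pandas as pd

# Hypothetical data; any DataFrame works.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# classes= appends names to the <table> tag's class attribute.
table_html = df.to_html(index=False, classes="mystyle")

# CSS rules keyed on that class name (the rules themselves are illustrative).
css = """
<style>
.mystyle {font-family: Calibri; font-size: 9pt; border-collapse: collapse;}
.mystyle th, .mystyle td {padding: 5px; text-align: center;}
</style>
"""

page = "<html><head>" + css + "</head><body>" + table_html + "</body></html>"
```

Embedding the rules keeps the page in one string, which is convenient when the HTML is not served next to its CSS file.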

Selecting the table (the rendered, styled, dataframe widgets in jupyter) and copy-pasting to an email body worked for me (using outlook office).
No manual html extraction, saving, loading, or anything like that.

Related

How to convert pandas dataframe into pdf using pandas package [duplicate]

What is an efficient way to generate PDF for data frames in Pandas?
First plot table with matplotlib then generate pdf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
df = pd.DataFrame(np.random.random((10,3)), columns = ("col 1", "col 2", "col 3"))
#https://stackoverflow.com/questions/32137396/how-do-i-plot-only-a-table-in-matplotlib
fig, ax =plt.subplots(figsize=(12,4))
ax.axis('tight')
ax.axis('off')
the_table = ax.table(cellText=df.values,colLabels=df.columns,loc='center')
#https://stackoverflow.com/questions/4042192/reduce-left-and-right-margins-in-matplotlib-plot
pp = PdfPages("foo.pdf")
pp.savefig(fig, bbox_inches='tight')
pp.close()
reference:
How do I plot only a table in Matplotlib?
Reduce left and right margins in matplotlib plot
Here is how I do it from sqlite database using sqlite3, pandas and pdfkit
import pandas as pd
import pdfkit as pdf
import sqlite3
con=sqlite3.connect("baza.db")
df=pd.read_sql_query("select * from dobit", con)
df.to_html('/home/linux/izvestaj.html')
nazivFajla='/home/linux/pdfPrintOut.pdf'
pdf.from_file('/home/linux/izvestaj.html', nazivFajla)
Well one way is to use markdown. You can use df.to_html(). This converts the dataframe into a html table. From there you can put the generated html into a markdown file (.md) (see http://daringfireball.net/projects/markdown/basics). From there, there are utilities to convert markdown into a pdf (https://www.npmjs.com/package/markdown-pdf).
One all-in-one tool for this method is to use Atom text editor (https://atom.io/). There you can use an extension, search "markdown to pdf", which will make the conversion for you.
Note: When using to_html() recently I had to remove extra '\n' characters for some reason. I chose to use Atom -> Find -> '\n' -> Replace "".
Overall this should do the trick!
With reference to these two examples that I found useful:
Apply CSS class to Pandas DataFrame using to_html
https://pbpython.com/pdf-reports.html
The simple CSS code saved in same folder as ipynb:
/* includes alternating gray and white with on-hover color */
.mystyle {
font-size: 11pt;
font-family: Arial;
border-collapse: collapse;
border: 1px solid silver;
}
.mystyle td, th {
padding: 5px;
}
.mystyle tr:nth-child(even) {
background: #E0E0E0;
}
.mystyle tr:hover {
background: silver;
cursor: pointer;
}
The python code:
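import os
import numpy as np
import pandas as pd
# Assuming HTML here is weasyprint.HTML, as in the pbpython article referenced above
from weasyprint import HTML

# folder and file_pdf must already be defined (output directory and PDF file name)
pdf_filepath = os.path.join(folder,file_pdf)
demo_df = pd.DataFrame(np.random.random((10,3)), columns = ("col 1", "col 2", "col 3"))
table=demo_df.to_html(classes='mystyle')
html_string = f'''
<html>
<head><title>HTML Pandas Dataframe with CSS</title>
<link rel="stylesheet" type="text/css" href="df_style.css"/>
</head>
<body>
{table}
</body>
</html>
'''
HTML(string=html_string).write_pdf(pdf_filepath, stylesheets=["df_style.css"])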
This is a solution with an intermediate HTML file.
The table is pretty printed with some minimal css.
The pdf conversion is done with weasyprint. You need to pip install weasyprint.
# Create a pandas dataframe with demo data:
import pandas as pd
demodata_csv = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'
df = pd.read_csv(demodata_csv)
# Pretty print the dataframe as an html table to a file
intermediate_html = '/tmp/intermediate.html'
to_html_pretty(df,intermediate_html,'Iris Data')
# if you do not want pretty printing, just use pandas:
# df.to_html(intermediate_html)
# Convert the html file to a pdf file using weasyprint
import weasyprint
out_pdf= '/tmp/demo.pdf'
weasyprint.HTML(intermediate_html).write_pdf(out_pdf)
# This is the table pretty printer used above. In a script, define it and the
# two HTML_TEMPLATE strings below before calling to_html_pretty.
def to_html_pretty(df, filename='/tmp/out.html', title=''):
    '''
    Write an entire dataframe to an HTML file
    with nice formatting.
    Thanks to @stackoverflowuser2010 for the
    pretty printer see https://stackoverflow.com/a/47723330/362951
    '''
    ht = ''
    if title != '':
        ht += '<h2> %s </h2>\n' % title
    ht += df.to_html(classes='wide', escape=False)
    with open(filename, 'w') as f:
        f.write(HTML_TEMPLATE1 + ht + HTML_TEMPLATE2)
HTML_TEMPLATE1 = '''
<html>
<head>
<style>
h2 {
text-align: center;
font-family: Helvetica, Arial, sans-serif;
}
table {
margin-left: auto;
margin-right: auto;
}
table, th, td {
border: 1px solid black;
border-collapse: collapse;
}
th, td {
padding: 5px;
text-align: center;
font-family: Helvetica, Arial, sans-serif;
font-size: 90%;
}
table tbody tr:hover {
background-color: #dddddd;
}
.wide {
width: 90%;
}
</style>
</head>
<body>
'''
HTML_TEMPLATE2 = '''
</body>
</html>
'''
Thanks to @stackoverflowuser2010 for the pretty printer, see stackoverflowuser2010's answer https://stackoverflow.com/a/47723330/362951
I did not use pdfkit, because I had some problems with it on a headless machine. But weasyprint is great.
When using Matplotlib, here's how to get a prettier table with alternating row colors, and optionally paginate the PDF:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
def _draw_as_table(df, pagesize):
    # One list of cell colours per row, alternating white and light gray
    alternating_colors = [['white'] * len(df.columns), ['lightgray'] * len(df.columns)] * len(df)
    alternating_colors = alternating_colors[:len(df)]
    fig, ax = plt.subplots(figsize=pagesize)
    ax.axis('tight')
    ax.axis('off')
    the_table = ax.table(cellText=df.values,
                         rowLabels=df.index,
                         colLabels=df.columns,
                         rowColours=['lightblue']*len(df),
                         colColours=['lightblue']*len(df.columns),
                         cellColours=alternating_colors,
                         loc='center')
    return fig

def dataframe_to_pdf(df, filename, numpages=(1, 1), pagesize=(11, 8.5)):
    with PdfPages(filename) as pdf:
        nh, nv = numpages
        # Round up so that trailing rows/columns are not dropped when the
        # counts do not divide evenly by the number of pages
        rows_per_page = -(-len(df) // nh)
        cols_per_page = -(-len(df.columns) // nv)
        for i in range(0, nh):
            for j in range(0, nv):
                page = df.iloc[(i*rows_per_page):min((i+1)*rows_per_page, len(df)),
                               (j*cols_per_page):min((j+1)*cols_per_page, len(df.columns))]
                if page.empty:
                    continue  # nothing left to draw on this page
                fig = _draw_as_table(page, pagesize)
                if nh > 1 or nv > 1:
                    # Add a part/page number at bottom-center of page
                    fig.text(0.5, 0.5/pagesize[0],
                             "Part-{}x{}: Page-{}".format(i+1, j+1, i*nv + j + 1),
                             ha='center', fontsize=8)
                pdf.savefig(fig, bbox_inches='tight')
                plt.close(fig)
Use it as follows:
dataframe_to_pdf(df, 'test_1.pdf')
dataframe_to_pdf(df, 'test_6.pdf', numpages=(3, 2))
Explanation of the code is here:
https://levelup.gitconnected.com/how-to-write-a-pandas-dataframe-as-a-pdf-5cdf7d525488
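The page-splitting arithmetic can be checked without Matplotlib. The sketch below uses a hypothetical helper, page_bounds, that computes the (start, stop) slice for each page with ceiling division. Plain floor division would leave trailing rows unprinted whenever the row count is not a multiple of the page count.

```python
def page_bounds(total, pages):
    """Split range(total) into at most `pages` consecutive (start, stop) chunks."""
    if total <= 0 or pages <= 0:
        return []
    per_page = -(-total // pages)  # ceiling division
    return [
        (start, min(start + per_page, total))
        for start in range(0, total, per_page)
    ]


# 10 rows over 3 pages and 3 columns over 2 pages.
row_chunks = page_bounds(10, 3)
col_chunks = page_bounds(3, 2)
```

Each tuple can be passed straight to df.iloc as a slice, for example df.iloc[start:stop].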

Python add text to a HTML table file generated with to_html() method

I have a question that is probably an easy one, especially for HTML experts.
I have a pandas DataFrame 'df' and I convert it to an HTML document using the to_html method:
html = df.to_html()
text_file = open('example.html', "w")
text_file.write(html)
text_file.close()
The problem I face is that I would need to add a paragraph (a simple sentence) before the table.
I tried to add the following code to my script:
title = """<head>
<title>Test title</title>
</head>
"""
html = html.replace('<table border="1" class="dataframe">', title + '<table border="1" class="dataframe">')
but it doesn't seem to do anything, plus in reality what I would need to add is not a title but a string containing the paragraph information.
Does anybody have a simple suggestion that doesn't involve using beautiful soup or other libraries?
Thank you.
This code does pretty much what I needed:
html = df.to_html()
msg = "custom messages"
title = """
<html>
<head>
<style>
thead {color: green;}
tbody {color: black;}
tfoot {color: red;}
table, th, td {
border: 1px solid black;
}
</style>
</head>
<body>
<h4>
""" + msg + "</h4>"
end_html = """
</body>
</html>
"""
html = title + html + end_html
text_file = open(file_name, "w")
text_file.write(html)
text_file.close()
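The attempt in the question probably appeared to do nothing because a title element inside head only sets the browser tab title; visible text has to go in the body. A minimal sketch of placing a paragraph above the table, assuming a hypothetical helper build_page and a placeholder page title, with html.escape applied so that characters such as < or & in the message cannot break the markup:

```python
import html


def build_page(table_html, message):
    """Return a full HTML document with `message` as a paragraph above the table."""
    paragraph = "<p>" + html.escape(message) + "</p>"
    return (
        "<!DOCTYPE html>\n<html>\n<head><title>Report</title></head>\n<body>\n"
        + paragraph + "\n" + table_html + "\n</body>\n</html>\n"
    )


# Stand-in for df.to_html(); any HTML table string works here.
table_html = '<table border="1" class="dataframe"></table>'
page = build_page(table_html, "Sales < 10 & returns > 2")
```

The returned string can be written to a file exactly as in the code above.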
You should consider using dominate. You can build html elements and combine raw html. As a proof of concept:
from dominate.tags import *
from dominate.util import raw
head_title = 'test' # Replace this with whatever content you like
raw_html_content = '<table border="1" class="dataframe"></table>' # Replace this with df.to_html()
print(html(head(title(head_title)), body(raw(raw_html_content))))
This will output:
<html>
<head>
<title>test</title>
</head>
<body><table border="1" class="dataframe"></table> </body>
</html>
Alternatively, you can build the HTML with BeautifulSoup. It is a lot more powerful, but you have to write a lot more code.
from bs4 import BeautifulSoup
raw_html_content = '<table border="1" class="dataframe"></table> '
some_content = 'TODO <a href="#">click here</a>'
soup = BeautifulSoup(raw_html_content, features='html.parser') # This would contain the table
paragraph = soup.new_tag('p') # To add content wrapped in p tag under table
paragraph.append(BeautifulSoup(some_content, features='html.parser'))
soup.append(paragraph)
print(soup.prettify())
This will output:
<table border="1" class="dataframe">
</table>
<p>
TODO
<a href="#">
click here
</a>
</p>
You can use Python's built-in f-strings to add replacement fields with variables. Simply add the character f at the start of the string and put each variable name inside curly braces. This makes the HTML easier to read and edit. The downside is that to display literal curly braces within the content, you have to double them (see thead below).
For example:
main_content = '<table border="1" class="dataframe"></table>' # Replace this with df.to_html()
msg = "custom messages"
html = f"""
<html>
<head>
<style>
thead {{color: green;}}
tbody {{color: black;}}
tfoot {{color: red;}}
table, th, td {{
border: 1px solid black;
}}
</style>
</head>
<body>
<h4>{msg}</h4>
{main_content}
</body>
</html>
"""
print(html)
This will output:
<html>
<head>
<style>
thead {color: green;}
tbody {color: black;}
tfoot {color: red;}
table, th, td {
border: 1px solid black;
}
</style>
</head>
<body>
<h4>custom messages</h4>
<table border="1" class="dataframe"></table>
</body>
</html>
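If doubling every CSS brace becomes tedious, string.Template from the standard library is one alternative: it uses $name placeholders, so braces need no escaping (a literal dollar sign must be written as $$ instead). A minimal sketch with a shortened stylesheet:

```python
from string import Template

# $msg and $main_content are placeholders; CSS braces are left untouched.
page_template = Template("""<html>
<head>
<style>
thead {color: green;}
table, th, td {border: 1px solid black;}
</style>
</head>
<body>
<h4>$msg</h4>
$main_content
</body>
</html>
""")

page = page_template.substitute(
    msg="custom messages",
    main_content='<table border="1" class="dataframe"></table>',  # or df.to_html()
)
```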

Why do I get a different result every time I save and extract a response string from a web service?

My code works and extracts what I need. My problem is that I get one result when I parse the web service response directly, and a different result when I parse the same response after pasting it into a variable. I have been stuck on this for days and would appreciate your help.
NOTE: the suggested duplicate questions answers don't work for me, this isn't a duplicate question.
I'm consuming a web service. The response is stored in the variable answerService. It is a very long string, and from it I extract the contents of each span tag with this structure:
<span style="font-weight:bold"> xxx </span>
"xxx" is what I want to extract.
#with that I get the "xxx"
arraySpan = re.findall(r'<span style="font-weight:bold">(.*?)<', answerService)
I get an array of "n" length according to the span existing with this structure.
If I do this directly from the web service it does not work and I only get this answer:
['áGILMENTE']
Now, if I put the response of the web service sameStringOfAnswer in my code, the result is different:
print(arraySpan)
['ADV', 'áGILMENTE']
By logic the answer is the same and never changes, for some strange reason in real time when I get the response from the web service, I only get ['áGILMENTE'] when the answer I expect is ['ADV', 'áGILMENTE']
This is the key piece that shows that 2 span is always coming with the structure I need:
Here is my code:
import requests
import re
session = requests.Session()
getId=session.get('http://cartago.lllf.uam.es/grampal/grampal.cgi')
cookie=session.cookies.get_dict()
getId=session.cookies.get_dict()
getId=getId["CGISESSID"]
#getting an ID for request a webservice
getService=requests.get("http://cartago.lllf.uam.es/grampal/grampal.cgi?m=analiza&csrf="+getId+"&e="+"ágilmente", cookies=cookie)
answerService=getService.text
#get the value of the <span>
arraySpan = re.findall(r'<span style="font-weight:bold">(.*?)<', answerService)
print(answerService)
print("array",arraySpan)
#same code but using the result of service web
sameStringOfAnswer='<html xmlns="http://www.w3.org/TR/REC-html40"><head><title>Grampal </title><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><meta name="Content-Language" content="EN"><meta name="author" content="jmguirao#ugr.es"><link rel="icon" type="image/ico" href="/favicon.ico"/><style type="text/css">html,body,form,ul,li,h1,h3,p{margin:0; padding:0}body{font-family: Arial, Helvetica, sans-serif; background-color:#fff}a{text-decoration: none;}a:hover{text-decoration: underline}ul{list-style-type: none}td{padding: 0.5pc 2pc 0pc 0pc}.nav{float: right; padding: 0.5pc 0.5pc 0.5pc 0.5pc; margin-left:5px}.nav li{display:inline; border-left: 1px solid #444; padding:0 0.4em;}.nav li.first{border-left:0}.hide{display:none}input{text-indent: 2px}input[type="submit"]{text-indent: 0}DIV.delPage{padding: 0.5ex 5em 0.5em 5em; background-color:#ffd6ba;}.delMain{padding: 2ex 0.5em 0.5pc 0.5em;}.post{margin-bottom: 0.25pc; font-size: 100%; padding-top: 0.5ex;}.posts, #posts{padding: 0.5ex 0.5em 0.5pc 50px;}.banner{padding: 0.5ex 0 0.5pc 0.5em;background-color: #ffc6aa;clear: both}.banner h1{font-weight: bolder; font-size: 150%;margin:0; padding:0 0 0 26px; display: inline;}h2{font-weight: bolder; font-size: 140%; color: red; margin:0; padding:0 0 0 26px; display: inline;}.resaltado{font-weight: bolder;font-size: 100%}</style></head><body><div class="banner"><ul class="hide"><li>skip to content</li></ul><ul class="nav">Análsis de:<li class="first"><a title="Analizador morfosintáctico" href="/grampal/grampal.cgi?m=analiza&e=ágilmente">palabras</a></li><li><a title="Desambiguador contextual" href="/grampal/grampal.cgi?m=etiqueta&e=ágilmente">oraciones</a></li><li><a title="Etiquetado de textos" href="/grampal/grampal.cgi?m=xml">textos</a></li><li><a title="Formas de una palabra" href="/grampal/grampal.cgi?m=genera&e=ágilmente">Generación de formas</a></li><!--<li><a title="Transcripción fonética" 
href="/grampal/grampal.cgi?m=transcribe&e=ágilmente">Transcripción</a></li>--><li>Etiquetario</li><li>Autores</li></ul><h1>Grampal</h1></div><div class="delPage" style="font-size: 80%;"><form method="GET" action="/grampal/grampal.cgi"><input type="hidden" name="m" value="analiza"><input type="hidden" name="csrf" value="94508700a0ae409a90718299ae00b0e0"><span class="resaltado">Palabra : </span><input name="e" size="60" value="ágilmente"><input type="submit" value="Analiza"> </form></div><br><h2>ágilmente</h2><div class="delMain"><div id="posts"><table><tr><td style="font-style:italic;font-size:90%">categoría <span style="font-weight:bold"> ADV </span></td><td style="font-style:italic;font-size:90%">lema <span style="font-weight:bold"> áGILMENTE </span></td></tr></table></div></div></body></html>'
arraySpan = re.findall(r'<span style="font-weight:bold">(.*?)<', sameStringOfAnswer)
print(arraySpan)
What am I doing wrong?
The HTML from the webservice contains:
<span style="font-weight:bold"> ADV\n </span>
But your minified code contains the tag without the newline \n:
<span style="font-weight:bold"> ADV </span>
You can test the difference yourself:
>>> pattern = r'<span style="font-weight:bold">(.*?)<'
>>> re.findall(pattern, '<span style="font-weight:bold">AAA\n<')
[]
>>> re.findall(pattern, '<span style="font-weight:bold">AAA<')
['AAA']
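That is why they are different: by default, . in a regex does not match a newline, so the span containing \n is skipped. You should have mentioned that you use a minifier, as minifiers alter the HTML and you cannot run the same regex afterwards and still expect the same output.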
This whole problem would have been avoided if you used an XML parser instead of regex, just like the linked question suggests: RegEx match open tags except XHTML self-contained tags
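Both fixes can be sketched with the standard library alone. The sample string below is a shortened, assumed stand-in for the live response, with the newline inside the first span. The first approach adds re.DOTALL so that . also matches newlines; the second uses html.parser with a hypothetical BoldSpanCollector class and does not depend on exact whitespace.

```python
import re
from html.parser import HTMLParser

# Shortened stand-in for the live response (note the newline after ADV).
SAMPLE = (
    '<td>categoría <span style="font-weight:bold"> ADV\n </span></td>'
    '<td>lema <span style="font-weight:bold"> áGILMENTE </span></td>'
)

# Option 1: keep the regex, but let "." match newlines and strip whitespace.
pattern = re.compile(r'<span style="font-weight:bold">(.*?)<', re.DOTALL)
with_regex = [match.strip() for match in pattern.findall(SAMPLE)]


# Option 2: collect the text of every bold span with an HTML parser.
class BoldSpanCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.values = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style") or ""
        if tag == "span" and style.replace(" ", "") == "font-weight:bold":
            self._inside = True
            self.values.append("")

    def handle_data(self, data):
        if self._inside:
            self.values[-1] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False


collector = BoldSpanCollector()
collector.feed(SAMPLE)
collector.close()
with_parser = [value.strip() for value in collector.values]
```

Both lists contain the two expected values, whereas the original pattern without re.DOTALL finds only the second span in this sample.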

Python: SQL Generated Variables written to HTML file

I have a script in Python which connects to SQL using pyodbc and returns a set of values from a calendar for the 30 days following today. I prototyped it by using the print('') function to generate the HTML for the file I was creating then copying and pasting it in to an HTML file with Notepad++ and I know the HTML is sound and will be good for its purpose. However when it comes to generating the file I'm running aground with including the SQL results in the variable that is passed to the file writer.
I have tried both {variable} and %v methods which just seem to be either erroring out with;
unsupported format character ';' (0x3b) at index 1744
in the case of %, or in the case of {inset} is just including the word rather than the var. below is the code I have in JN;
from os import getenv
import pyodbc
cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER=MYSERVER\SQLEXPRESS;DATABASE=MyTable;UID=test;PWD=t')
f = open('tes.html','w')
cursor = cnxn.cursor()
cursor.execute('DECLARE @today as date SET @today = GetDate() SELECT style112, day, month, year, dayofweek, showroom_name, isbusy from ShowroomCal where Date Between @today and dateadd(month,1,@today) ')
row = cursor.fetchone()
while row is not None:
    inset = ('<div class="',row.isbusy,'">',row.day,'</div>')
    row = cursor.fetchone()
html_str = """
<html lang="en" ><head><meta charset="UTF-8"><title>Calendar</title>
<link rel=\'stylesheet prefetch\' href=\'https://netdna.bootstrapcdn.com/font-awesome/3.2.1/css/font-awesome.css\'>
<style>
body{background-color: #ffffff;}
a{color:#462955; text-decoration: none; display: block;}a:hover{color:#ffffff; text-decoration: none; display: block;}#yes a {color:#ffffff !important; text-decoration: none; display: block;}#yes a:hover {color:#ffffff !important; text-decoration: none; display: block;}
#calendar{margin-left: auto;margin-right: auto;width: 800px;font-family: \'Lato\', sans-serif;}
#calendar_weekdays div{display:inline-block;vertical-align:top;}
#calendar_content, #calendar_weekdays, #calendar_header{position: relative;width: 800px;overflow: hidden;float: left;z-index: 10;}
#calendar_weekdays div, #calendar_content div{width: 25px;height: 25px;overflow: hidden;text-align: center;background-color: #FFFFFF;color: #787878;}
.Yes{background-color: #990000 !important;color: #CDCDCD !important;}
.None{background-color: #ffffff !Important;color: #462955 !important;}
.None:hover{background-color: #462955 !Important;color: #ffffff !important;}
.wend{background-color: #676767 !important;color: #999999 !important;}
#calendar_content{background-colour: #ff0000;-webkit-border-radius: 0px 0px 12px 12px;-moz-border-radius: 0px 0px 12px 12px; border-radius: 0px 0px 12px 12px;}
#calendar_content div{float: left;}
#yes {background-color: #ff0000 !important;}
#calendar_content div:hover{background-color: #F8F8F8;}
#calendar_content div.blank{background-color: #E8E8E8;}
#calendar_header, #calendar_content div.today{zoom: 1;filter: alpha(opacity=70);opacity: 0.7;}
#calendar_content div.today{color: #FFFFFF;}
#calendar_header{width: 100%;height: 25px;text-align: center;background-color: #FF6860;padding: 8px 0;-webkit-border-radius: 12px 12px 0px 0px;-moz-border-radius: 12px 12px 0px 0px; border-radius: 12px 12px 0px 0px;}
#calendar_header h1{font-size: 1.5em;color: #FFFFFF;float:left;width:70%;
i[class^=icon-chevron]{color: #FFFFFF;float: left;width:15%;border-radius: 50%;}
</style>
<link href=\'https://fonts.googleapis.com/css?family=Lato\' rel=\'stylesheet\' type=\'text/css\'>
</head><base target="_parent">
<div id="calendar"><div id="calendar_header"><h1>07 2018</h1></div><div id="calendar_weekdays"></div><div id="calendar_content">
{inset}
</div></div><script src=\'jquery.min.js\'></script>
<script>
$(function(){function c(){p();var e=h();var r=0;var u=false;l.empty();while(!u){if(s[r]==e[0].weekday){u=true}else{l.append(\'<div class="blank"></div>\');r++}}for(var c=0;c<42-r;c++){if(c>=e.length){l.append(\'<div class="blank"></div>\')}else{var v=e[c].day;var m=g(new Date(t,n-1,v))?\'<div class="today">\':"<div>";l.append(m+""+v+"</div>")}}var y=o[n-1];a.css("background-color",y).find("h1").text(i[n-1]+" "+t);f.find("div").css("color",y);l.find(".today").css("background-color",y);d()}function h(){var e=[];for(var r=1;r<v(t,n)+1;r++){e.push({day:r,weekday:s[m(t,n,r)]})}return e}function p(){f.empty();for(var e=0;e<7;e++){f.append("<div>"+s[e].substring(0,3)+"</div>")}}function d(){var t;var n=$("#calendar").css("width",e+"px");n.find(t="#calendar_weekdays, #calendar_content").css("width",e+"px").find("div").css({width:e/7+"px",height:e/14+"px","line-height":e/14+"px"});n.find("#calendar_header").css({height:e*(1/14)+"px"}).find(\'i[class^="icon-chevron"]\').css("line-height",e*(1/14)+"px")}function v(e,t){return(new Date(e,t,0)).getDate()}function m(e,t,n){return(new Date(e,t-1,n)).getDay()}function g(e){return y(new Date)==y(e)}function y(e){return e.getFullYear()+"/"+(e.getMonth()+1)+"/"+e.getDate()}function b(){var e=new Date;t=e.getFullYear();n=e.getMonth()+1}var e=700;var t=2018;var n=9;var r=[];var i=["JANUARY","FEBRUARY","MARCH","APRIL","MAY","JUNE","JULY","AUGUST","SEPTEMBER","OCTOBER","NOVEMBER","DECEMBER"];var s=["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"];var o=["#462955","#462955","#462955","#462955","#462955","#462955","#462955","#462955","#462955","#462955","#462955","#462955"];var u=$("#calendar");var a=u.find("#calendar_header");var f=u.find("#calendarweekdays");var l=u.find("#calendarcontent");b();c();a.find(\'i[class^="icon-chevron"]\').on("click",function(){var e=$(this);var r=function(e){n=e=="next"?n+1:n-1;if(n<1){n=12;t--}else 
if(n>12){n=1;t++}c()};if(e.attr("class").indexOf("left")!=-1){r("previous")}else{r("next")}})})
function updateValue(val, event) {document.getElementById("field17").value = val;event.preventDefault();}
</script>
</body></html><wehavechangedit>
"""
cnxn.close()
f.write(html_str)
f.close()
Can anyone point me in the direction of a better way to include the variables? Do I need to have the inset as an array for this model?
It's Py3.6, on Windows 10.
Have you tried saving your html_str in a template .html file, building your inset lines into one long string, reading the template into a string, doing the replace, and then writing the result out?
with open('C:\\template.html') as file:
    wholefile = file.read()
Use this inside your while loop to build up a string of your results (start with inset = '' before the loop):
    inset = inset + '<div class="'+ str(row.isbusy) + '">' + str(row.day) + '</div>' + '\n'
Then do the replace, so you have the complete file in a string, and write it back out. Note that replace returns a new string rather than changing wholefile in place:
wholefile = wholefile.replace('{inset}', inset)
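The errors in the question most likely come from the template itself: with the % operator, the "100%;" in the CSS is read as a format specifier (hence "unsupported format character ';'"), and {inset} in a plain string is never substituted. A minimal sketch of the replace approach, using hypothetical rows in place of the pyodbc cursor and a cut-down template:

```python
# Hypothetical rows standing in for cursor.fetchall(): (isbusy, day) pairs.
rows = [("Yes", 1), (None, 2), ("Yes", 3)]

# Build one <div> per row; str() is needed because not every value is a string.
inset = "\n".join(
    '<div class="' + str(isbusy) + '">' + str(day) + "</div>"
    for isbusy, day in rows
)

# A cut-down template with the same hazards as the real one:
# CSS braces and a "100%;" that would break %-formatting.
template = """<style>
#calendar_header{width: 100%;height: 25px;}
</style>
<div id="calendar_content">
{inset}
</div>
"""

# str.replace treats everything except the placeholder literally,
# so the braces and percent signs in the CSS are left alone.
html_str = template.replace("{inset}", inset)
```

Collecting every row before substituting also avoids keeping only the last row, which is what happens when inset is reassigned on each pass of the loop.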
