Print specific sheets in excel doc to PDF with Python (xlwings) - python

I am attempting to automate the very manual process of individually selecting a range of worksheets in an Excel file and exporting them to PDF. I was able to string together the following code, which successfully prints the document. However, I cannot figure out how to select specific worksheets within my workbook, so it currently prints the entire workbook to PDF (which comes out to a whopping 897 pages).
Any ideas on how to select certain worksheets and then print them to PDF with a given file name?
import os
import xlwings as xw
book = xw.Book(r'linktomyfile.xlsm')
sheet = book.sheets[0]
current_work_dir = os.getcwd()
pdf_path = os.path.join(current_work_dir, "Report_Date.pdf")
print(f"Saving workbook as '{pdf_path}' ...")
book.api.ExportAsFixedFormat(0, pdf_path)
print(f"Opening PDF file with default application")
Much appreciated!

You can just use the sheet reference to print to pdf, for example:
book = xw.Book(r'linktomyfile.xlsm')
sheet = book.sheets("Sheet1")
current_work_dir = os.getcwd()
pdf_path = os.path.join(current_work_dir, "Report_Date.pdf")
sheet.api.ExportAsFixedFormat(0, pdf_path)
You can also specify a range, e.g.
sheet.range("A1:G15").api.ExportAsFixedFormat(0, pdf_path)
Example of looping through specific sheets:
sheetlist = ["Sheet A", "Sheet B"]
for each in sheetlist:
    pdf_path = os.path.join(current_work_dir, f"{each}.pdf")
    sht = book.sheets(each)
    sht.api.ExportAsFixedFormat(0, pdf_path)
Here each pdf is named after the sheet name.

In newer versions of xlwings, there's a built-in .to_pdf() method. Assuming you've got a book or sheet ready to print:
# to print a whole workbook
myXlwingsWorkBook.to_pdf(r"c:\myOutputPath")
# print a sheet
myXlwingsSheet.to_pdf(r"c:\myOutputPath")
Documentation: Xlwings documentation - then search for "PDF"
There're a few options. I wish I could just print/pdf the first page though...
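For the original question (only certain sheets, one output file), the `include` argument of `Book.to_pdf()` looks like the closest fit. The following is a minimal sketch, assuming a recent xlwings version in which `to_pdf` accepts `path` and `include`; the helper name `export_sheets_to_one_pdf` is hypothetical and not part of xlwings.

```python
def export_sheets_to_one_pdf(book, pdf_path, sheets):
    """Export only the given sheets of an xlwings Book into a single PDF.

    `sheets` is a list of sheet names or 1-based sheet indices.
    The helper name is made up for this example; only Book.to_pdf
    is an xlwings call.
    """
    # Assumption: the installed xlwings version supports the `include` argument.
    book.to_pdf(path=pdf_path, include=list(sheets))
    return pdf_path


# Usage sketch (needs Excel and xlwings installed, so it is left commented out):
# import xlwings as xw
# book = xw.Book(r'linktomyfile.xlsm')
# export_sheets_to_one_pdf(book, "Report_Date.pdf", ["Sheet A", "Sheet B"])
```

There is also an `exclude` argument that works the other way round, which may be handier when most sheets should be printed.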

Related

How to make an URL 'clickable' in Excel with Python? [duplicate]

I am using win32com to modify an Excel spreadsheet (both reading and editing at the same time). I know there are other modules out there that can do one or the other, but for my application I need the file read and processed at the same time.
The final step is to create some hyperlinks from a path name. Here is an example of what I have so far:
import win32com.client
excel = r'I:\Custom_Scripts\Personal\Hyperlinks\HyperlinkTest.xlsx'
xlApp = win32com.client.Dispatch("Excel.Application")
workbook = xlApp.Workbooks.Open(excel)
worksheet = workbook.Worksheets("Sheet1")
for xlRow in range(1, 10, 1):
    a = worksheet.Range("A%s"%(xlRow)).Value
    if a is None:
        break
    print(a)
workbook.Close()
I found some code for reading Hyperlinks using win32com:
sheet.Range("A8").Hyperlinks.Item(1).Address
but not how to set hyperlinks
Can someone assist me?
Borrowing heavily from this question, as I couldn't find anything on SO to link to as a duplicate...
This code will create a Hyperlink in cells A1:A9
import win32com.client
excel = r'I:\Custom_Scripts\Personal\Hyperlinks\HyperlinkTest.xlsx'
xlApp = win32com.client.Dispatch("Excel.Application")
workbook = xlApp.Workbooks.Open(excel)
worksheet = workbook.Worksheets("Sheet1")
for xlRow in range(1, 10, 1):
    worksheet.Hyperlinks.Add(Anchor = worksheet.Range('A{}'.format(xlRow)),
                             Address="http://www.microsoft.com",
                             ScreenTip="Microsoft Web Site",
                             TextToDisplay="Microsoft")
workbook.Save()
workbook.Close()
And here is a link to the Microsoft Documentation for the Hyperlinks.Add() method.
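The answer above points every cell at the same address. To link each cell to the path it already contains, as the question describes, something like the following sketch might work. It assumes `worksheet` is a COM Worksheet obtained through win32com as in the code above; the function name and its parameters are hypothetical.

```python
def add_links_from_cell_values(worksheet, first_row=1, last_row=9, column="A"):
    """Turn each non-empty cell in `column` into a hyperlink to its own value.

    Stops at the first empty cell, like the loop in the question.
    Returns the number of hyperlinks added.
    """
    added = 0
    for row in range(first_row, last_row + 1):
        cell = worksheet.Range("{}{}".format(column, row))
        target = cell.Value
        if target is None:
            break
        # Hyperlinks.Add is the same COM method used in the answer above.
        worksheet.Hyperlinks.Add(Anchor=cell,
                                 Address=str(target),
                                 TextToDisplay=str(target))
        added += 1
    return added
```

Remember to call `workbook.Save()` afterwards, otherwise the links are lost when the workbook is closed.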

How do I download an xlsm file and read every sheet in python?

Right now I am doing the following.
import requests
import xlrd
resp = requests.get(url, auth=auth).content
output = open(r'temp.xlsx', 'wb')
output.write(resp)
output.close()
xl = xlrd.open_workbook(r'temp.xlsx')
sh = 1
xls = []  # collects the sheet names
try:
    for sheet in xl.sheets():
        xls.append(sheet.name)
except:
    xls = ['']
It's extracting the sheets but I don't know how to read the file or if saving the file as an .xlsx is actually working for macros. All I know is that the code is not working right now and I need to be able to catch the data that is being generated in a macro. Please help! Thanks.
I highly recommend using xlwings if you want to open, modify, and save .xlsm files without corrupting them. I have tried a ton of different methods (using other modules like openpyxl) and the macros always end up being corrupted.
import xlwings as xw
app = xw.App(visible=False) # IF YOU WANT EXCEL TO RUN IN BACKGROUND
xlwb = xw.Book('PATH\\TO\\FILE.xlsm')
xlws = {}
xlws['ws1'] = xlwb.sheets['Your Worksheet']
print(xlws['ws1'].range('B1').value) # get value
xlws['ws1'].range('B1').value = 'New Value' # change value
yourMacro = xlwb.macro('YourExcelMacro')
yourMacro()
xlwb.save()
xlwb.close()
Edit - I added an option to keep Excel invisible at a user's request.
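To cover the "download, then read every sheet" part of the question, here is a rough sketch. It assumes the downloaded bytes are written under the file's real `.xlsm` extension and that `book` is an xlwings Book; both helper names are hypothetical.

```python
def save_download(content, path):
    """Write downloaded bytes (e.g. requests.get(...).content) to `path`."""
    with open(path, "wb") as fh:
        fh.write(content)
    return path


def read_all_sheets(book):
    """Return {sheet name: values of the used range} for every sheet in the book.

    Values come back the way xlwings delivers them: a scalar for a single
    cell, otherwise a list or a list of lists.
    """
    data = {}
    for sheet in book.sheets:
        data[sheet.name] = sheet.used_range.value
    return data


# Usage sketch (needs Excel and xlwings, so it is left commented out):
# import xlwings as xw
# path = save_download(resp, "temp.xlsm")   # resp as in the question
# book = xw.Book(path)
# book.macro('YourExcelMacro')()            # run the macro first if it fills the data
# print(read_all_sheets(book))
```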

Python win32 client saving %20 instead of spaces

I have an issue saving the pdf files. The code works to convert excel files to pdf, but it is saving all of my files with %20 instead of spaces. So "Fort Worth" would save as "Fort%20Worth".
Here is the code below. Thanks.
import xlwings as xw
import win32com.client
curyq = "2017Q4"
msa_list_ea = ['Albuquerque','Atlanta','Austin','Baltimore','Boston','Charlotte','Chicago','Cincinnati','Cleveland','Columbus',
'Dallas','Dallas/Ft. Worth','Denver','Detroit','Fort Lauderdale','Fort Worth','Hartford','Houston','Indianapolis',
'Jacksonville','Kansas City','Las Vegas','Long Island','Los Angeles','Louisville','Memphis','Miami','Milwaukee',
'Minneapolis','Nashville','New York','Norfolk','Newark','Oakland','Orange County','Orlando','West Palm Beach',
'Philadelphia','Phoenix','Pittsburgh','Portland','Raleigh','Richmond','Sacramento','Salt Lake City','San Antonio',
'Riverside','San Diego','San Francisco','San Jose','Seattle','St. Louis','Tampa','Tucson','Ventura','Washington, DC']
## convert market excel EBA reports to PDF
o = win32com.client.Dispatch("Excel.Application")
o.Visible = False
for i in msa_list_ea:
    if i == "Dallas/Ft. Worth":
        i = "Dallas-Ft. Worth"
    if i == "Newark":
        i = "Northern New Jersey"
    wb_path = r'G:/Team/EBAs/{}/Excel/{}_EBA_{}.xlsx'.format(curyq, i, curyq)
    wb = o.Workbooks.Open(wb_path)
    ws_index_list = [1] #chooses which sheet in workbook to print (counting begins at 1)
    path_to_pdf = r'G:/Team/EBAs/{}/PDF/{}_EBA_{}.pdf'.format(curyq, i, curyq) ## path to save pdf file
    wb.WorkSheets(ws_index_list).Select()
    wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
    wb.Close(False)
    print("{}".format(i))
This prints correctly in my terminal, no %20s here.
I assume your raw string literal is not respected by the external call to Excel in wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf).
Try adding quotes:
path_to_pdf = r'"G:/Team/EBAs/{}/PDF/{}_EBA_{}.pdf"'.format(curyq, i, curyq) ## path to save pdf file
I had the same issue when running similar code on a Windows machine. The path was using forward slashes. Using double backslashes solved the problem.
To make it OS-independent, I used the os and pathlib modules to format the path correctly:
path_to_pdf = os.fspath(Path(path_to_pdf))
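The one-liner above needs `import os` and `from pathlib import Path`, and it only rewrites the separators when it actually runs on Windows. As a small illustration of what the conversion does, the sketch below uses `PureWindowsPath`, which applies Windows path rules on any platform; the helper name is hypothetical, and whether this cures the %20 problem in every setup is only what the answer reports.

```python
from pathlib import PureWindowsPath


def to_windows_path(path_text):
    """Return the path with Windows separators (forward slashes become backslashes)."""
    return str(PureWindowsPath(path_text))


pdf = to_windows_path("G:/Team/EBAs/2017Q4/PDF/Fort Worth_EBA_2017Q4.pdf")
print(pdf)  # G:\Team\EBAs\2017Q4\PDF\Fort Worth_EBA_2017Q4.pdf
```

The space in "Fort Worth" is left untouched; only the separators change.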

Python - save different sheets of an excel file as individual excel files

Newbie: I have an Excel file with more than 100 different sheets. Each sheet contains several tables and charts.
I wish to save every sheet as a new Excel file.
I tried many Python snippets, but none of them worked.
Kindly help with this. Thanks!
Edit 1: In response to comments, this is what I tried:
import pandas as pd
import xlrd
inputFile = r'D:\Excel\Complete_data.xlsx'
#getting sheet names
xls = xlrd.open_workbook(inputFile, on_demand=True)
sheet_names = xls.sheet_names()
path = "D:/Excel/All Files/"
#create a new excel file for every sheet
for name in sheet_names:
    parsing = pd.ExcelFile(inputFile).parse(sheet_name = name)
    #writing data to the new excel file
    parsing.to_excel(path+str(name)+".xlsx", index=False)
To be precise, the problem is coming in copying tables and charts.
I have just worked through this issue, so I will post my solution. I do not know how it will affect charts etc.
import os
import xlrd
from xlutils.copy import copy
import xlwt
path = "path/to/files/" #placeholder: place path where files to split up are
targetdir = (path + "New_Files/") #where you want your new files
if not os.path.exists(targetdir): #makes your new directory
    os.makedirs(targetdir)
for root,dir,files in os.walk(path, topdown=False): #all the files you want to split
    xlsfiles=[f for f in files] #can add selection condition here
for f in xlsfiles:
    wb = xlrd.open_workbook(os.path.join(root, f), on_demand=True)
    for sheet in wb.sheets(): #cycles through each sheet in each workbook
        newwb = copy(wb) #makes a temp copy of that book
        newwb._Workbook__worksheets = [ worksheet for worksheet in newwb._Workbook__worksheets if worksheet.name == sheet.name ]
        #brute force, but strips away all other sheets apart from the sheet being looked at
        #splitext drops the extension; str.strip(".xls") would also eat leading/trailing x, l, s characters
        newwb.save(targetdir + os.path.splitext(f)[0] + sheet.name + ".xls")
        #saves each sheet as the original file name plus the sheet name
Not particularly elegant but worked well for me and gives easy functionality. Hopefully useful for someone.
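One detail worth care when deriving output names from the source file name: `str.strip(".xls")` removes any of the characters `.`, `x`, `l`, `s` from both ends of the string rather than the `.xls` suffix, so some file names come out mangled. A small sketch of a safer helper follows; the function name and its parameters are hypothetical.

```python
import os


def output_name(source_file, sheet_name, ext=".xls"):
    """Build '<source name without extension><sheet name><ext>'."""
    stem = os.path.splitext(source_file)[0]
    return stem + sheet_name + ext


print("sales.xls".strip(".xls"))        # ale
print(output_name("sales.xls", "Q1"))   # salesQ1.xls
```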

Saving excel work book not working in python

for sheet_name in book.sheet_names():
    for index in range(len(tabs)):
        tab = tabs[index]
        if sheet_name == tab:
            dump_file_name = dump_files[index]
            dump_file_name = file_prefix+dump_file_name
            sheet = book.sheet_by_name(sheet_name)
            new_book = Workbook()
            sheet1 = new_book.add_sheet("Sheet 1")
            for row in range(sheet.nrows):
                values = []
                for col in range(sheet.ncols):
                    sheet1.write(row,col,sheet.cell(row,col).value)
            xlsx_file_name = dirname+"/"+dump_file_name+".xlsx"
            sheet1.title = xlsx_file_name
            new_book.save(xlsx_file_name)
The file is created and the data is there, but if I open it in OpenOffice.org and click the save button, it asks for a new name.
The file also cannot be read by PHP. If I open it and save it under a new name, it works perfectly. I think something has to be added to the code so that PHP can use the file.
I googled and found the solution here:
http://xlsxwriter.readthedocs.org/getting_started.html
This is exactly what I wanted: creating and saving files in xlsx format.
Now it's working perfectly.
original source
How to save Xlsxwriter file in certain path?
important link:
https://pypi.python.org/pypi/PyExcelerate
