Python win32 client saving %20 instead of spaces - python

I have an issue saving the pdf files. The code works to convert excel files to pdf, but it is saving all of my files with %20 instead of spaces. So "Fort Worth" would save as "Fort%20Worth".
Here is the code below. Thanks.
import xlwings as xw
import win32com.client
curyq = "2017Q4"
msa_list_ea = ['Albuquerque','Atlanta','Austin','Baltimore','Boston','Charlotte','Chicago','Cincinnati','Cleveland','Columbus',
'Dallas','Dallas/Ft. Worth','Denver','Detroit','Fort Lauderdale','Fort Worth','Hartford','Houston','Indianapolis',
'Jacksonville','Kansas City','Las Vegas','Long Island','Los Angeles','Louisville','Memphis','Miami','Milwaukee',
'Minneapolis','Nashville','New York','Norfolk','Newark','Oakland','Orange County','Orlando','West Palm Beach',
'Philadelphia','Phoenix','Pittsburgh','Portland','Raleigh','Richmond','Sacramento','Salt Lake City','San Antonio',
'Riverside','San Diego','San Francisco','San Jose','Seattle','St. Louis','Tampa','Tucson','Ventura','Washington, DC']
## convert market excel EBA reports to PDF
o = win32com.client.Dispatch("Excel.Application")
o.Visible = False
for i in msa_list_ea:
    if i == "Dallas/Ft. Worth":
        i = "Dallas-Ft. Worth"
    if i == "Newark":
        i = "Northern New Jersey"
    wb_path = r'G:/Team/EBAs/{}/Excel/{}_EBA_{}.xlsx'.format(curyq, i, curyq)
    wb = o.Workbooks.Open(wb_path)
    ws_index_list = [1] #chooses which sheet in workbook to print (counting begins at 1)
    path_to_pdf = r'G:/Team/EBAs/{}/PDF/{}_EBA_{}.pdf'.format(curyq, i, curyq) ## path to save pdf file
    wb.WorkSheets(ws_index_list).Select()
    wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
    wb.Close(False)
    print("{}".format(i))

This prints correctly in my terminal, no %20s here.
I assume your raw string literal is not respected by the external call to Excel in wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf).
Try adding quotes:
path_to_pdf = r'"G:/Team/EBAs/{}/PDF/{}_EBA_{}.pdf"'.format(curyq, i, curyq) ## path to save pdf file

I had the same issue when running similar code on a Windows machine. The path was using forward slashes; using double backslashes solved the problem.
To keep it from being OS-specific, I used the os and pathlib modules (import os and from pathlib import Path) to format the path correctly:
path_to_pdf = os.fspath(Path(path_to_pdf))
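The backslash fix above can be tried without Excel installed. The following is a minimal sketch, assuming a hypothetical helper name to_windows_path; it uses pathlib.PureWindowsPath so the conversion behaves the same on any OS. Note that %20 is the URL percent-encoding of a space, which suggests (an assumption, not confirmed in this thread) that Excel treated the forward-slash path as URL-like.

```python
from pathlib import PureWindowsPath

def to_windows_path(path):
    """Convert a forward-slash path into a backslash-separated Windows path (hypothetical helper)."""
    return str(PureWindowsPath(path))

curyq = "2017Q4"
market = "Fort Worth"
path_to_pdf = to_windows_path(
    "G:/Team/EBAs/{}/PDF/{}_EBA_{}.pdf".format(curyq, market, curyq)
)
print(path_to_pdf)
```

The resulting string can then be passed to ExportAsFixedFormat in place of the forward-slash version.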

Related

Print specific sheets in excel doc to PDF with Python (xlwings)

I am attempting to automate the very manual process of individually selecting a range of worksheets within an Excel file and printing them to PDF. I was able to string together the following code, which successfully prints the document. However, I cannot figure out how to select specific worksheets within my workbook, so it currently prints the entire workbook to PDF (which comes out to a whopping 897 pages).
Any ideas on how to select certain pages and then print to PDF with a given file name?
import os
import xlwings as xw
book = xw.Book(r'linktomyfile.xlsm')
sheet = book.sheets[0]
current_work_dir = os.getcwd()
pdf_path = os.path.join(current_work_dir, "Report_Date.pdf")
print(f"Saving workbook as '{pdf_path}' ...")
book.api.ExportAsFixedFormat(0, pdf_path)
print(f"Opening PDF file with default application")
Much appreciated!
You can just use the sheet reference to print to pdf, for example:
book = xw.Book(r'linktomyfile.xlsm')
sheet = book.sheets("Sheet1")
current_work_dir = os.getcwd()
pdf_path = os.path.join(current_work_dir, "Report_Date.pdf")
sheet.api.ExportAsFixedFormat(0, pdf_path)
You can also specify a range, e.g.
sheet.range("A1:G15").api.ExportAsFixedFormat(0, pdf_path)
Example of looping through specific sheets:
sheetlist = ["Sheet A", "Sheet B"]
for each in sheetlist:
    pdf_path = os.path.join(current_work_dir, f"{each}.pdf")
    sht = book.sheets(each)
    sht.api.ExportAsFixedFormat(0, pdf_path)
Here each pdf is named after the sheet name.
In newer versions of xlwings, there's a built-in .to_pdf() method. Assuming you've got a book or sheet ready to print:
# to print a whole workbook
myXlwingsWorkBook.to_pdf(r"c:\myOutputPath")
# print a sheet
myXlwingsSheet.to_pdf(r"c:\myOutputPath")
Documentation: Xlwings documentation - then search for "PDF"
There're a few options. I wish I could just print/pdf the first page though...

How do I download an xlsm file and read every sheet in python?

Right now I am doing the following.
import requests
import xlrd
resp = requests.get(url, auth=auth).content # url and auth are defined elsewhere
with open(r'temp.xlsx', 'wb') as output:
    output.write(resp)
xl = xlrd.open_workbook(r'temp.xlsx')
sh = 1
xls = [] # collected sheet names
try:
    for sheet in xl.sheets():
        xls.append(sheet.name)
except Exception:
    xls = ['']
It's extracting the sheets but I don't know how to read the file or if saving the file as an .xlsx is actually working for macros. All I know is that the code is not working right now and I need to be able to catch the data that is being generated in a macro. Please help! Thanks.
I highly recommend using xlwings if you want to open, modify, and save .xlsm files without corrupting them. I have tried a ton of different methods (using other modules like openpyxl) and the macros always end up being corrupted.
import xlwings as xw
app = xw.App(visible=False) # IF YOU WANT EXCEL TO RUN IN BACKGROUND
xlwb = xw.Book('PATH\\TO\\FILE.xlsm')
xlws = {}
xlws['ws1'] = xlwb.sheets['Your Worksheet']
print(xlws['ws1'].range('B1').value) # get value
xlws['ws1'].range('B1').value = 'New Value' # change value
yourMacro = xlwb.macro('YourExcelMacro')
yourMacro()
xlwb.save()
xlwb.close()
Edit - I added an option to keep Excel invisible at users request

Can excel CSV files be injected with macros?

I am trying to inject my CSV files with a macro that automatically fits the column widths in Excel, but nothing is happening. I am using code from a previous post (Use Python to Inject Macros into Spreadsheets). Here is the code:
import os
import sys
# Import System libraries
import glob
import random
import re
#sys.coinit_flags = 0 # comtypes.COINIT_MULTITHREADED
# USE COMTYPES OR WIN32COM
#import comtypes
#from comtypes.client import CreateObject
# USE COMTYPES OR WIN32COM
import win32com
from win32com.client import Dispatch
desktop = os.path.join(os.path.join(os.environ['USERPROFILE']), 'Desktop')
x = r'C:\This\is\the\path'
scripts_dir = x
conv_scripts_dir = x
strcode = \
'''
sub test()
Column.Autofit
end sub
'''
#com_instance = CreateObject("Excel.Application", dynamic = True) # USING COMTYPES
com_instance = Dispatch("Excel.Application") # USING WIN32COM
com_instance.Visible = True#False
com_instance.DisplayAlerts = True#False
for script_file in glob.glob(os.path.join(scripts_dir, '*.csv')):
    print("Processing: %s" % script_file)
    # do the operation in background without actually opening Excel
    (file_path, file_name) = os.path.split(script_file)
    objworkbook = com_instance.Workbooks.Open(script_file)
    xlmodule = objworkbook.VBProject.VBComponents.Add(1)
    xlmodule.CodeModule.AddFromString(strcode.strip())
    objworkbook.SaveAs(os.path.join(conv_scripts_dir, file_name))
com_instance.Quit()
This code actually opens a file in Excel, executes a macro, and then closes the Excel window. Why doesn't the macro work from the Python command line, when it does work inside Excel?
Exactly. There is no formatting whatsoever in a CSV. It's like trying to add formatting to a text file; you can't do that. You can convert CSV files to XLSM (Excel macro-enabled) or XLSB (binary) files:
Sub CSVtoXLSB2()
Dim wb As Workbook
Dim CSVPath As String
Dim sProcessFile As String
CSVPath = "C:\your_path_here\"
sProcessFile = Dir(CSVPath & "*.csv")
Do Until sProcessFile = "" ' Loop until no file found.
Set wb = Application.Workbooks.Open(CSVPath & sProcessFile)
wb.SaveAs CSVPath & Split(wb.Name, ".")(0) & ".xlsb", FileFormat _
:=50, Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
CreateBackup:=False
wb.Close
sProcessFile = Dir() ' Get next entry.
Loop
Set wb = Nothing
End Sub
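The point that a CSV holds no formatting (and therefore nowhere to keep a macro or a column width) can be checked from Python with the standard csv module. This is only an illustrative sketch with made-up sample values: whatever is written comes back as plain comma-separated text, and every cell is read back as a string.

```python
import csv
import io

# Write two rows to an in-memory CSV "file".
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["city", "value"])
writer.writerow(["Fort Worth", 42])

# The entire file content is just this text; there is no slot for
# column widths, cell styles, or VBA modules.
raw_text = buffer.getvalue()

# Reading it back yields strings only, even for the number written above.
rows = list(csv.reader(io.StringIO(raw_text)))
print(repr(raw_text))
print(rows)
```

This is why the workbook has to be saved in a format such as XLSM or XLSB before an injected module can persist.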
Now, if you want to copy a Module from one Workbook to another, follow the steps listed below:
Copying a module from one workbook to another
Open both the workbook that contains the macro you want to copy, and the workbook where you want to copy it.
On the Developer tab, click Visual Basic to open the Visual Basic Editor.
In the Visual Basic Editor, on the View menu, click Project Explorer, or press CTRL+R.
In the Project Explorer pane, drag the module containing the macro you want to copy to the destination workbook. In this case, we're copying Module1 from Book2.xlsm to Book1.xlsm.
[Figure: VBA Project Explorer, showing Module1 copied from Book2.xlsm and the copy of Module1 in Book1.xlsm]

How to disable/autoanswer dialog box about macros/VB Project when resaving xls to xlsx with win32com.client?

When using this code:
import os
import win32com.client as win32
input_files = os.listdir(parent_dir)
input_files = [parent_dir + i for i in input_files if i.endswith('.xls') and not i.endswith('.xlsx')]
for input_file in input_files:
    if not os.path.isfile(input_file.replace('.xls', '.xlsx')):
        excel = win32.gencache.EnsureDispatch('Excel.Application')
        wb = excel.Workbooks.Open(input_file)
        wb.SaveAs(input_file + "x", FileFormat=51) # FileFormat = 51 is for .xlsx extension
        wb.Close() # FileFormat = 56 is for .xls extension
        excel.Application.Quit()
on Excel files containing macros or a VB project, a message box often appears warning that all macros/VB project will be lost. I would like to answer it automatically (for example, with "Yes"), or perhaps there is a parameter for the SaveAs function or a setting for
win32.gencache.EnsureDispatch('Excel.Application')?
For now I can resave the files as xlsm with FileFormat=52, but for security reasons I don't want to do this; I really don't need these macros/VB projects in my files.
I tried excel.DisplayAlerts = False, but it did not help.
I am also thinking about something like pywinauto, but maybe that is overkill and there is a more elegant solution?
Use
wb.Close(True) # if you also need to save the .xls file
or
wb.Close(False) # if you do not want to save the .xls file
On the other hand, Excel may have other files open, so if you call excel.Application.Quit() while those files are unsaved, Excel will show a confirmation dialog before closing.
Interestingly, if you save as xlsm rather than xlsx, Excel does not ask about the macros/VB project, and you can open the xlsm file in openpyxl just as you would an xlsx. Here is how to resave as xlsm:
import win32com.client as win32
def resave_xls_file_as_xlsx_or_xlsm(in_xls_file_path, out_excel_file_type='xlsm'):
    # normalize to backslashes before handing the path to Excel
    in_xls_file_path = in_xls_file_path.replace('/', '\\')
    excel = win32.gencache.EnsureDispatch('Excel.Application')
    wbxls = excel.Workbooks.Open(in_xls_file_path)
    out_xlsx_or_xlsm_file_path = in_xls_file_path + out_excel_file_type[-1]
    # FileFormat = 51 is for .xlsx extension, 52 for xlsm, no questions about containing VB script for xlsm
    if out_excel_file_type == 'xlsm':
        excel_file_format = 52
    elif out_excel_file_type == 'xlsx':
        excel_file_format = 51
    else:
        excel_file_format = 52 # or do some error corrections:
        # print('ERROR, wrong excel file type:', out_excel_file_type)
        # return None # sys.exit ('taihen taihen') # raise cthulhu
    wbxls.SaveAs(out_xlsx_or_xlsm_file_path, FileFormat=excel_file_format)
    wbxls.Close()
    excel.Application.Quit()
    return out_xlsx_or_xlsm_file_path
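The mapping from target extension to FileFormat code can be separated from the COM calls, which makes it easy to check without Excel. A minimal sketch, assuming the hypothetical names FILE_FORMATS and plan_resave; it uses ntpath (the Windows flavor of os.path) so that slashes are converted to backslashes and the extension is swapped with splitext rather than by appending a character.

```python
import ntpath

# Excel FileFormat codes mentioned in this thread.
FILE_FORMATS = {"xlsx": 51, "xlsm": 52, "xls": 56}

def plan_resave(in_path, out_type="xlsm"):
    """Return (output_path, file_format) for resaving a workbook (hypothetical helper)."""
    if out_type not in FILE_FORMATS:
        raise ValueError("unsupported Excel file type: {}".format(out_type))
    # Convert forward slashes to backslashes before the path reaches Excel.
    normalized = ntpath.normpath(in_path)
    root, _ext = ntpath.splitext(normalized)
    return root + "." + out_type, FILE_FORMATS[out_type]

out_path, file_format = plan_resave("c:/test/report.xls", "xlsm")
print(out_path, file_format)
```

The two returned values would then be passed to SaveAs as the file name and the FileFormat argument.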
Also, sometimes an xlsx file is corrupted in some way, Excel complains about it, and the script stops. To auto-recover such a file you can use the code below. Pay attention to the xlsx file paths; for example, the first path here won't work:
resaved_xlsx_on_disk = 'c:/this_wont_work/2.xlsx' # usually works, but not with win32.client
corrupted_xlsx_on_disk = 'c:\\fyi_if_you_dont_use_two_backslashes_and_use_slashes_in_path_it_will_not_open\\1.xlsx'
resaved_xlsx_on_disk = r'c:\you_can_also_use_this\2.xlsx'
xl = win32.gencache.EnsureDispatch('Excel.Application')
# xl.Visible = True # otherwise excel is hidden (btw if corrupted it anyway show some message, but autoclose it)
wb = xl.Workbooks.Open(corrupted_xlsx_on_disk, CorruptLoad=1)
xl.SendKeys("{Enter}", Wait=1)
xl.DisplayAlerts = 0
wb.SaveAs(resaved_xlsx_on_disk) # You can try wb.Save(), but I can't get it to work :-(
wb.Close()
xl.Application.Quit()
===
upd. on 9 aug 2021:
Regarding the code above: in the opposite operation, resaving xlsx to xls, a Compatibility Checker window often appears (and if Excel seems to hang, that may be because this window did not get focus and is hidden behind other windows). TL;DR: excel.DisplayAlerts = False does the job. Code example below (I will also check whether this applies to the xls->xlsx conversion without resorting to the xlsm extension when the xls contains macros):
import os
import win32com.client
import win32com
# if strange errors occur on start, delete everything from this path
# (dynamic dispatch may make this unnecessary now):
print(win32com.__gen_path__)
def resave_xlsx_files_as_xls(files_parent_dir):
    input_files = os.listdir(files_parent_dir) # todo: use the glob, Luke!
    input_files = [files_parent_dir + i for i in input_files if i.endswith('.xlsx')]
    for input_file in input_files:
        print(input_file)
        if not os.path.isfile(input_file.replace('.xlsx', '.xls')):
            excel = win32com.client.dynamic.Dispatch('Excel.Application')
            wbxls = excel.Workbooks.Open(input_file)
            # wbxls.DoNotPromptForConvert = True # seems to have no effect if uncommented
            # wbxls.CheckCompatibility = False # also seems to have no effect if uncommented
            excel.DisplayAlerts = False # this line does the main job: the compatibility check window does not show up
            # AFAIK this library does not cope well with os.path.join or os.path.sep
            input_file = input_file.replace('/', '\\')
            input_file = input_file.replace('.xlsx', '.xls')
            # FileFormat = 51 is for .xlsx extension, 52 for xlsm
            excel_file_format = 56
            # The compatibility check window would show up here, on the SaveAs line. You may not
            # locate it at first, because it usually doesn't get focus, so it may be hidden
            # under a couple of windows. The best way to find it is to click the "-" button on
            # every window until the wallpaper shows up; hotkeys like Win-D will not help,
            # because they hide the compatibility check window as well
            wbxls.SaveAs(input_file, FileFormat=excel_file_format)
            wbxls.Close() # FileFormat = 56 is for .xls extension
            excel.Application.Quit()
        else:
            print('There is already converted file in folder!')
    output_files = os.listdir(files_parent_dir)
    output_files = [files_parent_dir + i for i in output_files if i.endswith('.xls')]
    return output_files
resave_xlsx_files_as_xls('c:/test/')
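The "use the glob" todo in the code above could look roughly like this. It is a sketch only, assuming a hypothetical helper name pending_conversions; it uses pathlib to list the .xlsx files that do not yet have a converted .xls next to them, and it is demonstrated on empty placeholder files in a temporary directory rather than on real workbooks.

```python
import tempfile
from pathlib import Path

def pending_conversions(parent_dir):
    """Return sorted .xlsx paths in parent_dir that have no sibling .xls file yet."""
    parent = Path(parent_dir)
    return sorted(
        p for p in parent.glob("*.xlsx")
        if not p.with_suffix(".xls").exists()
    )

# Demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.xlsx", "b.xlsx", "b.xls"):
        (Path(tmp) / name).touch()
    pending = [p.name for p in pending_conversions(tmp)]

print(pending)
```

Using with_suffix avoids the str.replace call, which would also rewrite a '.xlsx' that happens to appear in a folder name.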

Python parsing XLS with images [duplicate]

I found some Python2 code to extract images from Excel files.
I have a very fundamental question: Where shall I specify the path of my target excel file?
Or does it only work with an active opened Excel file?
import win32com.client # Need pywin32 from pip
from PIL import ImageGrab # Need PIL as well
import os
excel = win32com.client.Dispatch("Excel.Application")
workbook = excel.ActiveWorkbook
wb_folder = workbook.Path
wb_name = workbook.Name
wb_path = os.path.join(wb_folder, wb_name)
#print "Extracting images from %s" % wb_path
print("Extracting images from", wb_path)
image_no = 0
for sheet in workbook.Worksheets:
    for n, shape in enumerate(sheet.Shapes):
        if shape.Name.startswith("Picture"):
            # Some debug output for console
            image_no += 1
            print("---- Image No. %07i ----" % image_no)
            # Sequence number the pictures, if there's more than one
            num = "" if n == 0 else "_%03i" % n
            filename = sheet.Name + num + ".jpg"
            file_path = os.path.join(wb_folder, filename)
            #print "Saving as %s" % file_path # Debug output
            print('Saving as ', file_path)
            shape.Copy() # Copies from Excel to Windows clipboard
            # Use PIL (python imaging library) to save from Windows clipboard
            # to a file
            image = ImageGrab.grabclipboard()
            image.save(file_path,'jpeg')
You can grab images from an existing Excel file like this:
from PIL import ImageGrab
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
workbook = excel.Workbooks.Open(r'C:\Users\file.xlsx')
for sheet in workbook.Worksheets:
    for i, shape in enumerate(sheet.Shapes):
        if shape.Name.startswith('Picture'): # or try 'Image'
            shape.Copy()
            image = ImageGrab.grabclipboard()
            image.save('{}.jpg'.format(i+1), 'jpeg')
An xlsx file is actually a zip file. You can directly get the images from the xl/media subfolder. You can do this in python using the ZipFile class. You don't need to have MS Excel or even run in Windows!
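The zip approach might look like the sketch below, which assumes a hypothetical helper name extract_media. To stay runnable without a real workbook, it builds a tiny stand-in archive in memory with one fake entry under xl/media; with a real file you would pass its path to extract_media instead.

```python
import io
import zipfile

def extract_media(xlsx_file):
    """Return {filename: bytes} for every entry under xl/media/ in an .xlsx archive."""
    images = {}
    with zipfile.ZipFile(xlsx_file) as archive:
        for name in archive.namelist():
            if name.startswith("xl/media/") and not name.endswith("/"):
                images[name.rsplit("/", 1)[-1]] = archive.read(name)
    return images

# Stand-in archive with placeholder content (not a real workbook).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("xl/workbook.xml", "<workbook/>")
    archive.writestr("xl/media/image1.png", b"fake image bytes")
demo.seek(0)

images = extract_media(demo)
print(sorted(images))
```

Each value in the returned dictionary holds the raw image bytes, which can be written to disk with open(name, "wb").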
The file path and file name are defined in these variables:
wb_folder = workbook.Path
wb_name = workbook.Name
wb_path = os.path.join(wb_folder, wb_name)
In this particular case, the script takes the active workbook on the preceding line:
workbook = excel.ActiveWorkbook
But you should, in theory, be able to specify the path using the wb_folder and wb_name variables, as long as you open the file through the Excel COM object (see Python: Open Excel Workbook using Win32 COM Api).
