I am trying to write a matrix prepared with NumPy to an Excel file. I need to specify the range of cells into which the matrix is written.
Specifically, I need to write the matrix to cells A4:Z512 in sheet 4 of the Excel file.
A standard Excel file has 3 sheets, so I first need to add a 4th sheet and then write the matrix to it.
Is there a way to do this in Python 2.7? Is it possible to do this with pure NumPy or not?
I have not used NumPy, so I am not sure whether it can manipulate Excel files. For working with Excel files, though, I recommend the win32com library (part of pywin32). Below is some code I have used in the past to make the win32com API easier to use. Feel free to use the code yourself. Hope this helps!
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
def openExcel(makeExcelVisible=True):
    excel.Visible = makeExcelVisible
def closeExcel():
    excel.Application.Quit()
class ExcelFile(object):
    # opens up a workbook to work on, not selecting a file name creates a new one
    def __init__(self, fileName=None):
        if fileName is None:
            self.wb = excel.Workbooks.Add()
        else:
            self.wb = excel.Workbooks.Open(fileName)
        self.ws = None
    def addWorksheet(self):
        # adds a new worksheet to the workbook and makes it the current worksheet
        self.ws = self.wb.Worksheets.Add()
    def selectWorksheet(self, worksheetName):
        # selects a worksheet to work on
        self.ws = self.wb.Worksheets(worksheetName)
    def renameWorksheet(self, worksheetName):
        # renames current worksheet
        self.ws.Name = worksheetName
    def save(self):
        # saves the workbook
        self.wb.Save()
    def saveAs(self, fileName):
        # saves the workbook with the file name
        self.wb.SaveAs(fileName)
    def close(self):
        # closes the workbook
        self.wb.Close()
    def insertIntoCell(self, cellRow, cellCol, data):
        self.ws.Cells(cellRow,cellCol).Value = data
    def clearCell(self, cellRow, cellCol):
        self.ws.Cells(cellRow,cellCol).Value = None
Here is an example of how to use the code above. It creates a new Excel file, renames the worksheets, adds information into the first cell on each worksheet and saves the file as "test.xlsx". Because the file name is relative, the file goes to Excel's default save location (usually your Documents folder).
worksheets = ["Servers", "Printers", "Drives", "IT Phones"]
information = ["Server WS", "Printer WS", "Driver WS", "IT Phone WS"]
def changeCells(information):
    excelFile = ExcelFile()
    for index in xrange(len(worksheets)):
        sheetNumber = index + 1
        if sheetNumber == 4:
            excelFile.addWorksheet()
        excelFile.selectWorksheet("Sheet%d" % sheetNumber)
        excelFile.renameWorksheet(worksheets[index])
        excelFile.insertIntoCell(1,1,information[index])
    excelFile.saveAs("test.xlsx")
    excelFile.close()
openExcel()
changeCells(information)
closeExcel()
Also, I would recommend looking at the API for win32com yourself. It's a very nice and useful library.
I put together the actual code you would need for entering your matrix on Sheet4 to A4:Z512.
def addMatrix(matrix):
    # use ExcelFile("fileName.xlsx") if you need to add to a specific file
    excelFile = ExcelFile()
    excelFile.addWorksheet()
    # default excel files only have 3 sheets so had to add one
    excelFile.selectWorksheet("Sheet4")
    # xrange(4,513) since end index is exclusive
    for row in xrange(4,513):
        # 1 for A, 26 for Z
        for col in xrange(1,27):
            mRow = row - 4
            mCol = col - 1
            excelFile.insertIntoCell(row, col, matrix[mRow][mCol])
    excelFile.saveAs("test.xlsx")
    excelFile.close()
matrix = list()
for row in xrange(509):
    matrix.append([])
    for col in xrange(26):
        matrix[row].append(0)
# the matrix is now filled with zeros
# use openExcel(False) to run faster, it won't open a window while running
openExcel()
addMatrix(matrix)
closeExcel()
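The loop above makes one COM call per cell (509 x 26 = 13,234 calls), which can be slow. Excel's Range.Value also accepts a two-dimensional sequence, so the whole block can be assigned in one call. The following is only a sketch under assumptions: column_letter, block_address and write_block are hypothetical helper names that are not part of the code above, and ws is expected to be a worksheet object such as excelFile.ws. A NumPy array would first be converted with matrix.tolist().

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> 'A', 27 -> 'AA')."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def block_address(top_row, left_col, n_rows, n_cols):
    """Return the A1-style address of an n_rows x n_cols block of cells."""
    first = "{0}{1}".format(column_letter(left_col), top_row)
    last = "{0}{1}".format(column_letter(left_col + n_cols - 1),
                           top_row + n_rows - 1)
    return "{0}:{1}".format(first, last)


def write_block(ws, top_row, left_col, rows):
    """Assign a rectangular list of rows to a worksheet range in one call.

    `ws` is assumed to expose Range(address) returning an object with a
    writable Value attribute, as an Excel COM worksheet does.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same length")
    address = block_address(top_row, left_col, n_rows, n_cols)
    ws.Range(address).Value = tuple(tuple(row) for row in rows)
    return address
```

With the helper class above, write_block(excelFile.ws, 4, 1, matrix) would target A4:Z512 for a 509 x 26 matrix.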
Related
I am trying to update an Excel sheet using openpyxl. When I read a formula-based cell after the update, I get None. The updated value does not seem to be saved, even though I call openpyxl's save method.
import openpyxl
# data_only=False to update excel file
def write_cell(data_only):
    wb_obj = openpyxl.load_workbook("mydata.xlsx", data_only=data_only)
    sheet_obj = wb_obj["Sheet1"]
    sheet_obj = wb_obj.active
    sheet_obj.cell(row = 1, column = 1).value = 8
    wb_obj.save(filename="mydata.xlsx")
# data_only=True to read excel file
def read_cell(data_only):
    wb_obj = openpyxl.load_workbook("mydata.xlsx", data_only=data_only)
    sheet = wb_obj["Sheet1"]
    # Formula at column 2 : =A1*5
    val = sheet.cell(row = 1, column = 2).value
    return val
write_cell(False)
print(read_cell(True))
Actual Output -> None
Expected output -> 40
There are two solutions to this:
According to the openpyxl documentation, a cell read from a workbook gives you either the formula or the value last computed for it (data_only=True), never both. openpyxl does not evaluate formulas, so after modifying a file that contains formulas you must open it in an application such as Excel and save it again; that recalculates and stores the formula results. Reading the formula cell with data_only=True then returns the value instead of None.
Another solution is to open the excel file and save it from the script itself after saving it using openpyxl:
import os
from win32com.client import Dispatch
import openpyxl
def write_cell(data_only):
    wb_obj = openpyxl.load_workbook("mydata.xlsx", data_only=data_only)
    sheet_obj = wb_obj["Sheet1"]
    sheet_obj = wb_obj.active
    sheet_obj.cell(row = 1, column = 1).value = 8
    wb_obj.save(filename="mydata.xlsx")
    open_save("mydata.xlsx")
def open_save(filename):
    """Function to open and save the excel file"""
    xlApp = Dispatch("Excel.Application")
    xlApp.Visible = False
    # Excel resolves relative paths against its own working directory,
    # not the script's, so pass an absolute path
    xlBook = xlApp.Workbooks.Open(os.path.abspath(filename))
    xlBook.Save()
    xlBook.Close()
I am quite new to Python and am writing code to replace a VBA process that takes 5 to 6 hours to complete. The code needs to open password-protected Excel files, extract certain sheet and cell data to a master sheet, and, if the number in column A already exists there, overwrite that row so there are no duplicates:
Process:
Step 1: Open a password-protected xls file.
Step 2: Check for a duplicate number in column A and, if the same value exists, overwrite it; copy the required cells from each sheet to the master workbook's Data sheet as shown below.
Step 3: Go back to step 1 until all xls files are done.
This is part of the VBA to show the process to a degree:
wbThis.Worksheets("Data").Range("A" & Store_Row_no) = NewNumber
wbThis.Worksheets("Data").Range("B" & Store_Row_no) = DateNew
wbThis.Worksheets("Data").Range("C" & Store_Row_no) = wbNew.Worksheets("Sheet1").Range("F2").Value
wbThis.Worksheets("Data").Range("D" & Store_Row_no) = wbNew.Worksheets("Sheet2").Range("H152").Value
wbThis.Worksheets("Data").Range("E" & Store_Row_no) = wbNew.Worksheets("Sheet3").Range("D3").Value
This is my current code, but I can't work out how to open a password-protected Excel file, copy to the master sheet, and then overwrite the row if the value in column A is a duplicate.
Python code so far:
import win32com.client
import sys
import os
foldername = ('C:\\Users\\')
password = 'ORANGE'
pmaster = (r'C:\Users')
xlApp = win32com.client.Dispatch("Excel.Application")
xlApp.Visible = False
master = xlApp.Workbooks.Open(Filename=pmaster)
wb = xlApp.Workbooks.Open(foldername, False, True, None, password)
sh1 = wb.Sheets('sheet1') #sheet name1
sh2 = wb.Sheets('sheet2') #sheet name2
sh3 = wb.Sheets('sheet3') #sheet name3
out1 = sh1.Range("B2").value
out2 = sh1.Range("D2").value
out3 = sh1.Range("F2").value
out4 = sh2.Range("H152").value
out5 = sh3.Range("D3").value
print(out1,out2,out3,out4,out5)
I just need help looping through the files and copying the data to the new master workbook.
Thank you so much in advance
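The "overwrite if column A already holds the same number" step can be sketched independently of Excel. This is a minimal illustration under assumptions, not a full solution: upsert_row, master_rows and index_by_key are hypothetical names, each row is a list whose first element is the column A value, the sample values are made up, and the resulting rows would still have to be written back to the Data sheet.

```python
def upsert_row(master_rows, index_by_key, new_row):
    """Append new_row, or overwrite the existing row with the same first value."""
    key = new_row[0]
    if key in index_by_key:
        # same number already stored: replace that row in place
        master_rows[index_by_key[key]] = new_row
    else:
        # first time this number is seen: remember its position and append
        index_by_key[key] = len(master_rows)
        master_rows.append(new_row)


master_rows = []      # rows destined for the master "Data" sheet
index_by_key = {}     # column A value -> position in master_rows

# made-up sample rows: (number, date, three extracted cell values)
upsert_row(master_rows, index_by_key, [1001, "2020-01-01", "a", "b", "c"])
upsert_row(master_rows, index_by_key, [1002, "2020-01-02", "d", "e", "f"])
upsert_row(master_rows, index_by_key, [1001, "2020-01-03", "g", "h", "i"])
```

Keeping the lookup in a dictionary avoids rescanning column A for every new file.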
All I want to do is copy a worksheet from an excel workbook to another excel workbook in Python.
I want to maintain all formatting (coloured cells, tables etc.)
I have a number of excel files and I want to copy the first sheet from all of them into one workbook. I also want to be able to update the main workbook if changes are made to any of the individual workbooks.
It's a code block that will run every few hours and update the master spreadsheet.
I've tried pandas, but it doesn't maintain formatting and tables.
I've tried openpyxl to no avail
I thought xlwings code below would work:
import xlwings as xw
wb = xw.Book('individual_files\\file1.xlsx')
sht = wb.sheets[0]
new_wb = xw.Book('Master Spreadsheet.xlsx')
new_wb.sheets["Sheet1"] = sht
But I just get the error:
----> 4 new_wb.sheets["Sheet1"] = sht
AttributeError: __setitem__
"file1.xlsx" above is an example first excel file.
"Master Spreadsheet.xlsx" is my master spreadsheet with all individual files.
In the end I did this:
from copy import copy
from openpyxl import load_workbook
from openpyxl.utils import get_column_letter
def copyExcelSheet(sheetName):
    # `item` is assumed to hold the path of the source workbook (set by the caller)
    read_from = load_workbook(item)
    #open(destination, 'wb').write(open(source, 'rb').read())
    read_sheet = read_from.active
    write_to = load_workbook("Master file.xlsx")
    write_sheet = write_to[sheetName]
    for row in read_sheet.rows:
        for cell in row:
            new_cell = write_sheet.cell(row=cell.row, column=cell.column,
                                        value= cell.value)
            write_sheet.column_dimensions[get_column_letter(cell.column)].width = read_sheet.column_dimensions[get_column_letter(cell.column)].width
            if cell.has_style:
                new_cell.font = copy(cell.font)
                new_cell.border = copy(cell.border)
                new_cell.fill = copy(cell.fill)
                new_cell.number_format = copy(cell.number_format)
                new_cell.protection = copy(cell.protection)
                new_cell.alignment = copy(cell.alignment)
    write_sheet.merge_cells('C8:G8')
    write_sheet.merge_cells('K8:P8')
    write_sheet.merge_cells('R8:S8')
    write_sheet.add_table(newTable("table1","C10:G76","TableStyleLight8"))
    write_sheet.add_table(newTable("table2","K10:P59","TableStyleLight9"))
    write_to.save('Master file.xlsx')
    read_from.close()
With this to check if the sheet already exists:
#checks if sheet already exists and updates sheet if it does.
def checkExists(sheetName):
    book = load_workbook("Master file.xlsx") # open an Excel file and return a workbook
    if sheetName in book.sheetnames:
        print ("Removing sheet",sheetName)
        del book[sheetName]
    else:
        print ("No sheet ",sheetName," found, will create sheet")
    book.create_sheet(sheetName)
    book.save('Master file.xlsx')
With this to create new tables:
import random
import string
from openpyxl.worksheet.table import Table, TableStyleInfo
def newTable(tableName,ref,styleName):
    # append a random suffix so every table name in the workbook is unique
    tableName = tableName + ''.join(random.choices(string.ascii_uppercase + string.digits + string.ascii_lowercase, k=15))
    tab = Table(displayName=tableName, ref=ref)
    # Add a default style with striped rows and banded columns
    tab.tableStyleInfo = TableStyleInfo(name=styleName, showFirstColumn=False,showLastColumn=False, showRowStripes=True, showColumnStripes=True)
    return tab
Adapted from this solution, but note that in my (limited) testing (and as observed in the other Q&A), this does not support the After parameter of the Copy method, only Before. If you try to use After, it creates a new workbook instead.
import xlwings as xw
wb = xw.Book('individual_files\\file1.xlsx')
sht = wb.sheets[0]
new_wb = xw.Book('Master Spreadsheet.xlsx')
# copy this sheet into the new_wb *before* Sheet1:
sht.api.Copy(Before=new_wb.sheets['Sheet1'].api)
# now, remove Sheet1 from new_wb
new_wb.sheets['Sheet1'].delete()
This can be done using pywin32 directly. The Before or After parameter needs to be provided (see the api docs), and the parameter needs to be a worksheet <object>, not simply a worksheet Name or index value. So, for example, to add it to the end of an existing workbook:
import win32com.client as win32com_client  # alias assumed from the names used below
def copy_sheet_within_excel_file(excel_filename, sheet_name_or_number_to_copy):
    excel_app = win32com_client.gencache.EnsureDispatch('Excel.Application')
    wb = excel_app.Workbooks.Open(excel_filename)
    wb.Worksheets[sheet_name_or_number_to_copy].Copy(After=wb.Worksheets[wb.Worksheets.Count])
    new_ws = wb.ActiveSheet
    return new_ws
As most of my code runs on end-user machines, I don't like to make assumptions whether Excel is open or not so my code determines if Excel is already open (see GetActiveObject), as in:
try:
    excel_app = win32com_client.GetActiveObject('Excel.Application')
except com_error:
    excel_app = win32com_client.gencache.EnsureDispatch('Excel.Application')
And then I also check to see if the workbook is already loaded (see Workbook.FullName). Iterate through the Application.Workbooks testing the FullName to see if the file is already open. If so, grab that wb as your wb handle.
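A rough sketch of that check follows. The function name find_open_workbook is hypothetical, and the code assumes only that excel_app.Workbooks is iterable and that each workbook exposes FullName, as described above. Paths are normalized before comparison so that differences in case or relative form do not hide a match on Windows.

```python
import os


def find_open_workbook(excel_app, filepath):
    """Return the already-open workbook whose FullName matches filepath, else None."""
    wanted = os.path.normcase(os.path.abspath(filepath))
    for wb in excel_app.Workbooks:
        # FullName is the workbook's full path once it has been saved
        if os.path.normcase(os.path.abspath(wb.FullName)) == wanted:
            return wb
    return None
```

If the function returns None, the workbook is not loaded yet and can be opened with excel_app.Workbooks.Open(filepath) as in the earlier function.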
You might find this helpful for digging around the available Excel APIs directly from pywin32:
def show_python_interface_modules():
    os.startfile(os.path.dirname(win32com_client.gencache.GetModuleForProgID('Excel.Application').__file__))
So I have the same Excel workbook in several different folders, i.e. for each hotel I have a folder and in that folder there's an Excel workbook. Now, I need to go into each folder and change the contents of cell 'B2' in worksheet "Set Up" to the hotel name (which is referred to as hotelname in my code). I try to run the code below, but it gives me the warning "C:\Python34\python-3.4.4.amd64\lib\site-packages\openpyxl\reader\worksheet.py:319: UserWarning: Data Validation extension is not supported and will be removed" and it doesn't change anything in my Excel files.
from openpyxl import load_workbook
log = r'G:\Data\Hotels\hotelnames.txt' ##text file with a list of the hotel names
file = open(log, 'r')
hotelnames = []
line = file.readlines()
for a in line:
    hotelnames.append(a.rstrip('\n'))
for hotel in hotelnames:
    wb = load_workbook("G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings\\"+hotel+"_Action_Log.xlsx", data_only = True)
    ws = wb["Set Up"]
    ws['B2'] = hotel ### I want this cell to have that particular hotel name
    wb.save
Your code only references the method wb.save without calling it, so nothing is ever written to disk. Call it by adding parentheses and passing the file name to save to:
wb.save(filename)
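The difference between referencing and calling a method can be seen in a minimal sketch that needs no Excel file. FakeWorkbook is a hypothetical stand-in used purely for illustration; it is not an openpyxl class, and it only records the file name it was asked to save to.

```python
class FakeWorkbook(object):
    """Stand-in for a workbook; only remembers where it was saved."""

    def __init__(self):
        self.saved_to = None

    def save(self, filename):
        self.saved_to = filename


wb = FakeWorkbook()

wb.save                        # evaluates to a bound method, then discards it
after_reference = wb.saved_to  # still None: nothing was saved

wb.save("Action_Log.xlsx")     # the parentheses actually call the method
after_call = wb.saved_to       # now holds the file name
```

Python raises no error for the bare reference, which is why the original script ran without complaint and still changed nothing.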
Also processing of the input file can be greatly simplified by iterating directly over the file object:
import os.path
with open(log) as f:
    file_spec = os.path.join(r"G:\Data\Hotels", '{0}', '{0} - Meetings', '{0}_Action_Log.xlsx')
    for hotel in f:
        hotel = hotel.rstrip('\n') # probably hotel.rstrip() is sufficient
        wb = load_workbook(file_spec.format(hotel), data_only = True)
        ws = wb["Set Up"]
        ws['B2'] = hotel
        wb.save(file_spec.format(hotel)) # careful, this overwrites the original file, it would be safer to save it to a new file.
When an Excel workbook is opened through the Excel COM object
app = gencache.EnsureDispatch("Excel.Application")
doc = app.Workbooks.Open(filepath)
all print areas are dropped, but they're accessible via VBA when the file is opened normally.
Localized versions of MS Excel ignore print areas and titles when accessed as a COM object, so one must explicitly specify PageSetup.PrintArea|PrintTitleColumns|PrintTitleRows for each worksheet if needed.
for sh in self.doc.Worksheets: #explicitly specify print areas and titles
    for name in sh.Names:
        if name.Name.endswith("!Print_Area"):
            sh.PageSetup.PrintArea = name.RefersTo
        elif name.Name.endswith("!Print_Titles"):
            #protect comma symbol in sheet name
            chunks = name.RefersTo.replace(sh.Name, "[sheet_name]").split(",")
            chunks = [i.replace("[sheet_name]", sh.Name) for i in chunks]
            if len(chunks) == 1:
                try: sh.PageSetup.PrintTitleColumns = chunks[0]
                except: sh.PageSetup.PrintTitleRows = chunks[0]
            else: sh.PageSetup.PrintTitleColumns, sh.PageSetup.PrintTitleRows = chunks
Source: Excel -> PDF (ExportAsFixedFormat)
UPD: Support sheet names with commas
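The comma-protection step can be isolated into a small pure function and checked without Excel. This is only a sketch: split_print_titles is a hypothetical name, and the RefersTo strings in the usage lines are made-up examples of the shape assumed here (column part and row part separated by a comma).

```python
def split_print_titles(refers_to, sheet_name):
    """Split a Print_Titles reference on commas, ignoring commas in the sheet name."""
    placeholder = "[sheet_name]"
    # hide the sheet name so its own commas are not treated as separators
    protected = refers_to.replace(sheet_name, placeholder)
    # split on the remaining commas, then restore the sheet name in each part
    return [chunk.replace(placeholder, sheet_name)
            for chunk in protected.split(",")]


# made-up example: a sheet called "A,B" with title columns and title rows
both = split_print_titles("='A,B'!$A:$B,'A,B'!$1:$2", "A,B")
# made-up example: only title rows are defined
rows_only = split_print_titles("=Sheet1!$1:$2", "Sheet1")
```

A plain split(",") on the first string would yield four fragments instead of two, which is the case the update above addresses.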