How to write to an open Excel file using Python? - python

I am using openpyxl to write to a workbook. But that workbook needs to be closed in order to edit it. Is there a way to write to an open Excel sheet? I want to have a button that runs a Python code using the commandline and fills in the cells.
The current process that I have built is using VBA to close the file and then Python writes it and opens it again. But that is inefficient. That is why I need a way to write to open files.

If you're a Windows user there is a very easy way to do this. If we use the Win32 Library we can leverage the built-in Excel Object VBA model.
Now, I am not sure exactly how your data looks or where you want it in the workbook but I'll just assume you want it on the sheet that appears when you open the workbook.
For example, let's imagine I have a Panda's DataFrame that I want to write to an open Excel Workbook. It would like the following:
import win32com.client
import pandas as pd
# Create an instance of the Excel Application & make it visible.
ExcelApp = win32com.client.GetActiveObject("Excel.Application")
ExcelApp.Visible = True
# Open the desired workbook
workbook = ExcelApp.Workbooks.Open(r"<FILE_PATH>")
# Take the data frame object and convert it to a recordset array
rec_array = data_frame.to_records()
# Convert the Recordset Array to a list. This is because Excel doesn't recognize
# Numpy datatypes.
rec_array = rec_array.tolist()
# It will look something like this now.
# [(1, 'Apple', Decimal('2'), 4.0), (2, 'Orange', Decimal('3'), 5.0), (3, 'Peach',
# Decimal('5'), 5.0), (4, 'Pear', Decimal('6'), 5.0)]
# set the value property equal to the record array.
ExcelApp.Range("F2:I5").Value = rec_array
Again, there are a lot of things we have to keep in mind as to where we want it pasted, how the data is formatted and a whole host of other issues. However, at the end of the day, it is possible to write to an open Excel file using Python if you're a Windows' user.

Generally, two different processes shouldn't not be writing to the same file because it will cause synchronization issues.
A better way would be to close the existing file in parent process (aka VBA code) and pass the location of the workbook to python script.
The python script will open it and write the contents in the cell and exit.

No this is not possible because Excel files do not support concurrent access.

I solved this doing the follow: Create an intermediary excel file to recieve data from python and then create a connexion between this file and the main file. The excel has a tool that allow automatically refresh imported data from another workbook. Look this LINK
wb = openpyxl.load_workbook(filename='meanwhile.xlsm', read_only=False, keep_vba=True)
...
wb.save('meanwhile.xlsm')
In sequence open your main excel file:
On the Data tab, create a connexion with the "meanwhile" workbook, then in the Connections group, click the arrow next to Refresh, and then click Connection Properties.
Click the Usage tab.
Select the Refresh every check box, and then enter the number of minutes between each refresh operation.

Using below code I have achieved, writing the Excel file using python while it is open in MS Execl.
This solution is for Windows OS, not sure for others.
from kiteconnect import KiteConnect
import xlwings as xw
wb = xw.Book('winwin_safe_trader_youtube_watchlist.xlsx')
sht = wb.sheets['Sheet1']
stocks_list = sht.range('A2').expand("down").value
watchlist = []
time.sleep(10)
for name in stocks_list:
symbol = "NSE:" + name
watchlist.append(symbol)
print(datetime.datetime.today().time())
data = kite.quote(watchlist)
df = pd.DataFrame(data).transpose()
df = df.drop(['depth', 'ohlc'], 1)
print(df)
sht.range('B1').value = df
time.sleep(1)
wb.save('winwin_safe_trader_youtube.xlsx')

Related

How to filter in Excel using Python?

I am trying to automate a macro-enabled Excel workbook process using Python. I would like to use win32com if possible but am open to switching to other libraries if needed.
Once I get the workbook open and on the sheet I need, there is data already there with auto-filters applied. I just need to filter on a column to make the data available to the macro when I run it.
I use wb.RefreshAll() to import the data from existing connections. Eventually I will need to pass a value entered by the user to the filter as it will be different each time the automation runs.
Most solutions involve copying select data to a Pandas DataFrame etc. but I need the filtered data to remain in the sheet so it can be used by the macro.
I recently wrote a Python script to automate Macros. Basically, the idea was to batch-edit a collection of .doc files and have a macro run for all of them, without having to open them one by one.
First:
What is this macro you want to run?
Second:
What is the data that you need to make visible, and what do you mean by that?
To get you started, try this:
data = [{"test1": 1, "test2": 2}, {"test1": 3, "test2": 4}]
import win32com.client as win32
def openExcel():
xl = win32.gencache.EnsureDispatch('Excel.Application')
wb = xl.Workbooks.Add()
#wb = xl.Workbooks.Open(filepath)
ws = wb.Sheets(1) #The worksheet you want to edit later
xl.Visible = True
return ws
def print2Excel(datapoint:dict, ws):
print(datapoint)
const = win32.constants #.Insert()-Methods keywargs are packaged into const.
ws.Range("A1:B1").Insert(const.xlShiftDown, const.xlFormatFromRightOrBelow)
ws.Cells(1,1).Value = datapoint["test1"]
ws.Cells(1,2).Value = datapoint["test2"]
ws = openExcel() #<- When using Open(filepath), pass the whole filepath, starting at C:\ or whatever drive.
for datapoint in data:
print2Excel(datapoint, ws)
This showcases some of the basics on how to work with Excel objects in win32com

How to split an Excel workbook by worksheet while preserving grouping

I am doing some excel reports for work and am given a book exported from SSRS daily. The book is nicely set up, with groupings applied to every sheet for an effect similar to pivot tables.
However the book comes with 32 sheets, and I eventually need to send out each sheet individually as a distinct report. Right now I am splitting them up manually, but I am wondering if there is a way to automate this while preserving the grouping.
I previously tried something like:
import xlrd
import pandas as pd
targetWorkbook = xlrd.open_workbook(r'report.xlsx', on_demand=True)
xlsxDoc = pd.ExcelFile('report.xlsx')
for sheet in targetWorkbook.sheet_names():
reportDF = pd.read_excel(xlsxDoc, sheet)
reportDF.to_excel("report - {}.xlsx".format(sheet))
However since I'm converting each sheet to a pandas datagrams, the grouping is lost.
There are multiple ways to read/interact with excel docs in python, but I can't find a clear way to pick out a sheet and save it as its own document without losing the grouping.
This is my full answer. I have used the Worksheets().Move() method. The main idea is to use win32com.client library.
This was tested and works on my Windows 10 system with Excel 2013 installed, and Python 3.7. The grouping format was moved intact with the worksheets. I am still working on getting the looping to work. I will revise my answer again when I get the looping to work.
My example has 3 worksheets, each with different grouping (subtotal) formats.
#
# Refined .Move() method, save new file using Active Worksheet property.
#
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb0 = excel.Workbooks.Open(r'C:\python\so\original.xlsx')
excel.Visible = True
# Move sheet1.
wb0.Worksheets(1).Move()
excel.Application.ActiveWorkbook.SaveAs(r'C:\python\so\sheet1.xlsx')
# Move sheet2, which is now the front sheet.
wb0.Worksheets(1).Move()
excel.Application.ActiveWorkbook.SaveAs(r'C:\python\so\sheet2.xlsx')
# Save single remaining sheet as sheet3.
wb0.SaveAs(r'C:\python\so\sheet3.xlsx')
wb0.Close()
excel.Application.Quit()
You would also need to install pywin32, which is not a standard library item.
https://github.com/mhammond/pywin32
pip install pywin32

How to put a value to an excel cell?

You'll probably laugh at me, but I am sitting on this for two weeks. I'm using python with pandas.
All I want to do, is to put a calculated value in a pre-existing excel file to a specific cell without changing the rest of the file. That's it.
Openpyxl makes my file unusable (means, I can not open because it's "corrupted" or something) or it plainly delets the whole content of the file. Xlsxwriter cannot read or modify pre-existing files. So it has to be pandas.
And for some reason I can't use worksheet = writer.sheets['Sheet1'], because that leads to an "unhandled exception".
Guys. Help.
I tried a bunch of packages but (for a lot of reasons) I ended up using xlwings. You can do pretty much anything with it in python that you can do in Excel.
Documentation link
So with xlwings you'd have:
import xlwings as xw
# open app_excel
app_excel = xw.App(visible = False)
# open excel template
wbk = xw.Book( r'stuff.xlsx' )
# write to a cell
wbk.sheets['Sheet1'].range('B5').value = 15
# save in the same place with the same name or not
wbk.save()
wbk.save( r'things.xlsx' )
# kill the app_excel
app_excel.kill()
del app_excel
Let me know how it goes.

From Python web app: insert data into spreadsheet (e.g. LibreOffice / Excel), calculate and save as pdf

I am facing the problem, that I would like to push data (one large dataframe and one image) from my python web app (running on Tornado Webserver and Ubuntu) into a spreadsheet, calculate, save as pdf and the deliver to the frontend.
I took a look at several libs like openpyxl for writing Sheets in MS Excel, but that would solve just one part. I was thinking about using LibreOffice and pyoo, but it seems that I need the same python version on my backend as shipped with LibeOffice when importing pyuno.
Does somebody has solved a similar issue and have a recommendation how to solve this?
Thanks
I came up to a let's say not pretty, but rare solution that works very flexible for me.
use openpyxl to open an existing Excel workbook that includes layout (Template)
insert the dataframe into a separate sheet in that workbook
use openpyxl to save as temporary_file.xlsx
call LibeOffice with --headless --convert-to pdf temporary_file.xlsx
While executing the last call, all integrated formulas are recalculated/updated and the pdf is created (you have to configure calc so that auto calc is enabled when files are opened)
deliver pdf to frontend or process as you like
delete temporary_file.xlsx
import openpyxl
import pandas as pd
from subprocess import call
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
now = datetime.datetime.now().strftime("%Y%m%d_%H%M_%f")
wb_template_name = 'Template.xlsx'
wb_temp_name = now + wb_template_name
wb = openpyxl.load_workbook(wb_template_name)
ws = wb['dataframe_sheet']
pdf_convert_cmd = 'soffice --headless --convert-to pdf ' + wb_temp_name
for r in dataframe_to_rows(df, index=True, header=True):
ws.append(r)
wb.save(wb_temp_name)
call(pdf_convert_cmd, shell=True)
The reason why I'm doing this, is that I would like to be able to style the layout of the pdf independently from the data. I use named ranges or lookups that are referenced to the separate dataframe-sheet in excel.
I didn't try the image insertion yet, but this should work similar. I think there could be a way to increase the performance while simply dump the dataframe into the xlsx file (which is a zipped file of xmls), so that you don't need openpyxl.

Connecting Excel with Python

Using code below, I can get the data to print.
How would switch code to xlrd?
How would modify this code to use a xls file that is already open and visible.
So, file is open first manually, then script runs.
And, gets updated.
and then get pushed into Mysql
import os
from win32com.client import constants, Dispatch
import numpy as np
#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = "C:\\xtest\\example.xls"
ROW_SPAN = (1, 16)
COL_SPAN = (1, 6)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
xldata = [[ws.Cells(row, col).Value
for col in xrange(COL_SPAN[0], COL_SPAN[1])]
for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]
#print xldata
a = np.asarray(list(xldata), dtype='object')
print a
If you mean that you want to modify the current file, I'm 99% sure that is not possible and 100% sure that it is a bad idea. In order to alter a file, you need to have write permissions. Excel creates a file lock to prevent asynchronous and simultaneous editing. If a file is open in Excel, then the only thing which should be modifying that file is... Excel.
If you mean that you want to read the file currently in the editor, then that is possible -- you can often get read access to a file in use, but it is similarly unwise -- if the user hasn't saved, then the user will see one set of data, and you'll have another set of data on disk.
While I'm not a fan of VB, that is a far better bet for this application -- use a macro to insert the data into MySQL directly from Excel. Personally, I would create a user with insert privileges only, and then I would try this tutorial.
If you want to manipulate an already open file, why not use COM?
http://snippets.dzone.com/posts/show/2036
http://oreilly.com/catalog/pythonwin32/chapter/ch12.html

Categories