I am trying to write a Python script that will access and modify the active Excel workbook using the Excel COM interface. However, I am having difficulty getting this to work when there are multiple Excel instances running. For example, the code
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
print(xl.ActiveWorkbook.FullName)
prints out the name of the active workbook from the first running instance of Excel only. What I really want is the workbook that I last clicked on, regardless of what Excel instance it was in.
Thanks.
EDIT FOR COMMENTS
There might be a better way to do this.
Install the excellent psutil
import psutil
excelPids = []
for proc in psutil.process_iter():
    if proc.name() == "EXCEL.EXE": excelPids.append(proc.pid)
Now enumerate the windows, but get the window title and pid.
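import win32gui, win32process
windowPidsAndTitle = []
# GetWindowThreadProcessId lives in win32process and returns (thread id, process id)
win32gui.EnumWindows(lambda hwnd, resultList: resultList.append((win32process.GetWindowThreadProcessId(hwnd)[1], win32gui.GetWindowText(hwnd))), windowPidsAndTitle)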
Now just find the first pid that is in our excelPids
for pid, title in windowPidsAndTitle:
    if pid in excelPids:
        print(title)
        break
END EDITS
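The steps in the edit above could be combined roughly as follows. This is only a sketch: `first_excel_title` and `topmost_excel_title` are hypothetical helper names, skipping invisible and untitled windows is an added assumption, and the Windows-only imports are kept local so that the pure matching logic can be tried anywhere.

```python
def first_excel_title(windows, excel_pids):
    """Return the title of the first window owned by an Excel process.

    `windows` is an iterable of (pid, title) pairs in top-down Z-order,
    which is the order EnumWindows reports them in.
    """
    pids = set(excel_pids)
    for pid, title in windows:
        # Skip untitled helper windows (an assumption, not in the original).
        if pid in pids and title:
            return title
    return None


def topmost_excel_title():
    """Windows only: collect (pid, title) pairs and pick the top Excel window."""
    import psutil
    import win32gui
    import win32process

    excel_pids = [
        proc.pid
        for proc in psutil.process_iter(["name"])
        if (proc.info["name"] or "").upper() == "EXCEL.EXE"
    ]

    windows = []

    def collect(hwnd, acc):
        if win32gui.IsWindowVisible(hwnd):
            # GetWindowThreadProcessId returns (thread id, process id).
            _, pid = win32process.GetWindowThreadProcessId(hwnd)
            acc.append((pid, win32gui.GetWindowText(hwnd)))
        return True  # keep enumerating

    win32gui.EnumWindows(collect, windows)
    return first_excel_title(windows, excel_pids)
```

The returned title still has to be parsed to get the workbook name before it can be passed to GetObject.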
There are a number of things to take into consideration here:
Does one instance have multiple workbooks open? In this case
xl = win32com.client.Dispatch("Excel.Application")
xl.ActiveWorkbook.FullName
will indeed give you the last active workbook.
Or are there separate instances of EXCEL.EXE running? You can get each instance with:
xl = win32com.client.GetObject(None, "Excel.Application") #instance one
xl = win32com.client.GetObject("Name_Of_Workbook") #instance two
But this defeats the purpose, because you need to know the name AND it will not tell you which one last had focus.
Regarding @tgray's comment above: if your Excel instance is guaranteed to be the foreground window, then:
import win32gui
win32gui.GetWindowText(win32gui.GetForegroundWindow())
#parse this and use GetObject to get your excel instance
But in the worst-case scenario, with multiple instances where you have to find which one had focus last, you'll have to enumerate all the windows and find the one you care about:
windows = []
win32gui.EnumWindows(lambda hwnd, resultList: resultList.append(win32gui.GetWindowText(hwnd)),windows)
#enumerates all the windows open from the top down
[i for i in windows if "Microsoft Excel" in i].pop(0)
#this one is closest to the top
Good luck with this one!
With xlwings you can simply do:
import xlwings as xw
print(xw.books.active.name)
This works correctly even if you have multiple instances of Excel open.
Related
I am trying to make a Python script that refreshes a specific file. I have been able to do it with a few sheets, but with a handful of my sheets it seems to break Excel. Here is my code:
import win32com.client
import time
# Start an instance of Excel
x1 = win32com.client.DispatchEx("Excel.Application")
# Open the workbook in said instance of Excel
wb = x1.workbooks.open(r"file path")
x1.Visible = True
# Refresh all data connections.
wb.RefreshAll()
x1.CalculateUntilAsyncQueriesDone()
wb.Save()
wb.Close(True)
time.sleep(5)
x1.Quit()
print("Excel Quit")
What happens is right when I get to x1.CalculateUntilAsyncQueriesDone() Excel just spins and whites out and says "Not responding." I've let it run for 15 minutes and nothing. Usually this query takes about 1 minute if I just simply open the spreadsheet and hit refresh all. Also, if I replace x1.CalculateUntilAsyncQueriesDone() with time.sleep(120) the code works perfectly. For some reason that line is breaking the entire process. I don't want to simply use time.sleep though, because sometimes the refresh will take longer or shorter.
Any help anyone can give would be greatly appreciated.
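One possible way to avoid a fixed time.sleep is to poll for completion with an upper time limit. The sketch below shows only the polling part; `wait_until` is a hypothetical helper, and what to poll is an assumption (for example, whether any query table in the workbook still reports `Refreshing`). It does not explain why CalculateUntilAsyncQueriesDone hangs.

```python
import time


def wait_until(is_done, timeout=300.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call is_done() repeatedly until it returns True or `timeout` seconds pass.

    Returns True if the condition was met, False if the timeout expired.
    `clock` and `sleep` are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Hypothetical usage after wb.RefreshAll(); `query_tables` would have to be
# collected from the workbook's sheets first:
# finished = wait_until(lambda: not any(qt.Refreshing for qt in query_tables))
```

The refresh then takes as long as it needs, up to the timeout, instead of a hard-coded 120 seconds.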
New to Python--
My code won't execute Find and Replace after moving the sheet.
The goal is to bring in a new sheet with formulas, then find and replace the reference to the first book in the formulas. This will allow the formulas to be live in the second book.
Here is what I have so far. It returns "No Values were found", but they are there.
Any pointer in the right direction will help!
Various Functions
from win32com.client import Dispatch
path1 = (r'C:Full Path\Book1.xlsx')
path2 = (r'C:\Full Path\Book2.xlsx')
xl = Dispatch("Excel.Application")
xl.Visible = True
wb1 = xl.Workbooks.Open(Filename=path1)
wb2 = xl.Workbooks.Open(Filename=path2)
ws1 = wb1.Worksheets(1)
ws2 = wb2.Worksheets(1)
ws1.Copy(Before=wb2.Worksheets(1))
wb1.Close(SaveChanges=True)
#Cant get this part to work
ws2.Cells.Replace('C:Full Path\[Book1.xlsx]','')
Replace.Execute(ReplaceAll=1, Forward=True)
wb2.Close(SaveChanges=True)
xl.Quit()
I think the issue is letting Excel know where to execute the Find and Replace.
Find and Replace are coupled in VBA. Do not be fooled by Python. This is a VBA problem, not a Python problem! Once you involve win32com, you have to use VBA methods and some of the syntax.
Replace cannot work on its own; first, you need to use Find to mark the Range that you want to replace.
Usually it goes
[object].Range.Find
.[Text or Feature to find]
.Replace.[what to replace with]
.Replace.Execute
How to do this in Python depends on exactly what you want to find. I do not understand from your question what that actually would be.
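For what it's worth, in Excel's own object model `Range.Replace` takes the search text and the replacement as arguments (`What`, `Replacement`), which is different from the Find / Replace / Execute pattern sketched above. A minimal sketch under that assumption follows; `external_ref_prefix` and `strip_external_refs` are hypothetical helper names, and the prefix format assumes the formulas show the full path because Book1 has already been closed. Also note that `ws2` in the question was obtained before the copy, so it may still refer to Book2's original first sheet rather than the copied one.

```python
import ntpath

XL_PART = 2  # numeric value of Excel's XlLookAt.xlPart (match part of a cell)


def external_ref_prefix(workbook_path):
    """Build the prefix Excel puts in formulas that point at a closed workbook.

    r'C:\\Full Path\\Book1.xlsx' becomes r'C:\\Full Path\\[Book1.xlsx]'.
    """
    folder, filename = ntpath.split(workbook_path)
    return "{}\\[{}]".format(folder.rstrip("\\"), filename)


def strip_external_refs(sheet, workbook_path):
    """Remove the external-workbook prefix from every formula on `sheet`.

    `sheet` is expected to be a win32com Worksheet object.
    """
    return sheet.Cells.Replace(
        What=external_ref_prefix(workbook_path),
        Replacement="",
        LookAt=XL_PART,
    )


# Hypothetical usage, targeting the copied sheet (now first in Book2):
# strip_external_refs(wb2.Worksheets(1), path1)
```

Building the prefix from the path in code also avoids typing backslashes into a non-raw string literal.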
I am using openpyxl to write to a workbook. But that workbook needs to be closed in order to edit it. Is there a way to write to an open Excel sheet? I want to have a button that runs a Python code using the commandline and fills in the cells.
The current process that I have built is using VBA to close the file and then Python writes it and opens it again. But that is inefficient. That is why I need a way to write to open files.
If you're a Windows user there is a very easy way to do this. If we use the Win32 Library we can leverage the built-in Excel Object VBA model.
Now, I am not sure exactly how your data looks or where you want it in the workbook but I'll just assume you want it on the sheet that appears when you open the workbook.
For example, let's imagine I have a pandas DataFrame that I want to write to an open Excel workbook. It would look like the following:
import win32com.client
import pandas as pd
# Attach to the already-running Excel instance & make it visible.
ExcelApp = win32com.client.GetActiveObject("Excel.Application")
ExcelApp.Visible = True
# Open the desired workbook
workbook = ExcelApp.Workbooks.Open(r"<FILE_PATH>")
# Take the data frame object and convert it to a recordset array
rec_array = data_frame.to_records()
# Convert the Recordset Array to a list. This is because Excel doesn't recognize
# Numpy datatypes.
rec_array = rec_array.tolist()
# It will look something like this now.
# [(1, 'Apple', Decimal('2'), 4.0), (2, 'Orange', Decimal('3'), 5.0), (3, 'Peach',
# Decimal('5'), 5.0), (4, 'Pear', Decimal('6'), 5.0)]
# set the value property equal to the record array.
ExcelApp.Range("F2:I5").Value = rec_array
Again, there are a lot of things we have to keep in mind, such as where we want the data pasted and how it is formatted. However, at the end of the day, it is possible to write to an open Excel file using Python if you're a Windows user.
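The range "F2:I5" above is hard-coded to match 4 rows of 4 values. As a sketch of how the address could be derived from the data instead, the hypothetical helper `target_address` below computes it from a top-left cell and the list of row tuples (it assumes all rows have the same length):

```python
import re


def column_letters(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def column_index(letters):
    """Convert Excel column letters to a 1-based index (A -> 1, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def target_address(top_left, rows):
    """Return the A1-style range that `rows` occupies starting at `top_left`."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", top_left)
    if not match:
        raise ValueError("expected a cell address such as 'F2'")
    if not rows or not rows[0]:
        raise ValueError("rows must contain at least one value")
    col, row = match.group(1).upper(), int(match.group(2))
    last_col = column_letters(column_index(col) + len(rows[0]) - 1)
    last_row = row + len(rows) - 1
    return "{}{}:{}{}".format(col, row, last_col, last_row)


# Hypothetical usage with the objects from the answer above:
# ExcelApp.Range(target_address("F2", rec_array)).Value = rec_array
```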
Generally, two different processes shouldn't be writing to the same file, because that causes synchronization issues.
A better way would be to close the existing file in the parent process (i.e. the VBA code) and pass the location of the workbook to the Python script.
The Python script will open it, write the contents to the cells, and exit.
No, this is not possible, because Excel files do not support concurrent access.
I solved this as follows: create an intermediary Excel file to receive data from Python, then create a connection between this file and the main file. Excel has a tool that automatically refreshes data imported from another workbook.
wb = openpyxl.load_workbook(filename='meanwhile.xlsm', read_only=False, keep_vba=True)
...
wb.save('meanwhile.xlsm')
Then open your main Excel file:
On the Data tab, create a connection to the "meanwhile" workbook, then in the Connections group, click the arrow next to Refresh, and then click Connection Properties.
Click the Usage tab.
Select the Refresh every check box, and then enter the number of minutes between each refresh operation.
Using the code below, I managed to write to the Excel file from Python while it is open in MS Excel.
This solution is for Windows; I am not sure about other operating systems.
import datetime
import time
import pandas as pd
from kiteconnect import KiteConnect
import xlwings as xw
# NOTE: `kite` below is assumed to be an authenticated KiteConnect client created elsewhere.
wb = xw.Book('winwin_safe_trader_youtube_watchlist.xlsx')
sht = wb.sheets['Sheet1']
stocks_list = sht.range('A2').expand("down").value
watchlist = []
time.sleep(10)
for name in stocks_list:
    symbol = "NSE:" + name
    watchlist.append(symbol)
print(datetime.datetime.today().time())
data = kite.quote(watchlist)
df = pd.DataFrame(data).transpose()
df = df.drop(['depth', 'ohlc'], axis=1)
print(df)
sht.range('B1').value = df
time.sleep(1)
wb.save('winwin_safe_trader_youtube.xlsx')
I am trying to identify all cells that contain external workbook references, using openpyxl in Python 3.4. But I am failing. My first try consisted of:
def find_external_value(cell):
    # identifies an external link in a given cell
    has_external_value = False
    if '.xls' in cell.value:
        has_external_value = True
    return has_external_value
However, when I print the cell values that have external values to the console, it yields this:
=[1]Sheet1!$B$4
=[2]Sheet1!$B$4
So, openpyxl obviously does not parse formulas containing external values in the way I imagined and since square brackets are used for table formulas, there is no sense in trying to pick up on external links in this manner.
I dug a little deeper and found the detect_external_links function in the openpyxl.workbook.names.external module (reference). I have no idea if one can actually call this function to do what I want.
From the console results it seems as if openpyxl understands that there are references, and seems to contain them in a list of sorts. But can one access this list? Or detect if such a list exists?
Whichever way - all I need is to figure out if a cell contains a link to an external workbook.
I have found a solution to this.
Use the openpyxl library to load the xlsx file:
import openpyxl
wb = openpyxl.load_workbook("Myworkbook.xlsx")
# len(wb._external_links) gives the count of linked workbooks
items = wb._external_links
for index, item in enumerate(items):
    Mystr = wb._external_links[index].file_link.Target
    Mystr = Mystr.replace("file:///", "")
    print(Mystr.replace("%20", " "))
----------------------------
Out[01]: ##Indicates that the workbook has 4 external workbook links##
/Users/myohannan/AppData/Local/Temp/49/orion/Extension Workpapers_Learning Extension Calc W_83180610.xlsx
/Users/lmmeyer/AppData/Local/Temp/orion/Complete Set of Workpapers_PPS Workpapers 123112_111698213.xlsx
\\SF-DATA-2\IBData\TEMP\ie5\Temporary Internet Files\OLK8A\LBO Models\PIGLET Current.xls
/WINNT/Temporary Internet Files/OLK3/WINDOWS/Temporary Internet Files/OLK8304/DEZ.XLS
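To tie this back to individual cells: the `[1]`, `[2]` prefixes in the formulas printed in the question look like 1-based indices into this list of external links. Under that assumption, one possible sketch is to search each formula string for a bracketed number that is not attached to a table name. `external_link_indices` is a hypothetical helper, and the regular expression is a heuristic rather than a full formula parser.

```python
import re

# A bracketed integer not directly preceded by an identifier character,
# so that structured references such as Table1[1] are not matched.
_EXTERNAL_INDEX = re.compile(r"(?<![A-Za-z0-9_.\]])\[(\d+)\]")


def external_link_indices(formula):
    """Return the external-link indices referenced in a formula string."""
    if not isinstance(formula, str) or not formula.startswith("="):
        return []
    return [int(number) for number in _EXTERNAL_INDEX.findall(formula)]


# Hypothetical usage with an openpyxl worksheet `ws`:
# for row in ws.iter_rows():
#     for cell in row:
#         if external_link_indices(cell.value):
#             print(cell.coordinate, cell.value)
```

If the assumption holds, index n would correspond to `wb._external_links[n - 1]`.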
I decided to veer outside of openpyxl in order to achieve my goal - even though openpyxl has numerous functions that refer to external links I was unable to find a simple way to achieve my goal.
Instead I decided to use ZipFile to open the workbook in memory, then search for the externalLink1.xml file. If it exists, then the workbook contains external links:
import tkinter as tk
from tkinter import filedialog
from zipfile import ZipFile
import xml.etree.ElementTree
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
with ZipFile(file_path) as myzip:
    try:
        my_file = myzip.open('xl/externalLinks/externalLink1.xml')
        e = xml.etree.ElementTree.parse(my_file).getroot()
        print('Has external references')
    except KeyError:
        # ZipFile.open raises KeyError when the member is missing
        print('No external references')
Once I have the XML file, I can proceed to identify the cell address, value and other information by running through the XML tree using ElementTree.
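A workbook with several linked files may contain more than one such part (externalLink1.xml, externalLink2.xml, and so on). As a small sketch of the same idea without the try/except, the hypothetical helper `external_link_parts` lists every matching member of the archive:

```python
import re
import zipfile

_LINK_PART = re.compile(r"^xl/externalLinks/externalLink\d+\.xml$")


def external_link_parts(xlsx):
    """Return the names of all external-link parts in an .xlsx archive.

    `xlsx` can be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx) as archive:
        return sorted(name for name in archive.namelist() if _LINK_PART.match(name))


# Hypothetical usage with the path chosen in the dialog above:
# parts = external_link_parts(file_path)
# print('Has external references' if parts else 'No external references')
```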
I have been able to get the values of the column output as a separated list. However, I need to retain these values and use them one by one to perform an Amazon lookup. The Amazon lookup is not the problem; getting xlrd to give one value at a time has been. Is there also an efficient method of setting a timer in Python? The only answer I have found to the timer issue is recording the time the process started and counting from there. I would prefer just a timer. This question is somewhat in two parts; here is what I have done so far.
I load the spreadsheet with xlrd using argv[1], and I copy it to a new spreadsheet name using argv[2]. I need argv[3] to be the timer value, but I am not that far yet.
I have tried:
import sys
import datetime
import os
import xlrd
from xlrd.book import colname
from xlrd.book import row
import xlwt
import xlutils
import shutil
import bottlenose
AMAZON_ACCESS_KEY_ID = "######"
AMAZON_SECRET_KEY = "####"
print "Executing ISBN Amazon Lookup Script -- Please be sure to execute it python amazon.py input.xls output.xls 60(seconds between database queries)"
print "Copying original XLS spreadsheet to new spreadsheet file specified as the second arguement on the command line."
print "Loading Amazon Account information . . "
amazon = bottlenose.Amazon(AMAZON_ACCESS_KEY_ID, AMAZON_SECRET_KEY)
response = amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
shutil.copy2(sys.argv[1], sys.argv[2])
print "Opening copied spreadsheet and beginning ISBN extraction. . ."
wb = xlrd.open_workbook(sys.argv[2])
print "Beginning Amazon lookup for the first ISBN number."
for row in colname(colx=2):
    print amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
I know this is a little vague. Should I perhaps try doing something like column = colname(colx=2), so that I could then do for row in column:? Any help or direction is greatly appreciated.
The use of colname() in your code is simply going to return the name of the column (e.g. 'C' by default in your case unless you've overridden the name). Also, the use of colname is outside the context of the contents of your workbook. I would think you would want to work with a specific sheet from the workbook you are loading, and from within that sheet you would want to reference the values of a column (2 in the case of your example), does this sound somewhat correct?
wb = xlrd.open_workbook(sys.argv[2])
sheet = wb.sheet_by_index(0)
for row in sheet.col(2):
    print amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
Although I think looking at the call to amazon.ItemLookup() you probably want to refer to row and not to "row" as the latter is simply a string and the former is the actual contents of the variable named row from your for loop.
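Putting both parts of the question together, a rough sketch (written for Python 3, unlike the Python 2 code above) could pull the values out of the cells one at a time and pause between lookups with time.sleep. `column_values` and `lookup_all` are hypothetical helpers; the float handling is there because xlrd returns numeric cells as floats, and the `lookup` callable stands in for the amazon.ItemLookup call.

```python
import time


def column_values(cells):
    """Yield each non-empty cell value as a string, one at a time.

    `cells` is any iterable of objects with a `.value` attribute,
    such as the list returned by xlrd's sheet.col(2).
    """
    for cell in cells:
        value = cell.value
        if value in ("", None):
            continue
        # Numeric cells arrive as floats, e.g. 9780131103627.0
        if isinstance(value, float) and value.is_integer():
            value = int(value)
        yield str(value)


def lookup_all(values, lookup, delay_seconds=0.0, sleep=time.sleep):
    """Call lookup(value) for each value, waiting between consecutive calls."""
    results = []
    for position, value in enumerate(values):
        if position:
            sleep(delay_seconds)
        results.append((value, lookup(value)))
    return results


# Hypothetical usage, with the delay taken from the third command-line argument:
# isbns = column_values(sheet.col(2))
# lookup_all(isbns, lambda isbn: amazon.ItemLookup(ItemId=isbn, IdType="ISBN",
#            SearchIndex="Books"), delay_seconds=float(sys.argv[3]))
```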