Connecting Excel with Python

Using the code below, I can get the data to print.
How would I switch this code to xlrd?
How would I modify this code to use an .xls file that is already open and visible?
That is, the file is opened manually first, and then the script runs.
The data gets updated,
and is then pushed into MySQL.
import os
from win32com.client import constants, Dispatch
import numpy as np
#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = "C:\\xtest\\example.xls"
ROW_SPAN = (1, 16)
COL_SPAN = (1, 6)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
xldata = [[ws.Cells(row, col).Value
           for col in xrange(COL_SPAN[0], COL_SPAN[1])]
          for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]
#print xldata
a = np.asarray(list(xldata), dtype='object')
print a

If you mean that you want to modify the current file, I'm 99% sure that is not possible and 100% sure that it is a bad idea. In order to alter a file, you need to have write permissions. Excel creates a file lock to prevent asynchronous and simultaneous editing. If a file is open in Excel, then the only thing which should be modifying that file is... Excel.
If you mean that you want to read the file currently in the editor, then that is possible -- you can often get read access to a file in use, but it is similarly unwise -- if the user hasn't saved, then the user will see one set of data, and you'll have another set of data on disk.
While I'm not a fan of VB, that is a far better bet for this application -- use a macro to insert the data into MySQL directly from Excel. Personally, I would create a user with insert privileges only, and then I would try this tutorial.

If you want to manipulate an already open file, why not use COM?
http://snippets.dzone.com/posts/show/2036
http://oreilly.com/catalog/pythonwin32/chapter/ch12.html

Related

How to create an effective logging system using excel, pandas, and numpy

Background:
I am creating a program that will need to keep track of what it has run and when, and what it has sent and to whom. The logging module in Python doesn't appear to accomplish what I need, but I'm still pretty new to this so I may be wrong. Alternative solutions to accomplish the same end are also welcome.
The program will need to take in a data file (preferably .xlsx or .csv) which will be formatted as something like this (the Nones will need to be filled in by the program):
Run_ID  Date_Requested  Time_Requested  Requestor        Date_Completed  Time_Completed
R_423h  9/8/2022        1806            email#email.com  None            None
The program will then need to compare the Run_IDs from the log to the new Run_IDs, which are provided in a .csv in a format similar to the table above, i.e.:
ResponseId
R_jals893
R_hejl8234
I can compare the IDs myself, but the issue is that it then needs to update the log with the new IDs it has run, along with the times they were run, the emails, and so on, and then resave the log file. I'm sure this is easy but it's throwing me for a loop.
My code:
import pandas as pd
log = pd.read_excel('run_log.xlsx', usecols=None, parse_dates=True)
new_run_requests=pd.read_csv('Run+Request+Sheet_September+6,+2022_14.28.csv',parse_dates=True)
old_runs = log.Run_ID[:]
new_runs = new_run_requests.ResponseId[:]
log['Run_ID'] = pd.concat([old_runs, new_runs], ignore_index=True)
After this the dataframe does not change.
This is one of the things I have tried out of 2 or 3. Suggestions are appreciated!
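One likely reason the dataframe does not change: assigning a longer Series to an existing column aligns on the index, so entries beyond the log's current rows are dropped. A minimal sketch of an alternative, assuming the goal is one new log row per unseen ID, is to concatenate whole rows instead. The helper name append_new_runs and the in-memory frames standing in for run_log.xlsx and the request CSV are hypothetical.

```python
import pandas as pd

LOG_COLUMNS = ["Run_ID", "Date_Requested", "Time_Requested",
               "Requestor", "Date_Completed", "Time_Completed"]


def append_new_runs(log, requests, id_col="ResponseId"):
    """Return a new frame: `log` plus one row per ID in `requests` not yet logged."""
    unseen = ~requests[id_col].isin(log["Run_ID"])
    new_rows = pd.DataFrame({"Run_ID": requests.loc[unseen, id_col].to_numpy()})
    # concat adds rows; columns missing from new_rows are left empty (NaN).
    return pd.concat([log, new_rows], ignore_index=True)


# Illustrative stand-ins for the files in the question (hypothetical data).
log = pd.DataFrame(
    [["R_423h", "9/8/2022", 1806, "email#email.com", None, None]],
    columns=LOG_COLUMNS,
)
requests = pd.DataFrame({"ResponseId": ["R_jals893", "R_hejl8234", "R_423h"]})

updated = append_new_runs(log, requests)
print(updated["Run_ID"].tolist())
# To persist the result: updated.to_excel("run_log.xlsx", index=False)
```

The remaining columns of the new rows can then be filled in as each run completes, before the log is saved again.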

Export data from MSSQL to Excel 'template' saving with a new name using Python

I am racking my brain here and have read a lot of tutorials, sites, sample code, etc. Something is not clicking for me.
Here is my desired end state.
Select data from MSSQL - Sorted, not a problem
Open an Excel template (xlsx file) - Sorted, not a problem
Export data to this Excel template and saving it with a different name - PROBLEM.
What I have achieved so far: (this works)
I can extract data from DB.
I can write that data to Excel using pandas, my line of code for doing that is: pd.read_sql(script,cnxn).to_excel(filename,sheet_name="Sheet1",startrow=19,encoding="utf-8")
filename variable is a new file that I create every time the for loop runs.
What my challenge is:
The data needs to be exported to a predefined template (the template has formatting that must be present in every file)
I can open the file and I can write to the file, but I do not know how to save that file with a different name through every iteration of the for loop
In my for loop I use this code:
#this does not work
pd.read_sql(script,cnxn)
writer = pd.ExcelWriter(SourcePath) #opens the source document
df.to_excel(writer)
writer.save() # how do I saveas() a different file name?
Your help would be highly appreciated.

Your method works. The problem is that you don't need to write the data to the Excel file right after you read it from the database. My suggestion is to first read the data into separate data frames.
df1 = pd.read_sql(script, cnxn)
df2 = pd.read_sql(script, cnxn)
df3 = pd.read_sql(script, cnxn)
You can then write all the data frames together to an Excel file. You can refer to this link.
I hope this solution can help you. Have a nice weekend
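For the "save under a different name" part of the question, one possible approach is to copy the template to a new file on each iteration and write into the copy, so the template itself is never modified. The sketch below covers only the copy step with the standard library. The helper name copy_template and the naming scheme are assumptions, not something from the question.

```python
import shutil
from pathlib import Path


def copy_template(template, out_dir, label):
    """Copy `template` into `out_dir` as '<stem>_<label><suffix>'; return the new path."""
    template = Path(template)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{template.stem}_{label}{template.suffix}"
    shutil.copyfile(template, target)  # the template file is left untouched
    return target


# Hypothetical use inside the loop from the question:
#
#   target = copy_template("template.xlsx", "output", label)
#   df = pd.read_sql(script, cnxn)
#   # Assumption: a recent pandas with openpyxl installed, where append mode
#   # can write into an existing sheet of the copied file.
#   with pd.ExcelWriter(target, engine="openpyxl", mode="a",
#                       if_sheet_exists="overlay") as writer:
#       df.to_excel(writer, sheet_name="Sheet1", startrow=19)
```

Whether the template's formatting survives the write depends on the library used to write into the copy, so it is worth checking one output file by hand.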

How to put a value to an excel cell?

You'll probably laugh at me, but I am sitting on this for two weeks. I'm using python with pandas.
All I want to do, is to put a calculated value in a pre-existing excel file to a specific cell without changing the rest of the file. That's it.
Openpyxl makes my file unusable (meaning I cannot open it because it's "corrupted" or something) or it plainly deletes the whole content of the file. Xlsxwriter cannot read or modify pre-existing files. So it has to be pandas.
And for some reason I can't use worksheet = writer.sheets['Sheet1'], because that leads to an "unhandled exception".
Guys. Help.

I tried a bunch of packages but (for a lot of reasons) I ended up using xlwings. You can do pretty much anything with it in python that you can do in Excel.
Documentation link
So with xlwings you'd have:
import xlwings as xw
# open app_excel
app_excel = xw.App(visible = False)
# open excel template
wbk = xw.Book( r'stuff.xlsx' )
# write to a cell
wbk.sheets['Sheet1'].range('B5').value = 15
# save in the same place with the same name or not
wbk.save()
wbk.save( r'things.xlsx' )
# kill the app_excel
app_excel.kill()
del app_excel
Let me know how it goes.

How to write to an open Excel file using Python?

I am using openpyxl to write to a workbook. But that workbook needs to be closed in order to edit it. Is there a way to write to an open Excel sheet? I want to have a button that runs Python code from the command line and fills in the cells.
The current process that I have built is using VBA to close the file and then Python writes it and opens it again. But that is inefficient. That is why I need a way to write to open files.

If you're a Windows user, there is a very easy way to do this. With the Win32 library (pywin32) we can leverage Excel's built-in object model, the same one VBA uses.
Now, I am not sure exactly how your data looks or where you want it in the workbook, but I'll just assume you want it on the sheet that appears when you open the workbook.
For example, let's imagine I have a pandas DataFrame that I want to write to an open Excel workbook. It would look like the following:
import win32com.client
import pandas as pd
# Attach to the already running Excel instance & make sure it is visible.
ExcelApp = win32com.client.GetActiveObject("Excel.Application")
ExcelApp.Visible = True
# Open the desired workbook
workbook = ExcelApp.Workbooks.Open(r"<FILE_PATH>")
# Take the data frame object and convert it to a recordset array
rec_array = data_frame.to_records()
# Convert the Recordset Array to a list. This is because Excel doesn't recognize
# Numpy datatypes.
rec_array = rec_array.tolist()
# It will look something like this now.
# [(1, 'Apple', Decimal('2'), 4.0), (2, 'Orange', Decimal('3'), 5.0), (3, 'Peach',
# Decimal('5'), 5.0), (4, 'Pear', Decimal('6'), 5.0)]
# set the value property equal to the record array.
ExcelApp.Range("F2:I5").Value = rec_array
Again, there are a lot of things we have to keep in mind, such as where we want the data pasted, how it is formatted, and a whole host of other issues. However, at the end of the day, it is possible to write to an open Excel file using Python if you're a Windows user.

Generally, two different processes shouldn't be writing to the same file, because it will cause synchronization issues.
A better way would be to close the existing file in the parent process (i.e., the VBA code) and pass the location of the workbook to the Python script.
The Python script will open it, write the contents into the cells, and exit.
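As a rough safeguard for this close-then-write workflow, the Python script could refuse to write while the workbook still looks open. The sketch below relies on an assumption: that Excel places a hidden owner file named "~$" plus the workbook's filename next to an open .xlsx or .xlsm file. This is only a heuristic (a crash can leave a stale owner file behind), and the function names are hypothetical.

```python
from pathlib import Path


def owner_file(workbook):
    """Path of the '~$<name>' owner file Excel is assumed to create for an open workbook."""
    workbook = Path(workbook)
    return workbook.with_name("~$" + workbook.name)


def looks_open_in_excel(workbook):
    """Heuristic only: True if the assumed owner file exists next to the workbook."""
    return owner_file(workbook).exists()


def write_when_closed(workbook, write):
    """Call write(workbook) only if the workbook does not look open in Excel."""
    if looks_open_in_excel(workbook):
        raise RuntimeError(f"{workbook} appears to be open in Excel; close it first")
    write(workbook)
```

Here write would be whatever function does the actual editing, for example one that loads the workbook with openpyxl, fills in the cells, and saves it.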

No, this is not possible, because Excel files do not support concurrent access.

I solved this as follows: create an intermediary Excel file to receive data from Python, then create a connection between this file and the main file. Excel has a tool that automatically refreshes data imported from another workbook. See this link.
wb = openpyxl.load_workbook(filename='meanwhile.xlsm', read_only=False, keep_vba=True)
...
wb.save('meanwhile.xlsm')
Then open your main Excel file:
On the Data tab, create a connection to the "meanwhile" workbook. Then, in the Connections group, click the arrow next to Refresh, and then click Connection Properties.
Click the Usage tab.
Select the Refresh every check box, and then enter the number of minutes between each refresh operation.

Using the code below, I was able to write to an Excel file from Python while it was open in MS Excel.
This solution is for Windows; I am not sure about other operating systems.
import datetime
import time
import pandas as pd
import xlwings as xw
from kiteconnect import KiteConnect
# `kite` is assumed to be an authenticated KiteConnect instance created elsewhere.
wb = xw.Book('winwin_safe_trader_youtube_watchlist.xlsx')
sht = wb.sheets['Sheet1']
stocks_list = sht.range('A2').expand("down").value
watchlist = []
time.sleep(10)
for name in stocks_list:
    symbol = "NSE:" + name
    watchlist.append(symbol)
print(datetime.datetime.today().time())
data = kite.quote(watchlist)
df = pd.DataFrame(data).transpose()
df = df.drop(['depth', 'ohlc'], axis=1)
print(df)
sht.range('B1').value = df
time.sleep(1)
wb.save('winwin_safe_trader_youtube.xlsx')

Calculating Excel sheets without opening them (openpyxl or xlwt)

I made a script that opens a .xls file, writes a few new values in it, then saves the file.
Later, the script opens it again, and wants to find the answers in some cells which contain formulas.
If I call that cell with openpyxl, I get the formula (ie: "=A1*B1").
And if I activate data_only, I get nothing.
Is there a way to let Python calculate the .xls file? (or should I try PyXll?)

I realize this question is old, but I ran into the same problem and extensive searching didn't produce an answer.
The solution is in fact quite simple so I will post it here for posterity.
Let's assume you have an xlsx file that you have modified with openpyxl. As Charlie Clark mentioned openpyxl will not calculate the formulas, but if you were to open the file in excel the formulas would be automatically calculated. So all you need to do is open the file and then save it using excel.
To do this you can use the win32com module.
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
workbook = excel.Workbooks.Open(r'absolute/path/to/your/file')
# this must be the absolute path (r'C:/abc/def/ghi')
workbook.Save()
workbook.Close()
excel.Quit()
That's it. I've seen all these suggestions to use Pycel or Koala, but that seems like a bit of overkill if all you need to do is tell excel to open and save.
Granted this solution is only for windows.

There is actually a project that takes Excel formulas and evaluates them using Python: Pycel. Pycel uses Excel itself (via COM) to extract the formulas, so in your case you would skip that part. The project probably has something useful that you can use, but I can't vouch for its maturity or completeness. It was not really developed for the general public.
There is also a newer project called Koala which builds on both Pycel and OpenPyXL.
Another approach, if you can't use Excel but you can calculate the results of the formulas yourself (in your Python code), is to write both the value and the formula into a cell (so that when you read the file, you can just pull the value, and not worry about the formula at all). As of this writing, I haven't found a way to do it in OpenPyXL, but XlsxWriter can do it. From the documentation:
XlsxWriter doesn’t calculate the value of a formula and instead stores the value 0 as the formula result. It then sets a global flag in the XLSX file to say that all formulas and functions should be recalculated when the file is opened. This is the method recommended in the Excel documentation and in general it works fine with spreadsheet applications. However, applications that don’t have a facility to calculate formulas, such as Excel Viewer, or some mobile applications will only display the 0 results.
If required, it is also possible to specify the calculated result of the formula using the options value parameter. This is occasionally necessary when working with non-Excel applications that don’t calculate the value of the formula. The calculated value is added at the end of the argument list:
worksheet.write_formula('A1', '=2+2', num_format, 4)
With this approach, when it's time to read the value, you would use OpenPyXL's data_only option. (For other people reading this answer: If you use xlrd, then only the value is available anyway.)
Finally, if you do have Excel, then perhaps the most straightforward and reliable thing you can do is automate the opening and resaving of your file in Excel (so that it will calculate and write the values of the formulas for you). xlwings is an easy way to do this from either Windows or Mac.

The formulas module works for me. For details, please refer to https://pypi.org/project/formulas/
from os import path
from openpyxl import load_workbook
import formulas
# The variable spreadsheet holds the full path, including the filename, of the Excel spreadsheet with unevaluated formulae
fpath = path.basename(spreadsheet)
dirname = path.dirname(spreadsheet)
xl_model = formulas.ExcelModel().loads(fpath).finish()
xl_model.calculate()
xl_model.write(dirpath=dirname)
#Use openpyxl to open the updated excel spreadsheet now
wb = load_workbook(filename=spreadsheet,data_only=True)
ws = wb.active

I ran into the same problem, and after some time researching I ended up using pyoo ( https://pypi.org/project/pyoo/ ), which targets OpenOffice/LibreOffice, so it is available on all platforms. It is also more straightforward, since it communicates natively and doesn't require saving or closing the file. I tried several other libraries but found the following problems:
xlwings: Only works if you have Excel installed on Windows or macOS, so I couldn't evaluate it.
koala: Seems to be broken after the networkx 2.4 update.
openpyxl: As pointed out by others, it can't calculate formulas, so I looked into combining it with pycel to get values. In the end I didn't try it because I found pyoo. openpyxl + pycel might not work as of now, since pycel also relies on the networkx library.

No, and in openpyxl there never will be. I think there is a Python library that purports to implement an engine for such formulae, which you can use.

xlcalculator can do this job. https://github.com/bradbase/xlcalculator
from xlcalculator import ModelCompiler
from xlcalculator import Model
from xlcalculator import Evaluator
filename = r'use_case_01.xlsm'
compiler = ModelCompiler()
new_model = compiler.read_and_parse_archive(filename)
evaluator = Evaluator(new_model)
# First!A2
# value is 0.1
#
# Fourth!A2
# formula is =SUM(First!A2+1)
val1 = evaluator.evaluate('Fourth!A2')
print("value 'evaluated' for Fourth!A2:", val1)
evaluator.set_cell_value('First!A2', 88)
# now First!A2 value is 88
val2 = evaluator.evaluate('Fourth!A2')
print("New value for Fourth!A2 is", val2)
This results in the following output:
file_name use_case_01.xlsm ignore_sheets []
value 'evaluated' for Fourth!A2: 1.1
New value for Fourth!A2 is 89
