I am new to Python and could use some help with a project. I am trying to modify an existing Excel workbook in order to compare stock data. Luckily, there was a program online that retrieves all the data I need, and I have successfully been able to pull the data and write it into a new Excel file. However, the goal is to pull the data and put it into an existing Excel file, overwriting the cell values already there. I believe xlwings is able to do this and I think my code is on the right track, but I ran into an unexpected error. The error I get is:
com_error: (-2147352570, 'Unknown name.', None, None)
I was wondering if anyone knew why this error came up? Also, does anyone know how to fix it? Is it fixable? Is my code wrong? Any help or guidance is appreciated. Thank you.
import good_morning as gm
import pandas as pd
import xlwings as xw
#import income statement, balance sheet, and cash flow of AAPL
fd = gm.FinancialsDownloader()
fd_frames = fd.download('AAPL')
#Creates a DataFrame for only the balance sheet
df1 = pd.DataFrame(list(fd_frames.values())[0])
#connects to workbook I want to modify (this is where I get the com_error)
wb = xw.Book(r'C:/Users/vince/Project/Spreadsheet.xlsm')
#sheet I would like to modify
sht = wb.sheets[1]
#modifies & overwrites values in my spreadsheet
sht.range('M6').value = df1
I am currently using the xlwings package to manipulate an Excel file. So far this package is fantastic, but I cannot find much specific documentation on how to modify a pivot table. My main question is: how can I change the data source for a specific pivot table?
I think this is a general question so I won't provide any kind of code or excel file.
Thank you for the help.
Ok, a friend of mine found this website with an answer https://blog.csdn.net/weixin_39906906/article/details/111374735
but since it's in Chinese, I will post the necessary code below.
I don't understand everything about this answer. The win32c object is unknown to me and therefore, I am not comfortable in explaining everything.
import xlwings as xw
import win32com.client as win32
win32c = win32.constants
# open your Excel workbook
wb = xw.Book('excel.xlsx')
# select sheet containing the pivot table
sheet_with_pivot_table = wb.sheets['sheet_pivot_table']
# Write the data range as written in the excel app
data_range = 'sheet_with_data!$A$1:$D$4'
# get the pivot table
pivot_table = sheet_with_pivot_table.api.PivotTables('pivot_table_name')
# This applies the new data
pivot_table.ChangePivotCache(wb.api.PivotCaches().Create(SourceType=win32c.xlDatabase, SourceData=data_range, Version=win32c.xlPivotTableVersion12))
Hopefully, this helps someone else.
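The data_range string in the code above is typed by hand. As a minimal sketch, assuming the source data starts in cell A1 and you already know how many rows and columns it has, a small helper could build the same kind of absolute A1-style reference. The names column_letter and build_source_range are hypothetical and are not part of xlwings or pywin32, and the quoting rule is a simplified version of what Excel does.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def build_source_range(sheet_name, n_rows, n_cols):
    """Build an absolute reference such as sheet_with_data!$A$1:$D$4.

    Assumption: the data block starts at A1. Sheet names containing anything
    other than letters, digits and underscores are wrapped in single quotes
    (a simplified version of Excel's quoting rule).
    """
    if not sheet_name.replace("_", "").isalnum():
        sheet_name = "'" + sheet_name.replace("'", "''") + "'"
    return f"{sheet_name}!$A$1:${column_letter(n_cols)}${n_rows}"


# Same range as the hand-written string in the answer
data_range = build_source_range("sheet_with_data", n_rows=4, n_cols=4)
print(data_range)
```

The resulting string can be passed as SourceData in the same way as the hand-written one.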
This example refreshes the cache of a pivot table; replace 'pivotname' with the name of your own pivot table.
import xlwings as xw
app_excel = xw.App(visible = False)
wbook = xw.Book( 'Excelfile.xlsx' )
wbook.sheets['datatab'].select()
wbook.api.ActiveSheet.PivotTables('pivotname').PivotCache().refresh()
You'll probably laugh at me, but I am sitting on this for two weeks. I'm using python with pandas.
All I want to do, is to put a calculated value in a pre-existing excel file to a specific cell without changing the rest of the file. That's it.
Openpyxl makes my file unusable (meaning I cannot open it because it's "corrupted" or something) or it plainly deletes the whole content of the file. Xlsxwriter cannot read or modify pre-existing files. So it has to be pandas.
And for some reason I can't use worksheet = writer.sheets['Sheet1'], because that leads to an "unhandled exception".
Guys. Help.
I tried a bunch of packages but (for a lot of reasons) I ended up using xlwings. You can do pretty much anything with it in python that you can do in Excel.
Documentation link
So with xlwings you'd have:
import xlwings as xw
# open app_excel
app_excel = xw.App(visible = False)
# open excel template
wbk = xw.Book( r'stuff.xlsx' )
# write to a cell
wbk.sheets['Sheet1'].range('B5').value = 15
# save in the same place with the same name or not
wbk.save()
wbk.save( r'things.xlsx' )
# kill the app_excel
app_excel.kill()
del app_excel
Let me know how it goes.
I have been working on this for too long now. I have an Excel file with one sheet (sheetname = 'abc') with images in it, and I want a Python script that writes a dataframe to a second, separate sheet (sheetname = 'def') in the same Excel file. Can anybody provide me with some example code? Every time I try to write the dataframe, the first sheet with the images gets emptied.
This is what I tried:
import numpy as np
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('filename_of_file_with_pictures_in_it.xlsx')
writer = pd.ExcelWriter('filename_of_file_with_pictures_in_it.xlsx', engine = 'openpyxl')
writer.book = book
x1 = np.random.randn(100, 2)
df = pd.DataFrame(x1)
df.to_excel(writer, sheet_name = 'def')
writer.save()
book.close()
It saves the random numbers in the sheet with the name 'def', but the first sheet 'abc' now becomes empty.
What goes wrong here? Hopefully somebody can help me with this.
Interesting question! With openpyxl you can easily add values and keep the formulas, but you cannot retain the graphs. Even with the latest version (2.5.4), graphs do not stay. So I decided to address the issue with xlwings:
import numpy as np
import xlwings as xw
wb = xw.Book(r"filename_of_file_with_pictures_in_it.xlsx")
sht=wb.sheets.add('SheetMod')
sht.range('A1').value = np.random.randn(100, 2)
wb.save(r"path_new_file.xlsx")
With this snippet I managed to insert the random set of values and save a new copy of the modified xlsx. When you run it, the Excel file opens automatically and shows the new sheet, without changing the existing ones (graphs and formulas included). Make sure you install all the dependencies xlwings needs to run on your system. Hope this helps!
You'll need to use an Excel 'reader' like openpyxl or similar in combination with pandas for this. pandas' to_excel function is write-only, so it does not care what is already inside the file when you open it.
The following piece of code is getting the data from Excel in the 5th row and the 14th row:
import pandas as pd
import pymssql
df=[]
fp = "G:\\Data\\Hotels\\ABZPD - Daily Strategy Tool.xlsm"
data = pd.read_excel(fp,sheet_name ="CRM View" )
row_date = data.loc[2, :]
row_sita = "ABZPD"
row_event = data.iloc[11, :]
df = pd.DataFrame({'date': row_date,
                   'sita': row_sita,
                   'event': row_event
                   })
print(df)
However, it is not actually using the worksheet I need it to. Instead of using "CRM View" (like I told it to!) it is using the worksheet "Previous CRM View". I assume this is because both worksheets have similar names.
So the question is, how do I get it to use the one that is called "CRM View"?
I was able to reproduce your problem. It did not seem to be caused by the similar sheet names; it simply read the first sheet in the file no matter what sheet_name was set to.
Anyway, it seemed like a bug, so I checked which version of pandas I was running, which was 0.20.3. After updating to 0.22.0 the problem was gone and the right sheet was selected.
Edit: this was apparently a known bug in 0.20.3.
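A quick way to check whether an installation predates the release that worked here is to compare version numbers. The sketch below is only an illustration: the helper names are hypothetical, and the 0.22.0 threshold is simply the version reported as working above, not a confirmed fix release.

```python
import re


def version_tuple(version):
    """Turn a version string like '0.20.3' into a tuple of ints (0, 20, 3)."""
    parts = re.findall(r"\d+", version)[:3]
    # Pad short versions such as '0.22' so tuples compare as expected
    parts += ["0"] * (3 - len(parts))
    return tuple(int(p) for p in parts)


def older_than_working_version(version, working="0.22.0"):
    """True if `version` is older than the release reported as working."""
    return version_tuple(version) < version_tuple(working)


print(older_than_working_version("0.20.3"))
print(older_than_working_version("0.22.0"))
```

In a real script you would pass pd.__version__ as the argument.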
Is there a way to have pandas read in only the values from excel and not the formulas? It reads the formulas in as NaN unless I go in and manually save the excel file before running the code. I am just working with the basic read excel function of pandas,
import pandas as pd
df = pd.read_excel(filename, sheet_name="Sheet1")
This will read the values if I have gone in and saved the file prior to running the code. But after running the code to update a new sheet, if I don't go in and save the file after doing that and try to run this again, it will read the formulas as NaN instead of just the values. Is there a work around that anyone knows of that will just read values from excel with pandas?
That is strange. The normal behaviour of pandas is to read values, not formulas. The problem is likely in your Excel files: either your formulas point to other files, or they return a value that pandas sees as NaN.
In the first case, the sheet needs to be updated and there is nothing pandas can do about that (but read on).
In the second case, you could solve by setting explicit nan values in read_excel:
pd.read_excel(path, sheet_name="Sheet1", na_values = [your na identifiers])
As for the first case, and as a workaround solution to make your work easier, you can automate what you are doing by hand using xlwings:
import pandas as pd
import xlwings as xl
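def df_from_excel(path):
    # Open the workbook in a hidden Excel instance and save it, so the
    # computed formula values are stored in the file
    app = xl.App(visible=False)
    book = app.books.open(path)
    book.save()
    app.kill()
    return pd.read_excel(path)
df = df_from_excel("path to your file")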
If you want to keep those formulas in your excel file just save the file in a different location (book.save(different location)). Then you can get rid of the temporary files with shutil.
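The idea of working on a throwaway file and cleaning it up afterwards can be sketched with the standard library alone. Note that this swaps in a slightly different approach: instead of saving to a different location from Excel, it copies the workbook first, so the original is never touched. The helper name temporary_copy is hypothetical, and the Excel step itself is only indicated in a comment.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_copy(path):
    """Yield the path of a copy of `path` inside a temporary folder.

    The folder and the copy are deleted when the with-block ends, so the
    original file is never modified.
    """
    tmp_dir = Path(tempfile.mkdtemp())
    try:
        yield Path(shutil.copy2(path, tmp_dir))
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)


# Intended use (assumes a workbook exists at this hypothetical path):
# with temporary_copy("workbook.xlsx") as copy_path:
#     ...open copy_path in Excel, save it, then call pd.read_excel(copy_path)...
```

Using a context manager means the temporary folder is removed even if reading the file raises an error.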
I had this problem and resolved it by moving a graph below the first row I was reading. It looks like the position of graphs may cause problems.
You can use xlrd to read the values.
First you should refresh your Excel sheet, since you are also updating the values automatically with Python. You can use the function below:
import os
import xlrd
import win32com.client
file = "myxl.xls"
def refresh_file(file):
    # start a separate Excel instance through COM
    xlapp = win32com.client.DispatchEx("Excel.Application")
    path = os.path.abspath(file)
    wb = xlapp.Workbooks.Open(path)
    # refresh all data connections, then wait for async queries to finish
    wb.RefreshAll()
    xlapp.CalculateUntilAsyncQueriesDone()
    wb.Save()
    xlapp.Quit()
After the file refresh, you can start reading the content.
workbook = xlrd.open_workbook(file)
worksheet = workbook.sheet_by_index(0)
for rowid in range(worksheet.nrows):
    row = worksheet.row(rowid)
    for colid, cell in enumerate(row):
        print(cell.value)
You can loop through the data however you need and add conditions while you are reading it, which gives you a lot more flexibility.
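As a rough sketch of what reading with conditions could look like, the helper below walks over rows of plain values and keeps only those that pass a test. The function name matching_values and the sample rows are made up for illustration; with xlrd, each row could come from worksheet.row_values(rowid).

```python
def matching_values(rows, condition):
    """Yield (row_index, col_index, value) for every value that passes `condition`."""
    for row_index, row in enumerate(rows):
        for col_index, value in enumerate(row):
            if condition(value):
                yield row_index, col_index, value


# Stand-in data; made-up values for illustration only
rows = [
    ["name", "price"],
    ["apple", 1.5],
    ["", 0.0],
]

# Keep only the non-empty text cells
for hit in matching_values(rows, lambda v: isinstance(v, str) and v != ""):
    print(hit)
```

Because the condition is passed in as a function, the same loop can be reused for numeric checks, date checks, and so on.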