xlsx to csv without formulas in python, openpyxl or win32com - python

Is there a way to save an xlsx file as CSV with the formulas removed, so that only their values remain?
Edit: Column B ("Price") of my Excel file is updated by a web-service add-in every 10 seconds (stock prices). When I load the file with openpyxl using the option data_only=True, I do not get the most recent price. Instead I get an older value (the value stored the last time Excel read the sheet).
Original file
A B
StockId Price
13i 16.1353
14i 15.4252 --> formula = RTD(A3,"Last", "HSC","xxx")
New file created using openpyxl (data_only=True): the formula is removed, but the price is not the most recent
A B
StockId Price
13i 15.1353
14i 15.3252
If I use win32com instead of openpyxl to read the Excel file, the output file still keeps the formulas. Is there any way to remove them?
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
wb = xl.Workbooks.Open(r"C:\Code\test.xlsx")
ws = xl.ActiveSheet
wb.SaveAs(r"C:\Code\test.csv")
wb.Close()
xl.Quit()

data_only=True applies only to reading files with openpyxl: the option is meaningless for writing files.
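A minimal sketch of the read-then-write route this implies, assuming openpyxl is installed: load the workbook with data_only=True and write the cell values out with the standard csv module. The function names write_rows_to_csv and xlsx_to_csv are hypothetical, chosen for illustration. The values are still the ones cached in the file, so a live RTD price will only be as fresh as the last time Excel stored it.

```python
import csv


def write_rows_to_csv(rows, csv_path):
    """Write an iterable of row tuples to csv_path (hypothetical helper)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        # The csv module writes None as an empty field.
        csv.writer(fh).writerows(rows)


def xlsx_to_csv(xlsx_path, csv_path, sheet_name=None):
    """Dump one sheet's cached values (not its formulas) to a CSV file."""
    # Imported lazily so the CSV helper above works without openpyxl.
    from openpyxl import load_workbook

    # data_only=True returns the value cached in the file for each formula
    # cell; openpyxl does not recalculate anything itself.
    wb = load_workbook(xlsx_path, data_only=True, read_only=True)
    ws = wb[sheet_name] if sheet_name else wb.active
    write_rows_to_csv(ws.iter_rows(values_only=True), csv_path)
    wb.close()
```

Calling xlsx_to_csv(r"C:\Code\test.xlsx", r"C:\Code\test.csv") would then produce a formula-free CSV of the active sheet.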

Related

openpyxl is exporting excel file with error

I have an excel file where the first four rows contain some header text and the actual dataset starts from row 4. I am trying to build a simple function that reads the excel file and outputs the same excel file after deleting the first 4 rows.
This is what my code looks like before I put it into a function.
import pandas as pd
from openpyxl import load_workbook, Workbook
wb = load_workbook('FILEPATH/excel.xlsx')
ws = wb['Sheet1']
ws = ws.delete_rows(0,4)
wb.save(r"FILEPATH/deleted_row.xlsx")
When I run the code, it executes without error, but when I try to open the Excel file, it gives me errors and says that the file is corrupted. A point to note is that the Excel file has some formatting on the top rows. Is that what is causing the issue?
Any help is appreciated.
EDIT: This is what the errors look like and the file does not open.
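In openpyxl, the first row is 1, not 0. So, if you are looking to delete the first 4 rows, you should change the delete_rows() call from
ws = ws.delete_rows(0,4)
to
ws.delete_rows(1,4)
Note that delete_rows() modifies the sheet in place and returns None, so its result should not be assigned back to ws.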
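Put into the function the question asks for, this could look like the sketch below. It assumes openpyxl is installed; the function name drop_leading_rows and its default arguments are hypothetical.

```python
def drop_leading_rows(src_path, dst_path, sheet_name="Sheet1", count=4):
    """Remove the first `count` rows of a sheet and save the result as a copy."""
    # Imported lazily; openpyxl is a third-party package.
    from openpyxl import load_workbook

    wb = load_workbook(src_path)
    ws = wb[sheet_name]
    # Row indexes are 1-based. delete_rows(idx, amount) changes the sheet
    # in place and returns None, so ws is not reassigned.
    ws.delete_rows(1, count)
    wb.save(dst_path)
```

For example, drop_leading_rows('FILEPATH/excel.xlsx', 'FILEPATH/deleted_row.xlsx') mirrors the paths used in the question.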

Getting cell values from Macros excel call to another sheet with any python library

The issue I am having is I have a .xlsm workbook with two worksheets.
One of the worksheets uses a VLOOKUP macro function in a cell that looks up a value in the second sheet of the workbook. I just need the date value that the VLOOKUP returns.
What I have tried:
- I used openpyxl to open the existing workbook with data_only=True and keep_vba=True, and the value keeps coming back as '#N/A'.
- I tried using win32com to open the workbook, refresh the workbook, and grab the cell value, but I get this giant negative int: -217589383.
I am not sure if this is possible in openpyxl or if I am not using the library correctly.
The macro in the cell looks like this '=VLOOKUP(A34,SSU!$1:$65536,2,FALSE)', the second sheet is called SSU.
I don't care which Python library I use in order to get the value that this macro calls, so long as I can get it. When I have a file that is .xls I am able to get that value easily using xlrd, but unfortunately, xlrd doesn't work with files that aren't .xls.
Below is my code sample.
elif check_file_type(site_list_name, ['.xlsx', '.xlsm', '.xltx', '.xltm']) and get_file_name(
        site_list_name) != prefix_suffix_file_name:
    workbook = load_workbook(site_list_name_path, keep_vba=True, data_only=True)
    worksheet = workbook.active
    print(worksheet['B4'].value)
The printed value is #N/A.
Any suggestions would be greatly appreciated!!

how to handle excel file (xlsx,xls) with excel formulas(macros) in python

I need to pass inputs from Input_data.xls, one iteration at a time, into an existing xls file that has special functions in various cells, using Python 3.6. These functions change the primary data in the existing xls according to the inputs. But when xlrd opens the file, it does not import the cell functions, and the file is saved with the modification but without them. It also writes the object name instead of its value.
Python code:
import xlrd
import xlwt
import xlutils
from xlrd import open_workbook
from xlutils.copy import copy
import os.path
book = xlrd.open_workbook('input_data.xlsx')
sheet0 = book.sheet_by_index(0)
for i in range(sheet0.nrows):
    st1 = sheet0.row_values(i+1)
    TIP = [st1[0]]
    OOIPAN_IP = [st1[1]]
    NM = [st1[2]]
book1 = xlrd.open_workbook('primary_data.xls')
wb = copy(book1)
w_sheet = wb.get_sheet(0)
w_sheet.write(1,0,'TIP')
w_sheet.write(1,1,'OIP')
w_sheet.write(1,2,'NM')
wb.save('ipsectemp.xls')
It writes the object names into the cells instead of the objects' values:
input 1 input 2 input 3
st1[0] st1[1] st1[2]
Which module can open, read, and write a workbook together with its Excel functions (macros) in Python?
Luckily, I found the code below, which can fetch the Excel macros; the openpyxl module does a good job of working with cell values.
from openpyxl import load_workbook
book = load_workbook('primary_data.xlsx')  # open ipsec file with desired inputs
sheet0 = book['Sheet1']  # replaces the deprecated get_sheet_by_name('Sheet1')
for row in range(2, sheet0.max_row + 1):
    for column in "A":  # add or remove column letters here
        cell_name = "{}{}".format(column, row)
        textlt = sheet0[cell_name].value
        print(textlt)
The information is taken from an answer to "openpyxl - read only one column from excel file in python?", used in a different way.

pandas read excel values not formulas

Is there a way to have pandas read in only the values from excel and not the formulas? It reads the formulas in as NaN unless I go in and manually save the excel file before running the code. I am just working with the basic read excel function of pandas,
import pandas as pd
df = pd.read_excel(filename, sheet_name="Sheet1")
This will read the values if I have gone in and saved the file prior to running the code. But after running the code to update a new sheet, if I don't go in and save the file after doing that and try to run this again, it will read the formulas as NaN instead of just the values. Is there a work around that anyone knows of that will just read values from excel with pandas?
That is strange. The normal behaviour of pandas is to read values, not formulas. Most likely the problem is in your Excel files: either your formulas point to other files, or they return a value that pandas treats as NaN.
In the first case, the sheet needs to be updated, and there is nothing pandas can do about that (but read on).
In the second case, you could solve it by passing explicit NaN identifiers to read_excel:
pd.read_excel(path, sheet_name="Sheet1", na_values=[your na identifiers])
As for the first case, as a workaround to make your work easier, you can automate what you are doing by hand using xlwings:
import pandas as pd
import xlwings as xl
def df_from_excel(path):
    app = xl.App(visible=False)
    book = app.books.open(path)
    book.save()
    app.kill()
    return pd.read_excel(path)
df = df_from_excel(path to your file)
If you want to keep those formulas in your excel file just save the file in a different location (book.save(different location)). Then you can get rid of the temporary files with shutil.
I had this problem and resolved it by moving a graph below the first row I was reading. It looks like the position of graphs may cause problems.
You can use xlrd to read the values.
First, you should refresh your Excel sheet if you are also updating the values automatically with Python. You can use the function below:
import os
import xlrd
import win32com.client
file = "myxl.xls"
def refresh_file(file):
    # Start a separate Excel instance (requires Windows with Excel installed)
    xlapp = win32com.client.DispatchEx("Excel.Application")
    path = os.path.abspath(file)
    wb = xlapp.Workbooks.Open(path)
    wb.RefreshAll()
    # Wait until pending asynchronous queries have finished
    xlapp.CalculateUntilAsyncQueriesDone()
    wb.Save()
    xlapp.Quit()
After the file has been refreshed, you can start reading the content:
workbook = xlrd.open_workbook(file)
worksheet = workbook.sheet_by_index(0)
for rowid in range(worksheet.nrows):
    row = worksheet.row(rowid)
    for colid, cell in enumerate(row):
        print(cell.value)
You can loop through the data however you need and add conditions while reading it, which gives you a lot more flexibility.

Python append xls file using only xlwt/xlrd

I am having problems appending data to an xls file.
Long story short, I am using a program to get some data from something and writing it in an xls file.
If I run the script 10 times, I would like the results to be appended to the same xls file.
My problem is that I am forced to use Python 3.4 and xlutils is not supported, so I cannot use the copy function.
I just have to use xlwt / xlrd. Note, the file cannot be a xlsx.
Is there any way I can do this?
I would look into using openpyxl, which is supported by Python 3.4. An example of appending to a file can be found at https://openpyxl.readthedocs.org/en/default/. Please also see: How to append to an existing excel sheet with XLWT in Python. Here is an example that will do it, assuming you have an Excel sheet called sample.xlsx:
from openpyxl import Workbook, load_workbook
# grab the active worksheet
wb = load_workbook("sample.xlsx")
ws = wb.active
ws.append([3])
# Save the file
wb.save("sample.xlsx")
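For the "run the script 10 times" case, a sketch along the same lines could create the workbook on the first run and append on later runs. It assumes openpyxl is installed, and the function name append_row is hypothetical. Note that openpyxl reads and writes .xlsx files only, so this does not cover the .xls requirement in the question.

```python
import os


def append_row(path, row):
    """Append one row to the active sheet, creating the workbook if needed."""
    # Imported lazily; openpyxl is a third-party package (.xlsx only).
    from openpyxl import Workbook, load_workbook

    # Reuse the existing file when present, otherwise start a new workbook.
    wb = load_workbook(path) if os.path.exists(path) else Workbook()
    ws = wb.active
    # append() writes the values into the first empty row at the bottom.
    ws.append(list(row))
    wb.save(path)
```

Each call such as append_row("sample.xlsx", [3]) adds one more row to the same file.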
