Write data with Python into existing excel file keeping it intact as much as possible - python

We have a rather complicated Excel based VBA Tool that shall be replaced by a proper Database and Python based application step by step.
There will be time of the transition between were the not yet completely ready Python tool and the already existing VBA solution will coexist.
To allow interoperability the Python tool must be able to export the database values into the Excel VBA Tool keeping it intact. Meaning that not only all VBA codes have to work as expected but also Shapes, Special Formats etc, Checkboxes etc. have to work after the export.
Currently a simple:
from openpyxl import load_workbook
wb = load_workbook(r'Tool.xlsm', keep_vba=True)
# Write some data i.e. (not required to destroy the file)
wb["SomeSheet!SomeCell"] = "SomeValue"
wb.save(r"Tool_filled.xlsm")
will destroy the file, i.e. shapes won't work, checkboxes neither. (The resulting file is only 5 MB from originally 8 MB, showing that something went quite wrong).
Is there a way to only modify only the data of an ExcelSheet keeping everything else intact/untouched?
As far I know an Excel Sheet are only zipped .xml files. So it should be possible to edit only the related sheets? Correct?
Is there a more comfortable way as writing everything from scratch to only modify the data of an existing Excel file?
Note: The solution has to work in Linux, so simple remote Excel calls are not an option.

Related

How to programmatically copy excel worksheet sheet to a new one

I have created a python script that parses some data and uses it to generate a few worksheet excel files (xlsx) (I am not much knowledgeable in excel by the way).
The original worksheet i am reading the data from has a main sheet used to fill in lots of info which is then distributed across many other sheets, which perform some formulas to calculate some results.
My python script does the processing i want by automatically filling in this main sheet, which then produces the results of the other sheets. I then save the worksheet and eveything looks fine.
Now, I want to split those worksheets into individual ones without including the main sheet in any of them. This appears to be pretty challenging.
I first tried using the data_only argument of load_workbook, but I quickly discovered that in order to preserve the values and not the formulas in the spreadsheet(or just get None back), I'd have to manually open and save each one of the files so that a temporary cache is created. That won't really do it in my case since i want the whole thing to be automated.
Among other things, the one that came closer is this piece of code:
workbook = xw.Book('generatedFiles/generated.xlsx')
sheet1 = workbook.sheets['PRIM'].used_range.value
df = pd.DataFrame(sheet1)
df.to_excel('generatedFiles/fixed-generated.xlsx', index = False, header = False)
This does indeed generate a spreadsheet with the values, using its pandas dataframe, but the problem is that it doesn't preserve any information about the types of the values. So for example, an integer is being treated as a string when saved.
This messes the processing being done by an external parser that I feed these files with.
Any ideas on how to fix that ? What would be the best way to go about doing something like that ?
Thanks !

Automatic input from text file in excel

My problem is rather simple : I have an Excel Sheet that does calculations and creates a graph based on the values of two cells in the sheet. I also have two lists of inputs in text files. I would like to loop through those text files, add the values to the excel sheet, refresh the sheet, and print the resulting graph to a pdf file or an excel file named something like 'input1 - input2.xlsx'.
My programming knowledge is limited, I am decent with Python and have looked into python libraries that work with excel such as openpyxl, however most of those don't seem to work for me for various reasons. Openpyxl deletes the graphs when opening an excel file; XlsxWriter can only write files, not read from them; and xlwings won't work for me.
Should I use python, which I'm familiar with, or would VBA work for this kind of problem? Have any of you ever done something of the sort?
Thanks in advance
As a more transitional approach to what m. wasowski wrote above, I'd suggest you do the following.
Install the pandas package, and see how easy it is to load a file using read_excel. Then, read 10 Minutes to Pandas, and manipulate the data.
You state that the Excel sheet is complex. In general, the more complex it is, this approach will eventually make it simpler. But you don't have to switch everything immediately. You can still do parts in Excel and parts in pandas.
I think you should consider win32Com for excel operation in python instead of Openpyxl,XlsxWriter.
you can read/write excel, create chart and format excel file using win32com without any limitation.
And creating chart you can consider matplotlib, in that after creating chart you can save it in pdf file also.

How do I write to one sheet in an already existing excel sheet in Python?

I got an excel file that has four sheets. One sheet, sheet 4. contains data in simple CSV and the others read the data of this sheet and make different calculations and graphs. In my python application I would like to open the excel file, open sheet 4, and replace the data. I know you technically can't open and edit excel however you like with Python, due to the complex file structure of XLS (previous relevant answer), but is there a work around for this specific case? Remember the only thing I want to do is to open the data sheet, write to it, and ignore the others...
Note: Previous answers to relevant questions have suggested using the copy function in xlutils. But that doesn't work in this case, as the rest of the sheets are rather complex. The graphs, for example, can't be preserved with the copy function.
I used to use pyExcelerator. It did certainly a good job, but I'm not sure if it is maintained.
https://pypi.python.org/pypi/pyExcelerator/
hth.

Modifying, recalculating and and extracting results from Excel in Python

I would like to try and make a program which does the following, preferably in Python, but if it needs to be C# or other, that's OK.
Writes data to an excel spreadsheet
Makes Excel recalculate formulas etc. in the modified spreadsheet
Extracts the results back out
I have used things like openpyxl before, but this obviously can't do step 2. Is the a way of recalculating the spreadsheet without opening it in Excel? Any thoughts greatly appreciated.
Thanks,
Jack
You need some sort of UI automation with which you can control a UI application such as Excel. Excel probably exposes some COM interface that you should be able to use for what you need. Python has the PyWin32 library which you should install, after which you'll have the win32com module available.
See also:
Excel Python API
Automation Excel from Python
If you don't necessarily have to work with Excel specifically and just need to do spreadsheet using Python, you might want to look at http://manns.github.io/pyspread/.
you could you pandas for reading in the data, using python to recalculate and then write the new files.
For pandas it's something like:
#Import Excel file
xls = pd.ExcelFile('path_to_file' + '/' + 'file.xlsx')
xls.parse('nyc-condominium-dataset', index_col='property_id', na_values=['NA'])
so not difficult. Here the link to pandas.
Have fun!

Module that can help control excel from within python application anything more needed than pywin32?

I have a viewer I built using WXPython. The viewer is basically a browser (built on IE wx.lib.iewin) that loads the txt or htm files I have in a directory and then lets me move through the files sequentially. Instead of having to go to the directory to select the next file to view the viewer/browser has a next button that loads the next file in the queue.
I want to be able to add a new feature that allows me to highlight some text that is visible in the browser and then push a button and have that text passed into a cell in excel.
Lots of things are going to have to happen like I need to be able to find and start a new instance of excel. I need to be able to add a new worksheet and pass some values to populate cells on the worksheet based on the file I am looking at and then if I want to collect some data from the file I want to be able to highlight the data in the viewer and then press a button on the viewer and have the data passed to excel.
I think I am going to start with PyWin32 but I am wondering if there is something else I need but I don't know enough to look for it.
If someone knows of an example where text was piped from a Python application to excel under the users control I would appreciate a pointer in that direction. It is easy enough I think to do this going from the application to a file that gets created (but not displayed) but I am hoping to go from the browser to the excel file so that the user can evaluate their work in progress.
I'd recomend using one of the python excel modules like python-excel.
It works on any OS, and without any other Excel application installed.
http://www.python-excel.org/
Code would look somehting liek this to write to a new xls document, slightly different to open an existing.
import xlwt
wbk = xlwt.Workbook()
sheet = wbk.add_sheet('sheet 1')
# indexing is zero based, row then column
sheet.write(0,1,'test text')
wbk.save('test.xls')
Hopefully this can get you on the right path, then you'll be able to post more specific questions if you run into problems.
Note: Another option is openpyxl:
http://packages.python.org/openpyxl/tutorial.html
If you're using wxPython (which I assume you are due to the tag), you should look at XLSGrid: http://www.blog.pythonlibrary.org/2011/08/20/wxpython-new-widget-announced-xlsgrid/
If you just want to work with Excel, I would recommend xlwt or xlrd, although you can use PyWin32 to work with it too via COM: http://www.blog.pythonlibrary.org/2010/07/16/python-and-microsoft-office-using-pywin32/

Categories