I'm currently working on this project: https://github.com/lucasmolinari/unlocker-EX.
It's an Excel unlocker; it works by editing the XML files inside the workbooks (more information on the GitHub page).
The script works fine on workbooks with almost no content, but I have recently been testing some bigger workbooks, and when I open the unlocked file, Excel says it is corrupted. I can't find any difference between the original and the unlocked workbook. I'm sure the problem occurs when the script changes the content of the files: I watched every step of the script, and the workbook only stops working once the files are edited.
Does someone know more about how XML files work or about the structure of Excel workbooks? Or is there some way to compare the original file with the edited one, to see whether it is a formatting problem? I'm really sorry about this question, but I have no idea where to start now; I have tried everything I can think of.
I changed the script to open files as UTF-8 and tried to find a corrupted character in the edited file, but it is too hard to spot one manually.
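Rather than searching by eye, the two workbooks can be compared part by part, since an xlsx file is a zip archive. The following is a minimal sketch; the function names `member_digests` and `diff_workbooks` are made up for illustration and are not part of the unlocker project.

```python
import hashlib
import zipfile


def member_digests(path):
    """Map each archive member name to the SHA-256 of its uncompressed bytes."""
    with zipfile.ZipFile(path) as zf:
        return {name: hashlib.sha256(zf.read(name)).hexdigest()
                for name in zf.namelist()}


def diff_workbooks(original, edited):
    """Return (only_in_original, only_in_edited, changed) as sorted name lists."""
    a = member_digests(original)
    b = member_digests(edited)
    only_in_original = sorted(set(a) - set(b))
    only_in_edited = sorted(set(b) - set(a))
    changed = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_in_original, only_in_edited, changed
```

Calling `diff_workbooks("original.xlsx", "unlocked.xlsx")` (hypothetical file names) narrows the search to the parts that actually differ, which can then be inspected individually.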
Using the ElementTree library solved the problem.
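For readers wondering what an ElementTree-based edit might look like, here is a hedged sketch. It assumes the edit in question is removing the `sheetProtection` element from a worksheet part; the project's actual logic may differ, and `strip_sheet_protection` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Keep the main namespace as the default one on output instead of "ns0:".
ET.register_namespace("", MAIN_NS)


def strip_sheet_protection(sheet_xml: bytes) -> bytes:
    """Parse a worksheet part, drop any <sheetProtection> child, re-serialize."""
    root = ET.fromstring(sheet_xml)
    for node in root.findall(f"{{{MAIN_NS}}}sheetProtection"):
        root.remove(node)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)
```

Working on a parsed tree avoids the broken tags and encoding mix-ups that string replacement can introduce. ElementTree rewrites namespace prefixes when serializing, so the output is worth checking in Excel on a real workbook.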
Related
I am trying to create an xlsx file from a template exported from Microsoft Dynamics NAV, so that I can upload my file to the system.
I am able to recreate and fill the template using the xlsxwriter library, but I have found that the template file also has an attached XML source file (visible in the Developer tab in Excel).
I can easily modify the XML file to match what I want, but I can't seem to find a way to add the XML source code to the xlsx file.
I have searched for "python adding xlsx xml source" but it doesn't seem to give me anything I can use.
Any help would be greatly appreciated.
Best regards
Martin
An xlsx file is basically a zip archive. Open it as an archive and you'll probably be able to find the XML file and modify it. – Mak Sim, yesterday
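As a small illustration of the comment above, the standard `zipfile` module can list the XML parts inside a workbook. This is only a sketch; `list_xml_parts` is a hypothetical helper name, and which part holds the attached XML source depends on the template.

```python
import zipfile


def list_xml_parts(xlsx_path):
    """Return the names of all .xml members inside an xlsx (zip) archive."""
    with zipfile.ZipFile(xlsx_path) as zf:
        return [name for name in zf.namelist() if name.endswith(".xml")]
```

Running this on both the exported template and the xlsxwriter output, then comparing the two lists, should show which part is missing from the generated file.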
We have a rather complicated Excel-based VBA tool that is to be replaced, step by step, by a proper database and a Python-based application.
During the transition there will be a period in which the not-yet-complete Python tool and the existing VBA solution coexist.
To allow interoperability, the Python tool must be able to export the database values into the Excel VBA tool while keeping it intact. This means that not only must all VBA code work as expected, but shapes, special formats, checkboxes, etc. must also work after the export.
Currently a simple:
from openpyxl import load_workbook
wb = load_workbook(r'Tool.xlsm', keep_vba=True)
# Write some data (not even required to break the file)
wb["SomeSheet"]["A1"] = "SomeValue"  # index the sheet first, then the cell ("A1" is a placeholder)
wb.save(r"Tool_filled.xlsm")
will destroy the file, i.e. shapes won't work, and neither will checkboxes. (The resulting file is only 5 MB, down from the original 8 MB, which shows that something went quite wrong.)
Is there a way to modify only the data of an Excel sheet while keeping everything else intact and untouched?
As far as I know, an Excel workbook is just a set of zipped .xml files, so it should be possible to edit only the relevant sheets. Correct?
Is there a more convenient way than writing everything from scratch to modify only the data of an existing Excel file?
Note: The solution has to work on Linux, so simple remote Excel calls are not an option.
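One way to approach the "edit only the relevant sheets" idea is to copy every archive member unchanged and transform just one part. The sketch below uses only the standard library, so it runs on Linux; `rewrite_member` and the `transform` callback are hypothetical names, and producing valid sheet XML inside `transform` is left to the caller.

```python
import zipfile


def rewrite_member(src_path, dst_path, member, transform):
    """Copy a zip archive, passing the bytes of one member through transform().

    All other members are written back with their original content, names
    and compression settings, so parts such as VBA projects, drawings and
    form controls are not re-serialized.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                data = transform(data)
            # Reusing the ZipInfo keeps the name, timestamp and compress_type.
            dst.writestr(info, data)
```

Note that cell text is often stored in `xl/sharedStrings.xml` rather than in the sheet part itself, so changing a value may mean editing more than one member.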
I want to be able to open an Excel document and start manipulating the data without seeing any pop-ups.
I think the pop-ups are what stops my Excel file from opening successfully. Here are the pop-ups I am seeing in Excel; I would like to answer them automatically instead of doing it manually. I found some answers online, but none for my case.
The file format and extension of "xxx" don't match. The file could be
corrupted or unsafe. Unless you trust its source, don't open it. Do
you want to open it anyway?
Option 1: Yes, option 2: No, option 3: Help
or
Open XML Please select how you would like to open this file:
As an XML table
As a read-only workbook
Use the XML Source task pane
or
XML Import Error
OK
Help
After I select "Yes", "As an XML table" and "OK", everything works perfectly. If anyone could help me out, I would much appreciate it.
My lab has a very large directory of SigmaPlot files, saved as .JNB. I would like to process the data in these files using Python. However, I have so far been unable to read the files into anything interpretable.
I've already tried pretty much every NumPy read function and most of the pandas read functions, and I am getting nothing but gibberish.
Does anyone have any advice about reading these files, short of exporting them all to Excel one by one?
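Gibberish from text-oriented readers usually means the file is a binary container, and the first few bytes can hint at which kind. The sketch below checks for two well-known signatures; it makes no claim about which one, if either, a .JNB file matches, and `sniff_container` is a name invented for this example.

```python
# Well-known leading byte sequences ("magic numbers") of two container formats.
SIGNATURES = {
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "OLE2 compound file",
    b"PK\x03\x04": "ZIP archive",
}


def sniff_container(path):
    """Return a label for a recognized container signature, or None."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, label in SIGNATURES.items():
        if head.startswith(magic):
            return label
    return None
```

Knowing the container type indicates which kind of library to look for next, instead of trying generic table readers.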
OK, I'm using this Git repository from Git Bash. After I run it, I have the txt files of the Securities and Exchange Commission database, EDGAR, in this format on my hard drive. I am using Windows 7. The txt files have HTML tags inside.
Since the SEC has kept these text files in this strict format since the early nineties, I was wondering if there is a way to extract a certain item, let's say
<us-gaap:IncomeTaxExpenseBenefit contextRef="eol_PE9523----1310-K0013_STD_365_20131231_0"
decimals="-3" id="id_3914012_7F3BEF88-8CD1-49E7-8A78-91A091178D1B_1_13"
unitRef="iso4217_USD">40315000</us-gaap:IncomeTaxExpenseBenefit>
whether by using a script or a Git repository, and accurately, given that the format is strict. How, for instance, can someone extract a whole table from the txt file? Libraries, repositories, scripts: anything that can be picked up with a little work and modification would be fine for me as a start.
Can any of these repositories do such a job? I read the instructions (where there are any), but I don't understand much of them.
It's not HTML. It looks like XML. Try using an XML parser for Python, for example ElementTree, to parse out the relevant information. A tutorial is included on its documentation page.
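A rough sketch of that suggestion follows. It assumes the XBRL instance has already been isolated from the txt file as well-formed XML, and that the `us-gaap` prefix is bound to the namespace URI below. That URI is an assumption and should be replaced with whatever the filing's root element declares; `find_facts` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "us-gaap" prefix; check the filing's root element.
US_GAAP = "http://fasb.org/us-gaap/2013-01-31"


def find_facts(xml_text, local_name, ns=US_GAAP):
    """Return (contextRef, unitRef, value) for every matching element."""
    root = ET.fromstring(xml_text)
    tag = f"{{{ns}}}{local_name}"
    return [(el.get("contextRef"), el.get("unitRef"), el.text)
            for el in root.iter(tag)]
```

For the element quoted in the question, `find_facts(xml_text, "IncomeTaxExpenseBenefit")` would yield its context, unit and the value `40315000` as a string.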