My point is that using either pod (from the appy framework, which I find painful to use) or the OpenOffice UNO bridge, which seems likely to be deprecated soon and requires OpenOffice.org to be running while my script executes, is not satisfactory at all.
Can anyone point me to a neat way to produce a simple yet clean ODT (tables are my priority) without having to code it all myself?
Edit: I'm giving ODFpy a try, as it seems to do what I need. More on that later.
Your mileage with odfpy may vary. I didn't like it. I ended up using a template ODT created in OpenOffice, opening its content.xml with zipfile and ElementTree, and updating that (in your case, it would only create the relevant table-row and table-cell nodes), then writing everything back.
It is actually straightforward, except for making ElementTree work properly with the XML namespaces (this is badly documented). But it can be done. I don't have the example, sorry.
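A minimal sketch of that template approach, using only the standard library, might look like the following. It assumes the template already contains a table (named "Table1" here purely for illustration) and that only the office, table and text namespaces matter; a real content.xml declares many more, and each would need registering the same way to keep its prefix. The helper names q, add_rows and rewrite_odt are hypothetical, not part of any library.

```python
import zipfile
import xml.etree.ElementTree as ET

# The three OpenDocument namespaces this sketch touches.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
# Register the prefixes so ElementTree writes "table:..." instead of "ns0:...".
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build the '{namespace-uri}tag' name that ElementTree expects."""
    return "{%s}%s" % (NS[prefix], tag)


def add_rows(content_xml, table_name, rows):
    """Return content.xml bytes with `rows` appended to the named table."""
    root = ET.fromstring(content_xml)
    for table in root.iter(q("table", "table")):
        if table.get(q("table", "name")) == table_name:
            break
    else:
        raise ValueError("no table named %r" % table_name)
    for row in rows:
        tr = ET.SubElement(table, q("table", "table-row"))
        for value in row:
            cell = ET.SubElement(tr, q("table", "table-cell"),
                                 {q("office", "value-type"): "string"})
            ET.SubElement(cell, q("text", "p")).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def rewrite_odt(src, dst, table_name, rows):
    """Copy the ODT archive `src` to `dst`, updating only content.xml."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "content.xml":
                data = add_rows(data, table_name, rows)
            # Passing the ZipInfo keeps each member's order and compression
            # (the "mimetype" entry stays first and uncompressed).
            zout.writestr(item, data)
```

A call such as rewrite_odt("template.odt", "report.odt", "Table1", [["a", 1], ["b", 2]]) would then write a copy of the template with two extra rows; the file names are placeholders.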
This answer may not help if you need to edit existing ODT files, but if you want to create new ones, you can use QTextDocument, QTextCursor and QTextDocumentWriter in PyQt4. A simple example showing how to write an ODT file:
>>> from PyQt4 import QtGui
# Create a document object
>>> doc = QtGui.QTextDocument()
# Create a cursor pointing to the beginning of the document
>>> cursor = QtGui.QTextCursor(doc)
# Insert some text
>>> cursor.insertText('Hello world')
# Create a writer to save the document
>>> writer = QtGui.QTextDocumentWriter()
>>> writer.supportedDocumentFormats()
[PyQt4.QtCore.QByteArray(b'HTML'), PyQt4.QtCore.QByteArray(b'ODF'), PyQt4.QtCore.QByteArray(b'plaintext')]
>>> odf_format = writer.supportedDocumentFormats()[1]
>>> writer.setFormat(odf_format)
>>> writer.setFileName('hello_world.odt')
>>> writer.write(doc) # Returns True if successful
True
QTextCursor can also insert tables, frames, blocks and images. More information at:
http://qt-project.org/doc/qt-4.8/qtextcursor.html
As a bonus, you can also print to a PDF file by using QPrinter.
I have a very specific use case where I need to insert tab stops into Word documents. My code works perfectly when using a docx that was created normally. However, the other part of my use case is that I extract the html from a text editor and turn it into a docx. The problem is with these documents that were generated from html, for some reason when running the same code to insert tab stops it does not work. The tab stop configuration gets created but it is not applied to the document. I cannot seem to find a way around it and any help would be deeply appreciated.
Below is a code sample:
from docx import Document
from docx.shared import Inches
from docx.enum.text import WD_TAB_ALIGNMENT, WD_TAB_LEADER
from htmldocx import HtmlToDocx
new_parser = HtmlToDocx()
new_html = """<p><span>some text</span></p>
<br>
<p><span>Some persons name</span></p>
<p><span>Another text</span></p>
<p><span>Some date</span></p>"""
document = Document()
new_parser.add_html_to_document(new_html, document)
for para in document.paragraphs:
    tab_stops = para.paragraph_format.tab_stops
    tab_stops.add_tab_stop(Inches(5.51),
                           WD_TAB_ALIGNMENT.RIGHT, WD_TAB_LEADER.DOTS)
document.save('new-file-name.docx')
When I run this code, the tab stop configuration is created correctly in the docx, but the tab stops are not visible in the document itself.
This function is supposed to run on Azure Functions, so using pywin32 to convert HTML to docx is not an option, as it does not run on Linux.
I have tried manually setting the styles of the document. I have also tried the convertapi API and the aspose.words library, but nothing seems to work. It seems that something about converting HTML to docx prevents tab stops from being applied.
Thank you very much in advance and any help is deeply appreciated.
I'm doing some work with Python and Excel. In this case I have to modify an .xlsx document and then save it. The issue is that the original document has special formatting and styles, and I need to preserve them after the work. This is the code I'm using:
import openpyxl as xl
# open the file
wb = xl.load_workbook("CR_Accounts_Dashboard_V4_20170127.xlsx")
# ...
# Do some stuff
# ...
# save the file
wb.save("CR_Accounts_Dashboard_V4_20170127.xls")
After saving the file, the original formatting and styles have been removed.
This is one sheet in the original file
After working in the file and saving it
Here we have another example
This is another sheet in the original file
After working in the file and saving it
Does anyone have any idea about preserving the format and style?
I could not reproduce your problem.
Which version of openpyxl are you using?
Try the following, without doing anything else, and check whether the problem persists:
wb = xl.load_workbook("CR_Accounts_Dashboard_V4_20170127.xlsx")
wb.save("CR_Accounts_Dashboard_V4_20170127.xls")
Tested with Python:3.4.2 - openpyxl:2.4.1 - LibreOffice: 4.3.3.2
It says quite clearly in the documentation that charts in existing files are not preserved.
For the rest, it looks like you may be using tables or even pivot tables. Support for worksheet tables will be in version 2.4.4 (you'll need to use a checkout until it's released), but pivot tables are not supported: it's a lot of work, and so far no one has been prepared to sponsor the development.
I often write scripts for various 3D packages (3ds Max, Maya, etc.), which is why I am interested in Alembic, a file format that has been getting a lot of attention lately.
A quick explanation for anyone who does not know this project: Alembic (www.alembic.io) is a file format created for storing 3D meshes and the data connected with them. It uses a tree-like structure, with one root node, its children, the children's children, and so on. The objects in this tree can have properties.
I am trying to learn how to use Alembic with Python.
There are some tutorials on the docs page of this project, and I'm having some problems with this one:
http://docs.alembic.io/python/cask.html
It's about using the cask module, a wrapper that should make manipulating the contents of files easier.
This part:
a = cask.Archive("animatedcube.abc")
r = cask.Xform()
x = a.top.children["cube1"]
a.top.children["root"] = r
r.children["cube1"] = x
a.write_to_file("/var/tmp/cask_insert_node.abc")
works well. After that there is a new file, "cask_insert_node.abc", and it contains the objects as expected.
But when I'm adding some properties to objects, like this:
a = cask.Archive("animatedcube.abc")
r = cask.Xform()
x = a.top.children["cube1"]
x.properties['new_property'] = cask.Property()
a.top.children["root"] = r
r.children["cube1"] = x
a.write_to_file("/var/tmp/cask_insert_node.abc")
the "cube1" object in the resulting file does not contain the property "new_property".
The problem is in the saving step. I know the property was added to "cube1" before saving, because I checked it another way, with a function I wrote that builds a graph of the objects in the archive.
The code for this module is there:
source
Does anyone know what I am doing wrong? How can I save the properties? Is there some other way?
Sadly, cask doesn't support this. One cannot modify an archive and have the result saved (this is somehow related to how Alembic streams the data off disk). What you'll want to do is create an output archive
oArchive = alembic.Abc.CreateArchiveWithInfo(...)
then copy all the desired data from your input archive over to your output archive, including the time samplings:
for i in range(iArchive.getNumTimeSamplings()):
    oArchive.addTimeSampling(iArchive.getTimeSampling(i))
and the objects, recursing through iArchive.getTop() and oArchive.getTop() and defining output properties (alembic.Abc.OArrayProperty or OScalarProperty) as you encounter them in the iArchive. Once these are defined, you can add your new values as samples to the property at that point.
It's a real beast, and something that cask really ought to support. In fact, someone in the Alembic community should just do everyone a favor and write a cask2 (casket?) which wraps all of this into simple calls like you instinctively tried to do.
I am trying to update a FITS file with a new column of data. My file has a Primary HDU, and two other HDUs, each one including a table.
Since adding a new column to the table of an already existing FITS file is a pain (unsolvable, see here and here), I changed my mind and am now trying to create a new file with a modified table.
This means I have to copy everything else from the original file (Primary HDU, other HDUs, etc.). Is there a standard way to do this? Or, what is the best (fastest?) way, ideally without copying each element one by one "by hand"?
On the topic of adding new columns, have you seen this documentation? This is the most straightforward way to create a new table with the new column added. It necessarily involves creating a new binary table HDU, since it describes different data.
Or have you looked into the Astropy table interface? It supports reading and writing FITS tables. See here. It basically works the same way but goes to more effort to hide the details. This is the interface that the PyFITS/astropy.io.fits table interface is gradually being replaced with, since it actually provides a good table interface.
Adding a new HDU or replacing an existing HDU in an existing FITS file is simply a matter of opening that file and updating the HDUList data structure (which works like a normal Python list) and writing the updated HDUList to a new file.
A full example might look something like:
import numpy as np

try:
    from astropy.io import fits
except ImportError:
    import pyfits as fits

with fits.open('path/to/file.fits') as hdul:
    table_hdu = hdul[1]  # If the table is the first extension HDU
    new_column = fits.Column(name='NEWCOL', format='D',
                             array=np.zeros(len(table_hdu.data)))
    new_columns = fits.ColDefs([new_column])
    # BinTableHDU.from_columns replaces the older fits.new_table function
    new_table_hdu = fits.BinTableHDU.from_columns(table_hdu.columns + new_columns)
    # Replace the original table HDU with the new one
    hdul[1] = new_table_hdu
    # Write while the input file is still open, so all HDU data can be read
    hdul.writeto('path/to/new_file.fits')
Something roughly like that should work. This will be easier in Astropy once the new Table interface is fully integrated but for now that's what it involves. There is no reason to do anything "by hand" so to speak.
I want to write to an Excel sheet via pywin32, and I can do that without a problem. But I couldn't format a range of cells in the sheet. I want to center the values inside the cells, and I also need to fill the cells with color. How can I do it?
Thanks in advance.
I've not specifically done this using python before, but I'm assuming you're using the COM automation interface to excel.
This page has an example that seems to cover both alignment and filling cells with colour in C#, so it should be fairly easy to adapt to python. Assuming you have a Worksheet object called sheet, and the Excel automation object is called Excel, I'm guessing it might look a bit like this:
# Format A1:D1 as center alignment
sheet.Range("A1", "D1").VerticalAlignment = Excel.XlVAlign.xlVAlignCenter
sheet.Range("A1", "D1").HorizontalAlignment = Excel.XlHAlign.xlHAlignCenter
sheet.Range("A1", "D1").Interior.ColorIndex = Excel.XlColorIndex.Red
If you don't have access to the Excel.XlAlign and XlColorIndex constants from Python, then you can just replace them with the specific integers they represent, though I'm not entirely sure where you could get them from. Probably from a VBA reference site or similar. (The link I provided doesn't seem to allow you to expand each of the entries in the list, so you may need to look elsewhere.)
EDIT: Just had a play about with excel automation via the python console, and it seems to work alright:
>>> from win32com.client import Dispatch
>>> xlApp = Dispatch("Excel.Application")
>>> xlWb = xlApp.Workbooks.Add()
>>> xlSht = xlWb.WorkSheets(1)
>>> xlSht.Range("A1", "D1").VerticalAlignment = 1
>>> xlSht.Range("A1", "D1").Interior.ColorIndex = 6
>>> # The background color of A1-D1 should now be yellow
>>> xlSht.Cells(1, 1).VerticalAlignment = 1
If you can't find any good reference on what the various alignment/colour constants are, then I'd just play about with python on the console like this, then open the resulting worksheet in excel and have a look at the results to figure things out.
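As a rough sketch of the "replace the constants with integers" idea: in the Excel object model, xlHAlignCenter and xlVAlignCenter both have the value -4108, and ColorIndex 6 is yellow in the default palette. The helper name center_and_fill and the constant names below are made up for illustration; the function only assumes it is given a worksheet object that exposes Range, as the COM worksheet does.

```python
# Values from Excel's XlHAlign / XlVAlign enumerations.
XL_HALIGN_CENTER = -4108  # xlHAlignCenter
XL_VALIGN_CENTER = -4108  # xlVAlignCenter
COLOR_INDEX_YELLOW = 6    # entry 6 of the default colour palette


def center_and_fill(sheet, first_cell, last_cell,
                    color_index=COLOR_INDEX_YELLOW):
    """Center a range horizontally and vertically, then fill it.

    Hypothetical helper: `sheet` is expected to behave like an Excel
    Worksheet COM object (for example one obtained through win32com).
    """
    cells = sheet.Range(first_cell, last_cell)
    cells.HorizontalAlignment = XL_HALIGN_CENTER
    cells.VerticalAlignment = XL_VALIGN_CENTER
    cells.Interior.ColorIndex = color_index
    return cells
```

With the xlSht object from the console session above, the call would be center_and_fill(xlSht, "A1", "D1").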
You can find the official reference for the Office 2003 automation API here.
Specifically, you'll probably find the Range documentation most useful.