How can I change the image layout options in python-docx? - python

I need to create a file with the same layout as another one, but I need to do it with Python. In the other file, an image has the following configuration: the image and text are aligned. But when I try to add my image with the following code
p = doc.add_paragraph()
r = p.add_run()
r.add_picture(myPath)
r.add_text(myText)
the image stays aligned with only the first line of the text, as in this image: aligned just with the first line.
I see that if I open the file in Word and change the layout options to the second option under With Text Wrapping, everything works exactly as I want. But how can I change this layout option using Python?

There is no API support for floating images in python-docx, which seems to be what you are asking about. Run.add_picture() adds a so-called "inline" picture (shape) which is treated like a single glyph (character in the run). So the line height grows to the height of the image and only that single line can abut the image.
One alternative would be to use a table with two cells, place the image in one cell and the text in the other.

How to find objects in floor plan image in Tkinter python through svg file?

I have a vectorized floorplan image. I want to identify the objects in the image through the vector data in the SVG file of that image. The SVG path data does not contain any closepath commands (z), so I cannot tell where the path moves on from one object to the next. Can somebody help me, please?
I have very little knowledge of SVG files and of using them in Tkinter, so please help me or suggest what I can do.
This is the vector data of the image.
Use in conjunction with the SO floorplan question.
Jump to z_final_floorplan.svg for the final file.
A
Create 4 files:
w_original_floorplan.svg
x_rough_static_floorplan.svg
y_rough_live_floorplan.svg
z_final_floorplan.svg
w_original_floorplan.svg and x_rough_static_floorplan.svg are identical apart from filename.
y_rough_live_floorplan.svg and z_final_floorplan.svg are empty; to be populated.
Copy x_rough_static_floorplan.svg to y_rough_live_floorplan.svg.
Open y_rough_live_floorplan.svg in a browser using a server.
In x_rough_static_floorplan.svg, find every M (case sensitive) and replace it with two newlines followed by /M (in the replace box: shift + enter, shift + enter, /M).
B
[this section takes the time]
Remove the first '/' in the path in y_rough_live_floorplan.svg [shows blackout_floorplan]
Label x_rough_static_floorplan.svg code section blackout_floorplan where code is.
(this file is used as rough-work, so being xml / svg valid is irrelevant)
In y_rough_live_floorplan.svg find next '/' and delete it [shows floorplan_top_left_whiteout]
Label x_rough_static_floorplan.svg code section floorplan_top_left_whiteout where code is.
Have x_rough_static_floorplan.svg and y_rough_live_floorplan.svg open in two windows, since you will be going back and forth between them. Keep repeating until you reach the end of the path.
(Hint: in VS Code the find tool seems to stay active when switching between files, so you can search for / and jump to the next match with cmd + g.) It may be handy to have a paper printout of the original SVG as a reference and to label the names of the objects you create (e.g. bath, sink, table) as you go along. Settle on a naming scheme early: if one table is 'table', is the second chair chair2, chair_2, chair_two, etc.?
C
Reorder the whole labels and corresponding code in path x_rough_static_floorplan.svg so the labels are ordered next to each other, but in the order they are found in the path:
e.g.
…
floorplan
bath
sink
table_chairs
sofa
…
Use the 'find' tool here. This process requires a temp file to copy and paste into, rather than reordering within the file you are working on; afterwards, write the temp file's contents back to the working file. It might be a good idea to create a checklist of objects and cross them off as they are done.
E.g. floorplan, bath, table_chairs, sink…
D
Create path elements from your grouped objects, giving each one an id such as id="floorplan_main", id="bath", id="sink", etc.
Bear in mind, the data of how this is drawn is really, really bad. Really they should be drawn with rect elements for a rectangle when possible and a lot of the path data is very unnecessary, but that’s obviously how the application generates the svg.
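The manual find-and-replace on M in step A can also be done programmatically. The following is only a sketch: split_subpaths is a hypothetical helper, it splits on the absolute moveto command (uppercase M) exactly as the steps above do, and it ignores relative movetos (lowercase m). One object may still span several chunks, so the labelling in steps B to D remains manual.

```python
import re


def split_subpaths(d):
    """Split SVG path data into chunks that each start at an uppercase M."""
    # In path data the letter M only ever appears as the moveto command,
    # so every match of "M followed by non-M characters" is one subpath.
    return [chunk.strip() for chunk in re.findall(r"M[^M]*", d)]


if __name__ == "__main__":
    # Made-up path data for illustration, not taken from the floorplan file.
    d = "M10 10L50 10L50 50 M100 100L120 100L120 130"
    for index, subpath in enumerate(split_subpaths(d)):
        print(index, subpath)
```

Each chunk can then be written out as its own path element and toggled on and off in the browser to see which object it draws.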

Extract image position from .docx file using python-docx

I'm trying to get the image index from a .docx file using the python-docx library. I'm able to extract the name of the image and its height and width, but not the index of its position in the Word file.
import docx
doc = docx.Document(filename)
for s in doc.inline_shapes:
    print(s.height.cm, s.width.cm, s._inline.graphic.graphicData.pic.nvPicPr.cNvPr.name)
output
21.228 15.920 IMG_20160910_220903848.jpg
In fact, I would like to know if there is a simpler way to get the image name, the way s.height.cm gave me the height in cm. My primary requirement is to find out where the image is in the document, because I need to extract the image, do some work on it, and then put it back in the same location.
This operation is not directly supported by the API.
However, if you're willing to dig into the internals a bit and use the underlying lxml API it's possible.
The general approach would be to access the ImagePart instance corresponding to the picture you want to inspect and modify, then read and write the ._blob attribute (which holds the image file as bytes).
This specimen XML might be helpful:
http://python-docx.readthedocs.io/en/latest/dev/analysis/features/shapes/picture.html#specimen-xml
From the inline shape containing the picture, you get the <a:blip> element with this:
blip = inline_shape._inline.graphic.graphicData.pic.blipFill.blip
The relationship id (r:id generally, but r:embed in this case) is available at:
rId = blip.embed
Then you can get the image part from the document part
document_part = document.part
image_part = document_part.related_parts[rId]
And then the binary image is available for read and write on ._blob.
If you write a new blob, it will replace the prior image when saved.
You probably want to get it working with a single image and get a feel for it before scaling up to multiple images in a single document.
There might be one or two image characteristics that are cached, so you might not get all the finer points working until you save and reload the file, so just be alert for that.
Not for the faint of heart as you can see, but should work if you want it bad enough and can trace through the code a bit :)
You can also inspect the paragraphs with a simple loop and check which ones have XML that contains an image (for example, XML that contains "graphicData"); those paragraphs are image containers. You can do the same with runs:
from docx import Document
image_paragraphs = []
doc = Document(path_to_docx)
for par in doc.paragraphs:
    if 'graphicData' in par._p.xml:
        image_paragraphs.append(par)
Then you unzip the docx file; the images are in the "word/media" folder, and they should be in the same order as in the image_paragraphs list. Each paragraph element gives you many options for changing it. If you want to extract the image, process it, and then insert it in the same place, then
paragraph.clear()
paragraph.add_run('your description, if needed')
run = paragraph.runs[0]
run.add_picture(path_to_pic, width, height)
So, I've never really written any answers here, but I think this might be the solution to your problem. With this little bit of code you can see the positions of your images among all the paragraphs. Hope it helps.
import docx
doc = docx.Document(filename)
paraGr = []
index = []
par = doc.paragraphs
for i in range(len(par)):
    paraGr.append(par[i].text)
    if 'graphicData' in par[i]._p.xml:
        index.append(i)
If you are using Python 3
pip install python-docx
import docx
doc = docx.Document(document_path)
P = []
I = []
par = doc.paragraphs
for i in range(len(par)):
    P.append(par[i].text)
    if 'graphicData' in par[i]._p.xml:
        I.append(i)
print(I)
# prints the list of indices of the paragraphs that contain an image

Digital cursor movement recorded in PDF creation visible in CorelDraw

I am using CairoSVG to turn an SVG into a PDF. I have found a bug where each line of text has a one-dimensional mark at the end when the PDF is loaded into CorelDraw.
I find this is true even with the simplest and barest of SVGs. I then decided to dig into the Python source of CairoSVG to fix the issue at its root. I have found text.py to be the place where the issue occurs.
https://github.com/Kozea/CairoSVG/blob/master/cairosvg/text.py
Specifically, it occurs in the loop that starts with "for [x, y, dx, dy, r], letter in letters_positions:". I can actually manipulate the mark by doing this after the loop.
surface.context.move_to( 100, 100 )
surface.context.save()
surface.context.text_path(' ')
surface.context.restore()
Depending on the values I enter, it will move one of the points (in this case the lower right point of the mark). This suggests that when CairoSVG is done with a line of text, the surface moves the cursor to the start of the last letter it was drawing. For some reason, the last time it does this is actually recorded in the PDF, although most PDF viewers do not render it. I have tried to fine-tune this mark so that the two points coincide, but because the location is different each time, depending on the letter, its position and more, I can't find values that make the mark disappear, even when using the available width and height variables.
Note that surface.context is not a file or function and surface is pprinted as <cairosvg.surface.PDFSurface object at 0x7f647802ec90>, it's some sort of Cairo interface for interacting with the PDF file directly through some sort of PDF API, although I haven't found any documentation on it.
I have tried lots of cheap workarounds, like saving the PDF in another program and then opening it in Corel, but elements are lost or altered in ways that are not acceptable.
I have also of course tried commenting out this file 1 line at a time, then chunks of code at a time, but with no luck.
I have also looked into other files such as path.py, parser.py, and more, but text.py seems to be the most promising location.
How can I prevent this mark from appearing in my PDF files?
I have found I could get the desired results by going into the CairoSVG code and using a different function to create the text elements onto the PDF.
This collects the letters into a variable.
# Collect entire line instead of letter
text_line_letters = ''
for [x, y, dx, dy, r], letter in letters_positions:
    text_line_letters = text_line_letters + letter
    # Skip rest of for loop
    continue
This uses the new function to create the actual text in the PDF.
# Create Text with glyph_path
surface.context.save()
font = surface.context.get_scaled_font()
glyphs, clusters, is_backwards = font.text_to_glyphs(
    0, -5, text_line_letters, with_clusters=True)
surface.context.glyph_path(glyphs)
surface.context.restore()
The actual root cause of the issue is unknown, but likely deeper inside the Cairo library itself.
https://github.com/bonzini/cairo/blob/9099c7e7307a39bc630919faa65bba089fd15104/src/cairo.c#L3349

get svg text size in python

I am generating an SVG image in Python (pure, no external libs yet). I want to know what size a text element will be before I place it. Any good ideas on how to do this? I checked the pysvg library but saw nothing like getTextSize().
This can be pretty complicated. To start with, you'll have to familiarize yourself with the chapter on text in the SVG specification. Assuming you want to get the width of plain text elements, and not textPath elements, at a minimum you'd have to:
Parse the font selection properties and spacing properties, and read the xml:space attribute as well as the writing-mode property (which can also be top-to-bottom instead of just left-to-right or right-to-left).
Based on the above, open the correct font, read the glyph data, and extract the widths and heights of the glyphs in your text string. Finding the font alone can be a big task, given the many places where font files can hide.
(optionally) Look through the string for possible ligatures (depending on the language), and replace them with the correct glyph if it exists in the font.
Add the widths for all the characters and spaces, the latter depending on the spacing properties and (optionally) possible kerning pairs.
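The final summation step can be sketched as follows. This is an illustration only: estimate_text_width is a hypothetical helper, and the advance-width and kerning tables are made-up numbers standing in for values that would really have to be read from the font file. Ligatures, xml:space handling and writing modes are not covered, and letter spacing is applied only between characters.

```python
def estimate_text_width(text, advances, kerning=None, letter_spacing=0.0):
    """Sum per-character advance widths, plus kerning and letter spacing."""
    kerning = kerning or {}
    width = 0.0
    previous = None
    for char in text:
        width += advances[char]
        if previous is not None:
            # Kerning adjusts the gap between one specific pair of characters.
            width += kerning.get((previous, char), 0.0) + letter_spacing
        previous = char
    return width


if __name__ == "__main__":
    # Hypothetical metrics in user units; a real font supplies these.
    advances = {"A": 10.0, "V": 10.0, " ": 4.0}
    kerning = {("A", "V"): -2.0}
    print(estimate_text_width("AV", advances, kerning))
    print(estimate_text_width("A V", advances, kerning))
```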
A possible solution would be to use the pango library. You can find Python bindings for it in py-gtk. Unfortunately, apart from some examples, the documentation for the Python bindings is pretty scarce. But it would take care of the details of font loading and of determining the extents of a Layout.
Another way is to study the SVG renderer in your browser. But e.g. the support for SVG text in Firefox is limited.
Also instructive is to study how TeX does it, especially the concept (pdf) of boxes (for letters) and glue (for spacing).
I had this exact same problem, but I had a variable width font. I solved it by taking the text element (correct font and content) I wanted, wrote it to a svg file, and I used Inkscape installed on my PC to render the drawing to a temporary png file. I then read back the dimensions of the png file (extracted from the header), removed the temp svg and png files and used the result to place the text where I wanted and elements around it.
I found that rendering to a drawing, using a DPI of 90 seemed to give me the exact numbers I needed, or the native numbers used in svgwrite as a whole. -D is the flag to use so that only the drawable element, i.e. the text, is rendered.
import os
os.system(r'/cygdrive/c/Program\ Files\ \(x86\)/Inkscape/inkscape.exe -f work_temp.svg -e work_temp.png -d 90 -D')
I used these functions to extract the PNG dimensions (found at this link; note that mine is corrected slightly for Python 3 and still works in Python 2):
import struct

def is_png(data):
    # A PNG starts with an 8-byte signature, followed by the IHDR chunk.
    return (data[:8] == b'\x89PNG\r\n\x1a\n' and (data[12:16] == b'IHDR'))

def get_image_info(data):
    if is_png(data):
        # Width and height are two big-endian 32-bit integers in IHDR.
        w, h = struct.unpack('>LL', data[16:24])
        width = int(w)
        height = int(h)
    else:
        raise Exception('not a png image')
    return width, height

if __name__ == '__main__':
    with open('foo.png', 'rb') as f:
        data = f.read()
    print(is_png(data))
    print(get_image_info(data))
It's clunky, but it worked

ImageDraw.text writes through the image borders

I am trying to write text to an image. (It is part of a script that will process hundreds of files and make hundreds of images.)
The only problem is that the function I want to use to write text to an image, seems to ignore my string's new lines. I looked at the ImageDraw documentation in the PIL reference, but it does not even touch on this. Others seem to be using this function for small text like signatures and watermarks.
Unfortunately I need to be converting text to an image, as part of a larger Python script.
draw_image.text((0,0), formattedString, font=f)
image.save(open("image.png", "wb"), "PNG")
This above function works, it prints the string onto an image. But only on one long line that extends beyond the borders of the image. If I print this same string out onto the console, it prints with the newlines.
I need insight into a way to judge the edge of the image borders. I also have the inputFile's contents in a list format. So I need a loop through the list
fileContents = file.readlines()
j = 0  # vertical offset of the current line, in pixels
for i in fileContents:
    draw_image.text((0, j), i, fill='white')
    j = j + 20  # this is necessary for adequate spacing of lines
So at this point I need a function that knows to wrap the text when it reaches the image border.
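One possible shape for such a function is a greedy word wrap that takes the measuring function as a parameter. This is a sketch under assumptions: wrap_to_width is a hypothetical helper, it measures with len (character count) by default, and a single word wider than the limit is left on its own line rather than split.

```python
def wrap_to_width(text, max_width, measure=len):
    """Greedily wrap text so that measure(line) stays within max_width."""
    lines = []
    for paragraph in text.splitlines():
        current = ""
        for word in paragraph.split():
            candidate = word if not current else current + " " + word
            if current and measure(candidate) > max_width:
                # The next word does not fit: close this line, start a new one.
                lines.append(current)
                current = word
            else:
                current = candidate
        # Existing newlines are kept, including blank lines.
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in wrap_to_width("the quick brown fox jumps over the lazy dog", 15):
        print(line)
```

To wrap against the image border in pixels, a pixel-based measure could be passed instead, for example lambda s: draw_image.textlength(s, font=f) with max_width set to the image width. That assumes a Pillow version that provides ImageDraw.textlength. Each returned line can then be drawn in the loop above, with j increased per line.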
