I'm automatically generating a PDF file with Platypus that has dynamic content.
This means that the length of the text content (which sits directly at the bottom of the PDF file) may vary.
However, when the content is too long, a page break is inserted.
This is because I use a "static" spacer:
s = Spacer(width=0, height=23.5*cm)
Since I always want exactly one page, I need to set the height of the Spacer dynamically, so that it takes up whatever space is left on the page.
Now, how do I get the "rest" of the height that is left on my page?
I sniffed around in the reportlab library a bit and found the following:
Basically, I decided to use a Frame into which the flowables will be drawn. f._aH holds the available height of the Frame (we could also calculate this by hand). Subtracting the heights of the other two flowables, which we get through wrap, gives the remaining height, and that is the height of the Spacer.
elements.append(Flowable1)
elements.append(Flowable2)
c = Canvas(path)
f = Frame(fx, fy,fw,fh,showBoundary=0)
# compute the available height for the spacer
sheight = f._aH - (Flowable1.wrap(f._aW,f._aH)[1] + Flowable2.wrap(f._aW,f._aH)[1])
# create spacer
s = Spacer(width=0, height=sheight)
# insert the spacer between the two flowables
elements.insert(1,s)
# create a frame from the list of elements
f.addFromList(elements,c)
c.save()
Tested and works fine.
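The remaining-height arithmetic above can be sketched without ReportLab. This is a minimal illustration, not the library's API: spacer_height is a hypothetical helper, and the numbers stand in for f._aH and for the heights returned by each flowable's wrap(). Clamping at zero is my own addition for the case where the content already overflows the frame.

```python
def spacer_height(available_height, flowable_heights):
    """Return the height left over once the flowables are placed.

    available_height stands in for f._aH; flowable_heights stands in for
    the second item of each flowable.wrap(f._aW, f._aH) result.
    """
    remaining = available_height - sum(flowable_heights)
    # A negative value means the content is already taller than the frame,
    # so no spacer fits; clamp to zero instead of using a negative height.
    return max(0.0, remaining)


# Hypothetical numbers, in points.
print(spacer_height(700.0, [120.0, 80.5]))
print(spacer_height(700.0, [500.0, 300.0]))
```

When the result is zero, the content cannot fit on one page at all, so the text itself would have to be shortened or scaled.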
As far as I can see, you want a footer, right?
Then you should do it like this:
def _laterPages(canvas, doc):
    canvas.drawImage(os.path.join(settings.PROJECT_ROOT, 'templates/documents/pics/footer.png'), left_margin, bottom_margin - 0.5*cm, frame_width, 0.5*cm)

doc = BaseDocTemplate(filename, showBoundary=False)
# elements is the list of flowables
doc.multiBuild(elements, _firstPage, _laterPages)
I am new to Python. I am trying to extract mixed fractions from a PDF file using Python, but I have no idea which tool to use. My sample PDF contains only one page with simple text. I would like to extract the part name and the length of the part. A screenshot of the sample PDF page is shown in the image link (Page 1 of Pdf - Screenshot). The PDF file can be downloaded from the following link (Sample Pdf).
EDIT 1: - UPDATED
Thank you for suggesting pdfplumber. It is a great tool, and I could extract information with it. In some cases, though, when I extract the length, I get the whole number combined with the denominator. Say, if I have 36 1/2 as the length (as shown in the screenshot), then I get the value 362 inches.
import pdfplumber

with pdfplumber.open("Sample.pdf") as pdf:
    first_page = pdf.pages[0]
    text = first_page.extract_text()
    for row in text.split('\n'):
        if 'inches' in row:
            num = row.split()[0]
            print(num)
Output: 362
This code works for me in most cases. Only in some cases do I get 362 as my output instead of 36 as a separate value. How can I resolve this issue?
pdfplumber gives output like this:
shape: square
part name: square
1
36 𝑖𝑛𝑐ℎ𝑒𝑠
2
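Given output in that shape, one possible workaround is to rebuild the mixed fraction from the lines around the "inches" row. This is only a sketch, under the assumption that the numerator always lands on the line above and the denominator on the line below, as in the sample; rebuild_mixed_fraction is a hypothetical helper, not part of pdfplumber.

```python
import unicodedata
from fractions import Fraction


def rebuild_mixed_fraction(lines):
    """Return the first length found as a Fraction, or None if there is none."""
    # NFKC folds math-italic letters such as "𝑖𝑛𝑐ℎ𝑒𝑠" to plain ASCII.
    rows = [unicodedata.normalize("NFKC", row).strip() for row in lines]
    for idx, row in enumerate(rows):
        parts = row.split()
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == "inches":
            whole = int(parts[0])
            # Assumption: numerator on the previous line, denominator on the next.
            above = rows[idx - 1] if idx > 0 else ""
            below = rows[idx + 1] if idx + 1 < len(rows) else ""
            if above.isdigit() and below.isdigit() and int(below) != 0:
                return whole + Fraction(int(above), int(below))
            return Fraction(whole)
    return None


sample = ["shape: square", "part name: square", "1", "36 𝑖𝑛𝑐ℎ𝑒𝑠", "2"]
length = rebuild_mixed_fraction(sample)
print(length, float(length))
```

In practice the list would come from text.split('\n') on the extracted page text. If the PDF lays the fraction out differently, the neighbouring-line assumption will not hold and the positions would need to be checked first.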
I would suggest using pdfplumber; it's a very powerful and well-documented tool for extracting text, tables, and images from PDFs.
Moreover, it has a very convenient function, called crop, that allows you to crop and extract just the portion of the page that you need.
Just as an example, the code would be something like this (note that this will work with any number of pages):
import pdfplumber

filename = 'path/to/your/PDF'
crop_coords = [x0, top, x1, bottom]
text = ''
pages = []
with pdfplumber.open(filename) as pdf:
    for page in pdf.pages:
        my_width = page.width
        my_height = page.height
        # Crop pages
        my_bbox = (crop_coords[0]*float(my_width), crop_coords[1]*float(my_height),
                   crop_coords[2]*float(my_width), crop_coords[3]*float(my_height))
        page_crop = page.crop(bbox=my_bbox)
        # Collect the lower-cased text of every cropped page
        text = text + str(page_crop.extract_text()).lower()
        pages.append(page_crop)
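Here is the explanation of the coords. Each one is a fraction of the page size (between 0 and 1), measured from the top-left corner of the page:
x0 = distance from the left side of the page to the left vertical cut, as a fraction of the page width.
top = distance from the top of the page to the upper horizontal cut, as a fraction of the page height.
x1 = distance from the left side of the page to the right vertical cut, as a fraction of the page width.
bottom = distance from the top of the page to the lower horizontal cut, as a fraction of the page height.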
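As a minimal sketch of the conversion done inside the loop above, assuming each entry of crop_coords is a fraction between 0 and 1 measured from the left or top edge of the page (the convention pdfplumber uses for bounding boxes). fractional_bbox is a hypothetical helper and the page size is just an example.

```python
def fractional_bbox(page_width, page_height, crop_coords):
    """Convert fractional (x0, top, x1, bottom) into absolute page units."""
    x0, top, x1, bottom = crop_coords
    for value in crop_coords:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"crop fraction out of range: {value}")
    if x0 >= x1 or top >= bottom:
        raise ValueError("crop box must have positive width and height")
    return (
        x0 * float(page_width),
        top * float(page_height),
        x1 * float(page_width),
        bottom * float(page_height),
    )


# Lower half of a 612 x 792 point page (US Letter).
bbox = fractional_bbox(612, 792, (0.0, 0.5, 1.0, 1.0))
print(bbox)
```

The resulting tuple is what would be passed as bbox to page.crop in the snippet above.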
I have a pie chart with multiple segments in it. I want a certain pie segment to start at an angle I provide.
Pie chart "rotation" features in PowerPoint are limited to positioning the "start" angle of the first segment, with segments growing in the clockwise direction only.
So you can specify that the first element appears at 70-degrees (clockwise from the 12-o'clock position).
Currently there is no API support for this, but the value is in the XML at the location mentioned by @Saleh above: /c:chartSpace/c:chart/c:plotArea/c:pieChart/c:firstSliceAng
You can access the c:pieChart element on:
chart.plots[0]._element
And print it with:
print(chart.plots[0]._element.xml)
If it happens to already have a c:firstSliceAng element on it, you can just change the setting, perhaps something like this:
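pieChart = chart.plots[0]._element
# xpath() returns a list, so take the first match
firstSliceAng = pieChart.xpath("./c:firstSliceAng")[0]
# attributes are set with .set(), not by item assignment
firstSliceAng.set("val", "70")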
If there is no firstSliceAng element there you need to use lxml calls to add it first.
Below are the steps to change firstSliceAng of piechart in python-pptx:
Check whether firstSliceAng tags exist or not:
firstSliceAng = pieChart.xpath("./c:firstSliceAng")
print(firstSliceAng)
Expected output:
[<some-object>]
If the list is empty, you need to add firstSliceAng using lxml or oxml.
To add it with oxml, use the following steps:
from pptx.oxml.xmlchemy import OxmlElement
# tags is not defined in the original snippet; it is presumably the result
# of an xpath lookup for the sibling element to insert before
tag = tags[0]
child = OxmlElement('c:firstSliceAng')
# keep in mind that start_angle must be a string holding an int, not a float
start_angle = str(int(75.55))
child.set('val', start_angle)
tag.addprevious(child)
To add it with lxml, use the following steps:
from io import StringIO
from lxml import etree
from lxml.etree import Element, QName
# xml is the chart XML as a string (not defined in the original snippet)
doc = etree.parse(StringIO(xml))
root = doc.getroot()
# keep in mind that start_angle must be a string holding an int, not a float
start_angle = str(int(75.55))
c = Element(QName(root.nsmap['c'], 'firstSliceAng'), val=start_angle)
present_element = chart.plots[0]._element.xpath('c:varyColors')[0]
present_element.addprevious(c)
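The check-then-add steps above can be tried on a standalone XML fragment. This sketch uses the standard library's xml.etree.ElementTree rather than python-pptx's lxml elements, so it only illustrates the logic; set_first_slice_angle is a hypothetical helper, and appending the new element as the last child is an assumption (the answer above inserts it before c:varyColors instead).

```python
import xml.etree.ElementTree as ET

C_NS = "http://schemas.openxmlformats.org/drawingml/2006/chart"
FIRST_SLICE_ANG = f"{{{C_NS}}}firstSliceAng"


def set_first_slice_angle(pie_chart, angle):
    """Set val on c:firstSliceAng, creating the element if it is missing."""
    element = pie_chart.find(FIRST_SLICE_ANG)
    if element is None:
        # Not present yet: create it (position is an assumption, see above).
        element = ET.Element(FIRST_SLICE_ANG)
        pie_chart.append(element)
    # The attribute must be an integer written as a string, not a float.
    element.set("val", str(int(angle)))
    return element


fragment = (
    f'<c:pieChart xmlns:c="{C_NS}">'
    '<c:varyColors val="1"/>'
    "</c:pieChart>"
)
pie_chart = ET.fromstring(fragment)
set_first_slice_angle(pie_chart, 75.55)  # element is created
set_first_slice_angle(pie_chart, 70)     # existing element is updated
print(pie_chart.find(FIRST_SLICE_ANG).get("val"))
```

With python-pptx the same two branches apply to chart.plots[0]._element, using xpath and OxmlElement as shown in the answers above.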
I am using the python-pptx library for pptx manipulation. I want to add a bullet list in the pptx document.
I am using the following snippet to add a list item:
p = text_frame.add_paragraph()
run = p.add_run()
p.level = 0
run.text = "First"
But it does not display bullet points; please guide.
It is currently not possible to access the bullet property using python-pptx, but I want to share a workaround that has served me well.
This requires the use of a pptx template, in which we exploit the fact that the levels in a slide layout can be customized individually.
For instance, in the slide layout you could set level 0 to be normal text, level 1 to be bullets, and level 2 to be numbers or any other list style you want. You can then modify font size, indentation (using the ruler at the top), and any other property of each level to get the look you want.
For my use-case, I just set levels 1 and 2 to have the same indentation and size as level 0, making it possible to create bullet lists and numbered lists by simply setting the level to the corresponding value.
This is how my slide layout looks in the template file:
[image: slide layout example]
And this is how I set the corresponding list style in the code:
p.level = 0 # Regular text
p.level = 1 # Bullet
p.level = 2 # Numbers
In theory, you should be able to set it up exactly the way you want, even with indented sub-lists and so on. The only limitation I am aware of is that there seems to be a maximum of 8 levels that can be customized in the slide layout.
My solution:
from pptx.oxml.xmlchemy import OxmlElement


def SubElement(parent, tagname, **kwargs):
    element = OxmlElement(tagname)
    element.attrib.update(kwargs)
    parent.append(element)
    return element


def makeParaBulletPointed(para):
    """Bullets are set to Arial,
    actual text can be a different font"""
    pPr = para._p.get_or_add_pPr()
    ## Set marL and indent attributes
    pPr.set('marL', '171450')
    pPr.set('indent', '171450')
    ## Add buFont
    _ = SubElement(parent=pPr,
                   tagname="a:buFont",
                   typeface="Arial",
                   panose="020B0604020202020204",
                   pitchFamily="34",
                   charset="0"
                   )
    ## Add buChar
    _ = SubElement(parent=pPr,
                   tagname='a:buChar',
                   char="•")
This question is still up to date on May 27, 2021.
Following up on @OD1995's answer, I would like to add a little more detail as well as my take on the problem.
I created a new package with the following code:
from pptx.oxml.xmlchemy import OxmlElement


def getBulletInfo(paragraph, run=None):
    """Returns the attributes of the given <a:pPr> OxmlElement
    as well as its run's font size.

    *param: paragraph* pptx _paragraph object
    *param: run* [optional] specific _run object
    """
    pPr = paragraph._p.get_or_add_pPr()
    if run is None:
        run = paragraph.runs[0]
    p_info = {
        "marL": pPr.attrib['marL'],
        "indent": pPr.attrib['indent'],
        "level": paragraph.level,
        "fontName": run.font.name,
        "fontSize": run.font.size,
    }
    return p_info


def SubElement(parent, tagname, **kwargs):
    """Helper for Paragraph bullet Point
    """
    element = OxmlElement(tagname)
    element.attrib.update(kwargs)
    parent.append(element)
    return element


def pBullet(
    paragraph,  # paragraph object
    font,  # name of the font to apply to the bullet
    marL='864000',
    indent='-322920',
    size='350000'  # bullet size for a:buSzPct (100000 = 100%)
):
    """Bullets use the given font,
    actual text can be a different font
    """
    pPr = paragraph._p.get_or_add_pPr()
    # Set marL and indent attributes
    # Indent is the space between the bullet and the text.
    pPr.set('marL', marL)
    pPr.set('indent', indent)
    # Add buSzPct (bullet size); uses the size argument instead of a hard-coded value
    _ = SubElement(parent=pPr,
                   tagname="a:buSzPct",
                   val=size
                   )
    # Add buFont
    _ = SubElement(parent=pPr,
                   tagname="a:buFont",
                   typeface=font,
                   # panose="020B0604020202020204",
                   # pitchFamily="34",
                   # charset="0"
                   )
    # Add buChar
    _ = SubElement(parent=pPr,
                   tagname='a:buChar',
                   char="•"
                   )
I did this because I was frustrated that the bullet character was not the same size as in the original and that the text was stuck to the bullet.
getBulletInfo() allows me to retrieve information from an existing paragraph.
I use this information to populate the element's attributes (so that it is identical to the template).
Anyway, the main addition is the creation of a sub-element <a:buSzPct> (documentation here and here). This is a size percentage that can go from 25% to 350% (100000 = 100%).
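To make the unit concrete, here is a small sketch of the percent-to-val conversion described above (thousandths of a percent, so 100% becomes "100000"). bu_sz_pct_val is a hypothetical helper, and the default bounds are simply the 25% to 350% range quoted in this answer, not something I have checked against the schema.

```python
def bu_sz_pct_val(percent, minimum=25, maximum=350):
    """Return the a:buSzPct val string for a bullet size given in percent."""
    # Bounds are the ones quoted in the answer above (an assumption here).
    if not minimum <= percent <= maximum:
        raise ValueError(f"bullet size must be within {minimum}-{maximum}%")
    # The attribute is expressed in thousandths of a percent.
    return str(int(round(percent * 1000)))


print(bu_sz_pct_val(100))
print(bu_sz_pct_val(87.5))
```

The returned string is what would be passed as the size argument of pBullet.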
Try this:
p = text_frame.add_paragraph()
p.level = 0
p.text = "First"
Or if the text_frame already has a paragraph:
p = text_frame.paragraphs[0]
p.level = 0
p.text = "First"
max_page_name = self.ui.p_tree.sizeHintForColumn(0) + 2*self.ui.p_tree.frameWidth()
The above code gives the width of the tree widget based on its contents, but it considers only the top-level items. How can I get the width considering all the items, including sub-items?
Right now, I am using this workaround:
self.ui.p_tree.expandAll()
max_page_name = self.ui.p_tree.sizeHintForColumn(0) + 2*self.ui.p_tree.frameWidth()
self.ui.p_tree.collapseAll()
self.ui.p_tree.setMinimumWidth(max_page_name)
I have a wx.ListCtrl in REPORT mode and I use an image list to display icons, which are 50x50 pixels, with SetItemColumnImage. The problem is that the text I display in the column to the right of the icon is less than 50 pixels high, and the parts of the icons that are higher than the text are cut off.
Is there a way to tell ListCtrl to adjust the row height to the height of the icons? The last resort would be to change the font size of the text, but there should be a better way.
Update:
Here is some of my code:
self.list = util.ListCtrl(nb, style=wx.LC_REPORT|
                          wx.LC_SINGLE_SEL|wx.LC_NO_HEADER|wx.LC_ALIGN_LEFT)
self.list.InsertColumn(0, 'Avatar', width=-1)
self.list.InsertColumn(1, 'Name', width=-1)
self.list.SetColumnWidth(0, 50)
self.imagelist = wx.ImageList(50, 50, 255, 20)
self.list.SetImageList(self.imagelist, wx.IMAGE_LIST_SMALL)
i = 0
for user in self.users:
    self.list.Append(['', user['name']])
    if user['avatar']:
        bitmap = wx.BitmapFromImage(user['avatar'])
        imageidx = self.imagelist.Add(bitmap)
        self.list.SetItemColumnImage(i, 0, imageidx)
    i += 1
When I remove the LC_REPORT flag the images are completely visible but they are all displayed in one row and the names aren't visible anymore.
Since the images are 50x50, I don't think they qualify as "small" any more. Try using the wx.IMAGE_LIST_NORMAL instead of wx.IMAGE_LIST_SMALL. I can't find anything about manually setting row height, so I'm guessing that's not possible. However, I did find a bug report on this topic that says it was resolved in wx2.9. Are you using 2.9?
Alternatively, you could use the UltimateListCtrl which is pure Python and if it doesn't have that ability, you can probably get it patched quickly as the author is very responsive.
Took me a couple cups of coffee to figure it out.
The call to ImageList.Add should precede ListCtrl.Append (or ListCtrl.InsertItem) in order for the ListCtrl to change the height of its rows according to the height of images in ImageList.
So instead of
for user in self.users:
    self.list.Append(['', user['name']])
    if user['avatar']:
        bitmap = wx.BitmapFromImage(user['avatar'])
        imageidx = self.imagelist.Add(bitmap)
        self.list.SetItemColumnImage(i, 0, imageidx)
You should go with something like this:
for i, user in enumerate(self.users):
    # add the image first so the ListCtrl can size its rows
    if user['avatar']:
        bitmap = wx.BitmapFromImage(user['avatar'])
        imageidx = self.imagelist.Add(bitmap)
    self.list.Append(['', user['name']])
    if user['avatar']:
        self.list.SetItemColumnImage(i, 0, imageidx)
Which looks ugly, until you implement a default avatar:
# load the fallback as a wx.Image so it can be passed to BitmapFromImage
def_avatar = wx.Image('default_avatar.jpg')
for i, user in enumerate(self.users):
    bitmap = wx.BitmapFromImage(user['avatar'] if user['avatar'] else def_avatar)
    imageidx = self.imagelist.Add(bitmap)
    self.list.Append(['', user['name']])
    self.list.SetItemColumnImage(i, 0, imageidx)