Adding a border to PDF from ArcMap export using Arcpy - python

I'm trying to add a border to a PDF exported from ArcMap, using arcpy. I've not been able to find the answer to this anywhere, nor does arcpy seem to have any documentation on this.
Oddly enough, the map layout from which I'm exporting already has a black border around it, but when I export to a PDF, there is no border. My code here:
# Export to PDF
import arcpy

currentMXD_Map = r"myMap.mxd"
mxd_Map = arcpy.mapping.MapDocument(currentMXD_Map)
df_Map = arcpy.mapping.ListDataFrames(mxd_Map, "*")[0]
arcpy.mapping.ExportToPDF(mxd_Map, r"myMap.pdf", df_Map,
                          df_export_width=3300,
                          df_export_height=2550)
mxd_Map.save()
I would think arcpy.mapping has a method to add border to a PDF export (or in the map layout). What can I try next?

arcpy is not designed for map or layout authoring; it is designed to manipulate existing layouts or maps. Here is a quote from the documentation:
The arcpy.mapping module was designed so that it can be used to modify
existing elements within already existing map documents (.mxd) or
layer files (.lyr). In other words, it helps with the automation of
existing features but it can't be used to author new objects.
The easiest way to "add" a border is to have a border already in your map layout, with its size set to 0 or positioned off the page, and then use arcpy to resize it or move it where you want. It seems you already have the border, so maybe it's not in the right place or is set to 0 width.
Either way, you can access the border element by giving it a name in ArcMap and then retrieving it with ListLayoutElements.
First fill in the "Element Name" in the element's properties in ArcMap. Notice how I've set the height and width to 0 so that it won't be visible normally.
Then access the element with ListLayoutElements:
# We want the first border element because we are assuming there is only one.
# Iterate or change the index depending on your scenario.
# mxd is the MapDocument object (mxd_Map in the question); x and y are the
# desired width and height in page units.
borderElement = arcpy.mapping.ListLayoutElements(mxd, "GRAPHIC_ELEMENT", "border_element")[0]
borderElement.elementHeight = y
borderElement.elementWidth = x
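A minimal sketch of how the resize step and the export might fit together. The function names `resize_element` and `export_layout_with_border` are hypothetical, as are the sizes you pass in; only `MapDocument`, `ListLayoutElements` and `ExportToPDF` come from arcpy.mapping. One assumption worth checking against the documentation: the export call in the question passes a data frame as the third argument, which, as far as I know, exports that data frame alone rather than the page layout, so layout graphics such as the border would not appear. The sketch leaves that argument at its default so the layout is exported.

```python
def resize_element(elements, name, width, height):
    """Set the size of the first layout element called `name`.

    `elements` is any iterable of objects exposing `name`, `elementWidth`
    and `elementHeight`, such as the list returned by
    arcpy.mapping.ListLayoutElements. Returns the element, or None.
    """
    for element in elements:
        if element.name == name:
            element.elementWidth = width
            element.elementHeight = height
            return element
    return None


def export_layout_with_border(mxd_path, pdf_path, width, height):
    # arcpy is imported lazily so resize_element stays usable without ArcGIS.
    import arcpy

    mxd = arcpy.mapping.MapDocument(mxd_path)
    elements = arcpy.mapping.ListLayoutElements(mxd, "GRAPHIC_ELEMENT")
    if resize_element(elements, "border_element", width, height) is None:
        raise ValueError("no layout element named 'border_element'")
    # No data frame argument: the default exports the page layout.
    arcpy.mapping.ExportToPDF(mxd, pdf_path)
```

Calling `export_layout_with_border(r"myMap.mxd", r"myMap.pdf", x, y)` would then resize the named border and export the whole layout in one step.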

Related

python-pptx duplicate slide PPT will be damaged

I found that the usual duplicate-slide method damages the PPT when the slide contains a chart, so I used the method below to copy a slide with a chart. However, when I modify the chart title on one slide, the chart title on the other slide also changes inexplicably.
import copy

import six


def duplicate_slide(pres, index):
    template = pres.slides[index]
    blank_slide_layout = pres.slide_layouts[index]
    copied_slide = pres.slides.add_slide(blank_slide_layout)
    for shp in template.shapes:
        el = shp.element
        newel = copy.deepcopy(el)
        copied_slide.shapes._spTree.insert_element_before(newel, 'p:extLst')
    for _, value in six.iteritems(template.part.rels):
        # Make sure we don't copy a notesSlide relation as that won't exist
        if "notesSlide" not in value.reltype:
            copied_slide.part.rels.add_relationship(
                value.reltype, value._target, value.rId
            )
    return copied_slide
In the general case, duplicating a slide is not as simple as just cloning the slide XML, which is what your duplicate_slide() function does. That works for some simple cases, but not for slides with charts.
In particular, a chart is a separate package-part ("file") within the PPTX package (zip archive). If you just copy the relationships from one slide to the other, as you do here, then you have two slides pointing to the same chart-part. This is why changing the chart title in one slide changes it in the other as well: the same single chart is displayed on both slides.
To get the behavior you seem to be looking for, you would also need to duplicate the chart part and form a relationship from the new slide to that new chart part.
That's not a simple enough process for me to provide in a few lines of code here, but hopefully this explains why you are seeing this behavior.
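The sharing described above can be illustrated with a toy model in plain Python. This is not python-pptx code: the `Slide` and `Chart` classes are hypothetical stand-ins for a slide part, its relationships, and a chart part, and `copy.deepcopy` only stands in for the real work of duplicating a chart part inside the package.

```python
import copy


class Chart:
    """Stand-in for a chart part stored separately in the package."""

    def __init__(self, title):
        self.title = title


class Slide:
    """Stand-in for a slide part; `rels` maps relationship ids to parts."""

    def __init__(self):
        self.rels = {}


original = Slide()
original.rels["rId2"] = Chart("Sales")

# Copying only the relationships: both slides point at the same chart part.
shallow = Slide()
shallow.rels = dict(original.rels)
shallow.rels["rId2"].title = "Costs"
print(original.rels["rId2"].title)  # the original's title changed too

# Duplicating the chart part as well: each slide gets its own chart.
independent = Slide()
independent.rels = {rid: copy.deepcopy(part) for rid, part in original.rels.items()}
independent.rels["rId2"].title = "Profit"
print(original.rels["rId2"].title)  # unaffected by the edit on the copy
```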

How to find table grid lines in PDF files?

To more accurately extract table-like data embedded within table cells, I would like to be able to identify table cell boundaries in PDFs like this:
I have tried extracting such tables using Camelot, pdfplumber, and PyMuPDF, with varying degrees of success. But due to the inconsistency of the PDFs we receive, I'm not able to reliably get accurate results, even when specifying the table bounds.
I find that the results are better if I extract each table cell individually, by specifying the cell boundaries explicitly. I have tested this by manually entering the boundaries, which I get using Camelot's visual debugging tool.
My challenge is how to identify table cell boundaries programmatically, since the table may start anywhere on the page, and the cells are of variable vertical height.
It seems to me that one could do this by finding the coordinates of the row separator lines, which are so obvious visually to a human. But I have not figured out how to find these lines using python tools. Is this possible, or are there other/better ways to solve this problem?
I recently had a similar use case where I needed to figure out the boundaries via code itself. For your use case, there are two options:
If you want to identify the boundary of the entire table, you can do the following:
import pdfplumber

# req_page and i are placeholders for the page index and table index you need.
pdf = pdfplumber.open('file_name.pdf')
p0 = pdf.pages[req_page]  # go to the required page
tables = p0.debug_tablefinder()  # TableFinder holding the tables pdfplumber identifies
req_table = tables.tables[i]  # suppose you want to use the i-th table
req_table.bbox  # the bounding box of the table as (x0, top, x1, bottom)
If you want to visit each cell in the table and extract, say, the words from it:
import pdfplumber

pdf = pdfplumber.open('file_name.pdf')
p0 = pdf.pages[req_page]  # go to the required page
tables = p0.debug_tablefinder()  # TableFinder holding the tables pdfplumber identifies
req_table = tables.tables[i]  # suppose you want to use the i-th table
cells = req_table.cells  # list of cell bounding boxes in that table
for cell in cells[start:stop]:  # iterate through the required cells (placeholder slice)
    words = p0.crop(cell).extract_words()  # extract the words inside this cell
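Since the question asks about the row separator lines themselves: pdfplumber also exposes the vector lines and rectangles it finds on a page as `page.lines` and `page.rects`, each a dict with coordinates such as `x0`, `x1`, `top` and `bottom`. A minimal sketch, assuming the separators are drawn as such vector objects (it will not help with scanned pages); the helper name `row_bands` and the `min_length` and `tolerance` defaults are hypothetical and would need tuning per document.

```python
def row_bands(objects, min_length=50.0, tolerance=1.0):
    """Return (top, bottom) bands between consecutive horizontal rules.

    `objects` is a list of dicts with "x0", "x1", "top" and "bottom" keys,
    such as pdfplumber's page.lines + page.rects. A rule counts as
    horizontal when it is at most `tolerance` tall and at least
    `min_length` wide.
    """
    ys = sorted(
        (obj["top"] + obj["bottom"]) / 2.0
        for obj in objects
        if abs(obj["bottom"] - obj["top"]) <= tolerance
        and (obj["x1"] - obj["x0"]) >= min_length
    )
    # Merge rules closer together than `tolerance` (e.g. duplicated strokes).
    merged = []
    for y in ys:
        if not merged or y - merged[-1] > tolerance:
            merged.append(y)
    return list(zip(merged, merged[1:]))
```

Each band can then be combined with the table's left and right x-coordinates into an `(x0, top, x1, bottom)` box and passed to `page.crop`, as in the loop above.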

Change Color of Point Object Using QtGui.QPainter() in Python

I'm developing a map visualization program in Python using several modules from qtpy. There is a main window interface which displays a background map containing several geolocated points on the screen. The location of each point is determined by an external .csv file that has information regarding the latitude, longitude, and other text attribution. This file gets read-in by the program each time the map window is instantiated. The color of each point defaults to red when the map window is opened, but I would like to have each point change to a different color based on its metadata stored in the .csv file. For instance, there is a header in the file called "color", and each point has the text string "red", "green" or "blue" encoded. Here is the section of code I've been working on so far...
# Initialize all points to default color.
color = QtCore.Qt.red
for i, p in zip(range(len(self.points)), self.points):
    if lb_lat <= stn_lat and stn_lat <= ub_lat and window_rect.contains(*self.transform.map(stn_x, stn_y)):
        if p['color'] == 'green':
            color = QtCore.Qt.green
        elif p['color'] == 'blue':
            color = QtCore.Qt.blue
        elif p['color'] == 'red':
            color = QtCore.Qt.red
        else:
            color = QtCore.Qt.white
        qp.setPen(QtGui.QPen(color, self.scale))
        qp.setBrush(QtGui.QBrush(color))
        qp.drawEllipse(QtCore.QPointF(stn_x, stn_y), size, size)
The list of points is stored in the variable self.points and I'm trying to iterate through this list and apply the correct color to each point using QtGui.QPen and QBrush. What is happening is that if the color attribute in the .csv file for point 1 has the text string "green", then the entire array of points changes to green instead of just that one point. Looking at the code after the if...else statements, I haven't been able to find a way to "index" the setPen and setBrush commands for just the point in question. The coloring methods are acting on the entire array of points as one indivisible unit instead of working on each point separately as intended. Would anyone perhaps know of a way to do this using the Qt framework? Please let me know if supplying additional code might help clarify the problem or give better context as I'd be happy to do that.
I was able to solve the issue I had by removing the looping construct where I was iterating through the items in self.points. I had a higher-level "for" loop already in place and this was causing the incorrect array index to be referenced each time the points were being drawn to the screen. Each point is now changing to the appropriate color.
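For reference, a minimal sketch of per-point coloring with a single loop. It assumes each entry in the list of points is a dict carrying its own "color" string and screen coordinates; the helper names `color_name_for` and `draw_points` and the keys "x" and "y" are hypothetical. `QtGui.QColor` accepts color names such as "red", "green", "blue" and "white", so the if/elif chain can be replaced by a lookup.

```python
KNOWN_COLORS = {"red", "green", "blue"}


def color_name_for(point, default="white"):
    """Return the point's color name, or `default` if missing or unrecognized."""
    name = str(point.get("color", "")).strip().lower()
    return name if name in KNOWN_COLORS else default


def draw_points(qp, points, size, pen_width):
    """Draw each point with its own pen and brush.

    `qp` is an active QtGui.QPainter. Qt is imported lazily so that
    color_name_for can be used without it.
    """
    from qtpy import QtCore, QtGui

    for point in points:  # one loop only: each point sets its own color
        color = QtGui.QColor(color_name_for(point))
        qp.setPen(QtGui.QPen(QtGui.QBrush(color), pen_width))
        qp.setBrush(QtGui.QBrush(color))
        qp.drawEllipse(QtCore.QPointF(point["x"], point["y"]), size, size)
```

Because the pen and brush are set immediately before each `drawEllipse` call, a color chosen for one point cannot leak onto the others.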

How to find the real first line of a shape in pptx presentation

At the moment I need to get the actual width of the text. The best solution I have tried is to find the first line of text and get its width.
The presentations given to me were made by different people, and I cannot directly influence them. It turns out that the shape frame in a presentation is often much wider than the text, which is a problem because I need to detect collisions between visible text shapes, and that is possible only when I have the real frame of the text (I tried to show this in screenshot 1 and screenshot 2).
My best attempt at getting the real first line is:
# I already have a PIL font object named "font", built from the typeface and size,
# the width of the shape and the text of the shape,
# and a class TextWrapper that wraps a given string to a given width and outputs a list of lines.
# Take the first line of the wrapped text and select the width from the returned tuple.
width_first = font.getsize(TextWrapper(shape.text, font, width).text_lines[0])[0]
# Get the lines wrapped at the width of the first line.
wrapped_lines = TextWrapper(shape.text, font, width_first).text_lines
# ... some calculations here
The problem is that a line-break character like '\n' is not always present in the text, yet in the presentation I can see the text wrap.
I explained this as well as I could. Has anyone come across this at all?
EDIT:
I found a way to do what I need.
If you need something similar, see the code below.
import win32com.client

Application = win32com.client.Dispatch("PowerPoint.Application")
# WithWindow=False keeps PowerPoint from opening a window
Presentation = Application.Presentations.Open("ABCPATH/to/presentation.pptx",
                                              WithWindow=False)
for Slide in Presentation.Slides:
    for Shape in Slide.Shapes:
        if Shape.HasTextFrame:  # skip shapes without a text frame, such as images
            # this is what we need
            first_line = Shape.TextFrame.TextRange.Lines(1, 1)
You must have the PowerPoint application and pywin32 installed for this to work.
It also works only on Windows, so it is not an ideal choice, but for me it works perfectly.
Maybe someone will find this useful.

ReportLab Two-Column TOC?

I have a PDF I am generating using ReportLab. I am using the standard TableOfContents flowable, but am trying to split it up into two columns so it will all fit on the first page. The content will only ever be on one level, so I am not worried about odd-looking indentations.
Right now I have the PageTemplate using 2 Frames to create 2 columns on the first page. I get a
LayoutError: Flowable <TableOfContents at 0x.... frame=RightCol>...(200.5 x 720) too large on page 1 in frame 'RightCol'(200.5 x 708.0*)
Any ideas?
Well, color me embarrassed.
For anyone else having this problem, check your DocTemplate for allowSplitting. The default is 1, but I had changed mine to 0 and that was the reason.
*facepalm*
