Provide tab title with reportlab generated pdf - python

This question is really simple, but I can't find any data on it.
When I generate a pdf with reportlab, passing the httpresponse as a file, browsers that are configured to show files display the pdf correctly. However, the title of the tab remains "(Anonymous) 127.0.0.1/whatnot", which is kinda ugly for the user.
Since most sites are able to somehow display an appropriate title, I think it's doable... Is there some sort of title parameter that I can pass to the pdf? Or some header for the response? This is my code:
def render_pdf_report(self, context, file_name):
    response = HttpResponse(content_type='application/pdf')
    response['Content-Disposition'] = 'filename="{}"'.format(file_name)
    document = BaseDocTemplate(response, **self.get_create_document_kwargs())
    # pdf generation code
    document.build(story)
    return response

It seems that Google Chrome doesn't display PDF titles at all.
I tested the link in your comment (biblioteca.org.ar) and Firefox displays it as " - 211756.pdf". It seems the title is empty, so Firefox just displays the filename instead of the full URL path.
I reproduced the same behaviour using this piece of code:
from reportlab.pdfgen import canvas
c = canvas.Canvas("hello.pdf")
c.setTitle("hello stackoverflow")
c.drawString(100, 750, "Welcome to Reportlab!")
c.save()
Opening it in Firefox yields the needed result: the tab shows "hello stackoverflow".
I found out about setTitle in ReportLab's User Guide. It has it listed on page 16. :)
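Separately from the PDF title, the header value in the question (filename="...") has no disposition type. Below is a minimal sketch of building a well-formed Content-Disposition value, assuming the PDF should be shown in the browser (inline) rather than downloaded. The helper name content_disposition is hypothetical, and nothing above establishes whether a given browser uses this filename for the tab label.

```python
def content_disposition(file_name, disposition="inline"):
    """Build a header value such as: inline; filename="report.pdf"."""
    # Escape backslashes and double quotes so the quoted string stays valid.
    safe_name = file_name.replace("\\", "\\\\").replace('"', '\\"')
    return '{}; filename="{}"'.format(disposition, safe_name)


# In the question's view this would be assigned to
# response['Content-Disposition'].
print(content_disposition("report.pdf"))
```

Passing disposition="attachment" instead asks the browser to download the file rather than display it.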

I was also looking for this and I found this in the source code.
It is in reportlab/src/reportlab/platypus/doctemplate.py, around line 467.
We can set the document's title with:
document.title = 'Sample Title'

I realise this is an old question, but I'm dropping in an answer for anyone using SimpleDocTemplate. The title can be set in the constructor of SimpleDocTemplate as a kwarg, e.g.:
doc = SimpleDocTemplate(pdf_bytes, title="my_pdf_title")

If you are using trml2pdf, you will need to add the "title" attribute to the template tag, i.e. <template title="Invoices" ...

In addition to what others have said, you can use
Canvas.setTitle("yourtitle")
which shows up fine in Chrome.

Related

Why downloading Facebook images with requests.get() gives corrupted files?

I am very new to Python and the Facebook Graph API and hope you can help me out with this:
I have written (in Python) a piece of code that uploads images to a page on Facebook (in a post, so it contains some text too) and this works as expected. Now I am trying to write a piece of code capable of downloading the image inside a post (given post_id). Unfortunately I always get a "file corrupted" error.
Here is the code I use to download the image:
# this function uploads an image from a web url
ret = upload_img_to_fb(url)
# decode JSON from the request
post_id = json.loads(ret)
ret = get_json_from_fb_postID(post_id['post_id'])
perm_url = json.loads(ret)
print('Perm url = ' + perm_url['permalink_url'] + '\n')
img_response = requests.get(perm_url['permalink_url'])
image = open("foto4.jpg","wb")
image.write(img_response.content)
image.close()
The print call outputs the following:
Perm url = https://www.facebook.com/102956292308044/photos/a.103716555565351/107173241886349/?type=3
which, as far as I understand, is what goes wrong: it is not a picture, even though a picture is displayed on the screen. So I right-clicked the picture, opened its link, and got:
https://scontent.fbzo1-2.fna.fbcdn.net/v/t39.30808-6/273008252_107171558553184_3697853178128736286_n.jpg?_nc_cat=103&ccb=1-5&_nc_sid=9267fe&_nc_ohc=d0ZvWSTzIogAX-PsuzN&_nc_ht=scontent.fbzo1-2.fna&oh=00_AT8GWh0wDHgB6tGCzHuPE2VZFus9EgWhllaJfVkZ-Nqtow&oe=620465E4
and if I pass this last link to requests.get(), it works.
How do I get around this?
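The permalink points at a web page, so the bytes saved to foto4.jpg are most likely HTML rather than image data, which would explain the "file corrupted" error. Below is a minimal sketch for checking what was actually downloaded before writing it to disk. The helper name describe_payload is hypothetical, and it only looks at the first bytes (JPEG files begin with FF D8 FF).

```python
def describe_payload(data):
    """Guess whether downloaded bytes are a JPEG image or an HTML page."""
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    # HTML documents start with "<" once leading whitespace is removed.
    if data.lstrip()[:1] == b"<":
        return "html"
    return "unknown"


# With requests this would be called as describe_payload(img_response.content).
print(describe_payload(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
print(describe_payload(b"<!DOCTYPE html><html></html>"))
```

This does not find the direct image URL; it only confirms whether the response is an image before it is saved.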

dropbox.exceptions.ApiError: ... CreateFileRequestError('validation_error', None)

I want to get the URL of an image and save it to a variable via the Dropbox API in Python. I'm following this guide but I get the error shown in the title.
I looked up the function dbx.file_requests_create and I am probably doing something wrong with title or destination. Should the title refer to something that already exists? I just made it up myself.
import dropbox
dbx = dropbox.Dropbox('Y2_M...aVP')
req = dbx.file_requests_create(title='Images', destination='/C:/Users/Dropbox/Apps/myProject/image.jpg')
print req.url
print req.id
EDIT: I found this link FileRequestError. It says:
There was an error validating the request. For example, the title was invalid, or there were disallowed characters in the destination path.
EDIT-2 [SOLVED]: Thanks to Aran-Fey and Greg for their comments, I solved the problem by replacing req = dbx.file_requests_create(title='Images', destination='/C:/Users/Dropbox/Apps/myProject/image.jpg') with
req = dbx.sharing_create_shared_link_with_settings('/image.jpg', settings=None)
Also, for people who have problems getting the image after sharing it: just change the last character of the link from 0 to 1, as mentioned on this and this.
You can add this line at the end to do that: newURL = req.url[:-1] + "1"
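Replacing the last character works as long as the link ends in dl=0. Below is a sketch of a slightly more defensive variant that rewrites the dl query parameter wherever it appears. The function name force_direct_download and the example URL are hypothetical, and the code uses Python 3's urllib.parse.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_direct_download(shared_url):
    """Return the shared link with its dl query parameter set to 1."""
    parts = urlsplit(shared_url)
    # Keep every parameter except dl, then add dl=1 at the end.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "dl"]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical shared link, standing in for req.url.
print(force_direct_download("https://www.dropbox.com/s/abc123/image.jpg?dl=0"))
```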

Read form fields in a PDF created by Adobe LiveCycle Designer

How can I get the fields from this PDF file? It is a dynamic PDF created by Adobe LiveCycle Designer. If you open the link in a web browser, you will probably see a single page starting with 'Please wait...'. If you download the file and open it in Adobe Reader (5.0 or higher), you should see all 8 pages.
So, when reading it via PyPDF2, you get an empty dictionary, because PyPDF2 sees the file as a single page, like the one you see in a web browser.
def print_fields(path):
    from PyPDF2 import PdfFileReader
    reader = PdfFileReader(str(path))
    fields = reader.getFields()
    print(fields)
You can use the Java-dependent library tika to read the contents of all 8 pages. However, the results are messy and I want to avoid a Java dependency.
def read_via_tika(path):
    from tika import parser
    raw = parser.from_file(str(path))
    content = raw['content']
    print(content)
So, basically, I can manually use Edit -> Form Options -> Export Data… in Adobe Acrobat DC to get a nice XML. Similarly, I need to get the form fields and their values via Python.
Thanks to this awesome answer, I managed to retrieve the fields using pdfminer.six.
Navigate through Catalog > AcroForm > XFA, then pdfminer.pdftypes.resolve1 the object right after b'datasets' element in the list.
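The "object right after b'datasets'" step can be sketched without any PDF library, assuming the XFA entry is a flat list that alternates between part names and their objects. The function name xfa_part and the sample list are hypothetical, and plain strings stand in for the real PDF objects.

```python
def xfa_part(xfa_array, name):
    """Return the entry that follows `name` in an alternating name/object list."""
    for index in range(0, len(xfa_array) - 1, 2):
        label = xfa_array[index]
        # pdfminer.six yields bytes labels such as b'datasets'.
        if isinstance(label, bytes):
            label = label.decode("latin-1")
        if label == name:
            return xfa_array[index + 1]
    return None


# Stand-in for the real XFA array; the values would be indirect objects
# that still need resolving (for example with resolve1).
sample = [b"template", "template-object", b"datasets", "datasets-object"]
print(xfa_part(sample, "datasets"))
```

Looking the part up by name avoids depending on a fixed position in the array.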
In my case, the following code worked (source: ankur garg)
import PyPDF2 as pypdf
def findInDict(needle, haystack):
    for key in haystack.keys():
        try:
            value = haystack[key]
        except:
            continue
        if key == needle:
            return value
        if isinstance(value, dict):
            x = findInDict(needle, value)
            if x is not None:
                return x
pdfobject = open('CTRX_filled.pdf', 'rb')
pdf = pypdf.PdfFileReader(pdfobject)
xfa = findInDict('/XFA', pdf.resolvedObjects)
xml = xfa[7].getObject().getData()
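Once the datasets XML has been extracted, the field names and values still have to be pulled out of it. Below is a minimal sketch using only the standard library, assuming each form field is a leaf element whose tag is the field name. The function name xfa_fields and the sample XML are made up for illustration; real documents may nest or repeat fields differently.

```python
import xml.etree.ElementTree as ET


def xfa_fields(xml_bytes):
    """Map each leaf element's tag (namespace stripped) to its text."""
    root = ET.fromstring(xml_bytes)
    fields = {}
    for element in root.iter():
        if len(element) == 0:  # no children, so treat it as a field
            tag = element.tag.rsplit("}", 1)[-1]
            # A repeated field name overwrites the earlier value.
            fields[tag] = (element.text or "").strip()
    return fields


# Made-up sample standing in for the extracted datasets XML.
sample = (b'<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">'
          b'<xfa:data><form1><Name>Alice</Name><City>Paris</City></form1>'
          b'</xfa:data></xfa:datasets>')
print(xfa_fields(sample))
```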

Download a Google Sites page Content Feed using gdata-python-client

My final goal is to import some data from Google Sites pages.
I'm trying to use gdata-python-client (v2.0.17) to download a specific Content Feed:
self.client = gdata.sites.client.SitesClient(source=SOURCE_APP_NAME)
self.client.client_login(USERNAME, PASSWORD, source=SOURCE_APP_NAME, service=self.client.auth_service)
self.client.site = SITE
self.client.domain = DOMAIN
uri = '%s?path=%s' % (self.client.MakeContentFeedUri(), '[PAGE PATH]')
feed = self.client.GetContentFeed(uri=uri)
entry = feed.entry[0]
...
The resulting entry.content has the page content in XHTML format. But this tree doesn't contain any plain text data from the page, only the HTML page structure and links.
For example my test page has
<div>Some text</div>
The ContentFeed entry has only a div node with text=None.
I have debugged the gdata-python-client request/response and checked the data received from the server in the raw buffer: there is no plain text data in the content. Hence it is a Google API bug.
Maybe there is some workaround? Maybe I can use some common request parameter? What's going wrong here?
This code works for me against a Google Apps domain and gdata 2.0.17:
import atom.data
import gdata.sites.client
import gdata.sites.data
client = gdata.sites.client.SitesClient(source='yourCo-yourAppName-v1', site='examplesite', domain='example.com')
client.ClientLogin('admin@example.com', 'examplepassword', client.source)
uri = '%s?path=%s' % (client.MakeContentFeedUri(), '/home')
feed = client.GetContentFeed(uri=uri)
entry = feed.entry[0]
print entry
Granted, it's pretty much identical to yours, but it might help you prove or disprove something. Good luck!

Pylons: How to write a custom 404 page?

So this question has been asked before, but not answered in great detail.
I want to override the default Pylons error page to make a nicer, custom one. I've got as far as overwriting the controller in error.py, as follows:
def document(self):
    """Render the error document"""
    resp = request.environ.get('pylons.original_response')
    content = literal(resp.body) or cgi.escape(request.GET.get('message', ''))
    custom_error_template = literal("""\
# some brief HTML here
""")
    page = custom_error_template % \
        dict(prefix=request.environ.get('SCRIPT_NAME', ''),
             code=cgi.escape(request.GET.get('code', str(resp.status_int))),
             message=content)
    return page
This works OK. What I'd like to do now is use a template in the templates directory, so that the 404 page can inherit my usual layout template, CSS etc.
(I know this is a bad idea for 500 errors - I'll check in error.py that the code is 404 before I use the template rather than a literal.)
So, here's the question. How do I define custom_error_template to point at my template, rather than at a literal?
You should be able to use the render method: import it from yourapp.lib.base and use return render('/path/to/error/template').
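The check described in the question (template for 404, literal for everything else) can be sketched as below. This is an illustration only: render_error_page, fake_render and the template path are hypothetical, the renderer is passed in as a plain callable standing in for render, and the code is written for Python 3 (html.escape instead of cgi.escape).

```python
from html import escape


def render_error_page(status_int, message, render_template):
    """Use the site template for 404s and a plain literal for other errors."""
    if status_int == 404:
        # Hypothetical template path inside the templates directory.
        return render_template('/error/404.mako')
    # For other errors (for example 500) avoid the template machinery.
    return '<h1>Error {}</h1><p>{}</p>'.format(status_int, escape(message))


def fake_render(template_name):
    """Stand-in for the real render function, used only for this demo."""
    return 'rendered ' + template_name


print(render_error_page(404, 'Not Found', fake_render))
print(render_error_page(500, '<boom>', fake_render))
```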
If you are using Pyramid, just create your view and use the add_notfound_view configuration method to register it.
See: http://docs.pylonsproject.org/docs/pyramid/en/latest/api/config.html?highlight=document%20error#pyramid.config.Configurator.add_notfound_view
