Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 15 days ago.
The community reviewed whether to reopen this question 15 days ago and left it closed:
Original close reason(s) were not resolved
What are good Python-based options to create strictly designed PDF reports from HTML?
I've attached a draft PDF to illustrate the following points:
The report follows a strict design. In other words, "Looks matter".
The report contains complex vector graphics (package: Matplotlib). These may differ slightly in size.
The report contains images.
The report contains a large number of numbers / strings filled in dynamically.
Optimally, the solution would use open source packages.
We create our HTML with Django.
The report may span multiple pages.
There already seem to be quite a few very diverse packages that facilitate reporting. To name a few: xhtml2pdf, weasyprint, django-wkhtmltopdf.
In my experience, it’s easy with these tools to create a PDF from your content. The hard part comes when the PDF needs to fit a highly defined design structure, as in our case. Unfortunately, I was not able to find example PDFs with a highly designed structure for the different PDF generation packages.
What is your experience with this? Which options worked well for you? Are there well done complex examples that I’ve overlooked?
You can see this Python package: weasyprint
Web page: http://weasyprint.org/
Official doc: http://weasyprint.readthedocs.io/en/latest/
It's great because you can generate the PDF from a web page or an HTML file. You may run into conflicts with some CSS (the limitations are specified in the documentation), but it provides what you need.
I recently used weasyprint and jinja to do automated report generation from html. It worked well and I believe would be capable of meeting your strict format requirements. I haven't used any of the others though.
My report had images, including graphs converted to images, normal dynamically generated text, as well as large tables. All of this was constrained to a 9x11 page size. Weasyprint does a good job of pagination automatically, but is also configurable in that regard.
I found this guide to be very useful:
http://pbpython.com/pdf-reports.html
That said, I think pandas is total overkill for HTML generation of graphs, and you lose a lot of configurability by using it.
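A minimal sketch of the HTML-to-PDF flow described above, for illustration only. It uses the standard library's `string.Template` as a stand-in for Jinja or Django templates, and the template, field names, A4 page size, and margins are all assumptions rather than anything from the original report. The only WeasyPrint call used is `HTML(string=..., base_url=...).write_pdf(...)`, imported lazily so the HTML part can be tried without WeasyPrint installed.

```python
import html
from string import Template

# Hypothetical report template. Page geometry lives in CSS (@page), which is
# how a fixed layout is usually pinned down for paged output.
REPORT_TEMPLATE = Template("""\
<html>
<head>
<style>
  @page { size: A4; margin: 20mm; }      /* assumed page size and margins */
  img.chart { width: 160mm; }            /* fix chart width on the page */
  table { page-break-inside: avoid; }    /* keep the table on one page */
</style>
</head>
<body>
  <h1>$title</h1>
  <img class="chart" src="$chart_path">
  <table>$rows</table>
</body>
</html>
""")


def render_report(title, chart_path, values):
    """Fill the template with escaped, dynamically supplied values."""
    rows = "".join(
        f"<tr><td>{html.escape(str(key))}</td><td>{html.escape(str(value))}</td></tr>"
        for key, value in values.items()
    )
    return REPORT_TEMPLATE.substitute(
        title=html.escape(title),
        chart_path=html.escape(chart_path),
        rows=rows,
    )


def write_pdf(html_text, out_path, base_url="."):
    """Render the HTML string to a PDF file with WeasyPrint."""
    from weasyprint import HTML  # imported lazily; requires WeasyPrint

    # base_url lets relative paths such as the chart image resolve.
    HTML(string=html_text, base_url=base_url).write_pdf(out_path)
```

Usage would look like `write_pdf(render_report("Monthly report", "chart.svg", {"Revenue": 1200}), "report.pdf")`, with the Matplotlib figure saved to `chart.svg` beforehand.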
Closed 7 years ago.
I'm wondering if some of the automatic filtering and subsetting tools that are built into Microsoft Excel have equivalents in any package in Python or R.
For instance, let's say I want to build a tool to filter job candidates by various characteristics (for a non-technical user of the tool). In Excel, I can hit the filtering button and then automatically start subsetting a spreadsheet using multiple-choice lists, numeric ranges, free text search, etc:
I know Shiny apps in R allow you to build interactive dashboards, but (a) they don't automatically discern the type of every column of your data set, and (b) I've found the Shiny reactive triggers to be a little glitchy when repeatedly subsetting the same data.
Again, this tool is intended to be used by a non-technical user, who is trying to narrow a large data set down to a set of matching entries using filtering.
EDIT: I've just been told about the DataTables library -- specifically DT: An R interface to the DataTables library, but it also can be used in Python. I'm still curious if there are better packages out there, but this one seems like the most likely candidate.
RStudio has this ability. If you view a data set (click it in the Environment window), there's an option to "Filter".
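For reference, the Excel-style behaviour described in the question (guessing each column's type, then filtering by multiple-choice lists, numeric ranges, and free-text search) can be sketched in plain Python. This is only an illustration of the logic such a tool needs, not a replacement for DataTables or RStudio's viewer. The function names and the "few distinct values means categorical" heuristic are assumptions.

```python
def infer_column_type(values):
    """Guess whether a column is 'numeric', 'categorical', or 'text'."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "text"
    try:
        for value in non_empty:
            float(value)
        return "numeric"
    except (TypeError, ValueError):
        pass
    # Assumed heuristic: few distinct values relative to the row count
    # suggests a multiple-choice (categorical) column.
    if len(set(non_empty)) <= max(1, len(non_empty) // 2):
        return "categorical"
    return "text"


def apply_filters(rows, choices=None, ranges=None, search=None):
    """Keep the rows (dicts) that pass every active filter.

    choices: {column: set of allowed values}
    ranges:  {column: (low, high)}, inclusive numeric bounds
    search:  {column: substring}, case-insensitive
    """
    choices, ranges, search = choices or {}, ranges or {}, search or {}
    kept = []
    for row in rows:
        if any(row[col] not in allowed for col, allowed in choices.items()):
            continue
        if any(not (low <= float(row[col]) <= high)
               for col, (low, high) in ranges.items()):
            continue
        if any(needle.lower() not in str(row[col]).lower()
               for col, needle in search.items()):
            continue
        kept.append(row)
    return kept
```

A UI layer would call `infer_column_type` once per column to decide which widget to show, then pass the user's selections to `apply_filters`.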
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
I'm working on a project to analyze and create PDF files.
I'm almost done with the analysis part, but I have now reached the part where I need to create PDF files.
Is there any library or tool in Python I could use to create PDFs and also to embed actions within them?
There are several libraries that can create PDF files for different purposes. A couple of examples.
The matplotlib library can generate plots in PDF format.
The Python binding to the cairo graphics library, pycairo. Very good for drawing, but less suited for typesetting.
Reportlab, written in pure Python.
Not really a library, but definitely an option;
For high-quality typesetting, you can use Python to generate (La)TeX source code, which can be converted into PDF with pdftex. This requires that you have a TeX distribution installed though.
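A rough sketch of the generate-LaTeX-then-compile approach, under a few assumptions: the helper names are made up for illustration, the document is a bare `article`, and a `pdflatex` command (the LaTeX front end to pdfTeX) is available on the PATH.

```python
import subprocess
from pathlib import Path

# Characters that have special meaning in LaTeX and need escaping in plain text.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters, one character at a time."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def build_latex(title, paragraphs):
    """Return the LaTeX source of a minimal one-section document."""
    body = "\n\n".join(escape_latex(p) for p in paragraphs)
    return "\n".join([
        r"\documentclass{article}",
        r"\begin{document}",
        r"\section*{" + escape_latex(title) + "}",
        body,
        r"\end{document}",
        "",
    ])


def compile_pdf(tex_source, workdir, jobname="report"):
    """Write the source to workdir and run pdflatex on it (must be installed)."""
    workdir = Path(workdir)
    tex_path = workdir / f"{jobname}.tex"
    tex_path.write_text(tex_source, encoding="utf-8")
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", tex_path.name],
        cwd=workdir,
        check=True,  # raise if LaTeX reports an error
    )
    return workdir / f"{jobname}.pdf"
```

Escaping matters here because analysis results often contain `%`, `_`, or `&`, which would otherwise break the LaTeX run.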
One way to embed actions in PDF files is to use JavaScript. E.g. this TUGboat article describes how to include JavaScript in PDF files generated with LaTeX.
Note however that not all PDF viewers support this.
Closed 5 years ago.
Hi all, I want a web-based GUI testing tool. I found dogtail, which is written in Python, but I could not find any good tutorials or examples to get further. Please advise whether dogtail is a good fit or whether there is something better in Python. If so, please share docs and examples.
My requirement:
A DVR continuously shows live video on a 4 x 4 grid of tiles, and the GUI is web-based (Mozilla). I should be able to swap videos, check the log, and compare the actual result with the presented one.
Selenium is designed exactly for this. It allows you to control the browser from Python and check whether things are as expected (e.g., check if a specific element exists, submit a form, etc.).
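A minimal sketch of such a check for the 4 x 4 tile grid in the question. The URL, the `video` selector, and the function names are assumptions, since the DVR page's actual markup is unknown. Selenium is imported lazily, so the checking logic can be exercised with any object that offers `get` and `find_elements`.

```python
def count_elements(driver, css_selector):
    """Return how many elements on the current page match a CSS selector."""
    # "css selector" is the locator strategy string that Selenium's
    # By.CSS_SELECTOR constant stands for.
    return len(driver.find_elements("css selector", css_selector))


def check_tile_grid(driver, url, expected_tiles=16, tile_selector="video"):
    """Open the page and compare the number of tiles with the expected count.

    Returns (ok, found) so the caller can log the actual number.
    """
    driver.get(url)
    found = count_elements(driver, tile_selector)
    return found == expected_tiles, found


def run_against_firefox(url):
    """Run the check in a real Firefox session (needs selenium and Firefox)."""
    from selenium import webdriver  # imported lazily

    driver = webdriver.Firefox()
    try:
        ok, found = check_tile_grid(driver, url)
        print(f"tiles found: {found}, as expected: {ok}")
        return ok
    finally:
        driver.quit()  # always close the browser
```

Calling `run_against_firefox(...)` with the DVR's address would open the page and report whether 16 tiles were found.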
There's some more examples in the documentation
Project Sikuli is a similar tool, but is more general than just web-browsers
Selenium provides a python interface rather than just record your mouse movements, see http://selenium-python.readthedocs.org/en/latest/api.html
If you need to check your video frames, you can record them locally and OCR the frames, looking for some expected text or timecode.
For simple form-based UI testing, I have created a framework using python/selenium/phantomjs, although it can do complex stuff too. I have yet to document it. (If you don't need to run firefox you don't need to install phantomjs.)
https://github.com/manav148/PyUIT
Closed 8 years ago.
I stumbled upon the wikidump python library, which I think suits me just fine.
I could get by with reading the source code, but I'm new to Python and I don't want to write BS code, as the project I need it for is kind of important to me.
I got the 'wiki-SPECIFICDATE-pages-articles.xml.bz2' file and I need to use it as my source for fetching single articles. Can anyone give me some pointers on how to do this properly or, even better, point me to some documentation? I couldn't find any!
(p.s. if you got any better and properly doc'd lib, please tell me)
Not sure if I understand the question, but if you have the Wikipedia dump and you need to parse the wikicode, I would suggest mwparserfromhell lib.
Another powerful framework is Pywikibot, the long-standing framework for bot users on Wikipedia (thus, it has many scripts dedicated to writing pages rather than reading and parsing articles). It has a lot of documentation (though sometimes obsolete) and it uses the MediaWiki API.
You can use them both, of course: PWB for fetching articles and mwparserfromhell for parsing.
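If the article has to come from the local dump file rather than the API, a standard-library sketch like the following can stream through the compressed XML and pull out one page. It assumes the usual export layout (`<page>` elements containing a `<title>` and a `<revision>` with a `<text>` child). The function names are made up for illustration.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def fetch_article(dump_path, wanted_title):
    """Return the wikitext of the page titled wanted_title, or None."""
    with bz2.open(dump_path, "rb") as stream:
        # iterparse yields each element once its closing tag has been read,
        # so the whole dump never has to be parsed in one go.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title = text = None
            for child in elem.iter():
                name = local_name(child.tag)
                if name == "title":
                    title = child.text
                elif name == "text":
                    text = child.text
            if title == wanted_title:
                return text or ""
            elem.clear()  # discard the page's contents to limit memory use
    return None
```

This is a linear scan, so it will be slow on a full dump. The returned wikitext could then be handed to `mwparserfromhell.parse(...)` for parsing.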
Closed 7 years ago.
I am interested in doing some snail mail based surveys but I am looking for quick ways to digitize the surveys they send back.
So if I had a question with 5 boxes beneath it, where you would indicate your opinion by checking the appropriate box, does anything exist where I could scan it and run it through a piece of software that spits out the responses?
Edit clarification:
I am inquiring about what I need to do after the paper has been digitized. I want to write some code that looks at an image file, recognizes which box has been marked, and outputs a representation of the respondent's answers.
I would be looking at a page scanned from a desktop scanner or something similar.
From what I see, you don't really need ICR (intelligent character recognition, used for handwritten and handprinted texts). What you need is OMR: optical mark recognition (capturing human-marked data from document forms such as surveys and tests).
The bad news is that you will hardly find an open-source library for Python. But there's a solution: you can use a cloud SDK, a website that lets you upload an image and sends back the OCR'ed data. Try www.ocrsdk.com, a cloud-based OCR SDK recently launched by ABBYY. It's now in closed beta, so it's completely free to use.
It has both ICR and OMR api methods and a set of python code samples.
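To give an idea of what the mark-recognition step involves, here is a bare-bones sketch in plain Python. It assumes the scan is already loaded as rows of 0-255 grayscale values and that the pixel coordinates of each checkbox are known (for example, from the form's layout). The function names and thresholds are illustrative guesses, not values from any OMR product.

```python
def fill_ratio(image, box, dark_threshold=128):
    """Fraction of pixels inside box that are darker than dark_threshold.

    image is a list of rows of 0-255 grayscale values; box is
    (left, top, right, bottom) with right and bottom exclusive.
    """
    left, top, right, bottom = box
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    return sum(p < dark_threshold for p in pixels) / len(pixels)


def marked_box(image, boxes, min_ratio=0.2):
    """Index of the most heavily filled box, or None if none reaches min_ratio."""
    ratios = [fill_ratio(image, box) for box in boxes]
    best = max(range(len(boxes)), key=ratios.__getitem__)
    return best if ratios[best] >= min_ratio else None
```

A real scan would also need deskewing and alignment before the box coordinates can be trusted, and the printed box outline itself contributes some dark pixels, which is why `min_ratio` is left as a tunable parameter.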
I don't really see what this has to do with python, unless of course you've already digitized the results and are now looking to tally up the results. It sounds like you still need to scan the results in and as far as I know, python doesn't have any direct capabilities of doing something like that. You're going to have to get your hands on a scanner first, and only then can you use python to read through the data.
The SDAPS project (repo) might be worth a look. It may not handle arbitrary scanned images, as it seems to expect an ODT or LaTeX document at the beginning of the process.
Overview
SDAPS is an open source (GPLv3, LPPL) optical mark recognition (OMR) program. It is
written in python and has an integrated workflow with both LibreOffice and LaTeX to
create questionnaires.
Workflow
With SDAPS you create the questionnaire using either LibreOffice or LaTeX. After this
some processing is done to collect the information about the survey (questions, and
answers) and a printable PDF is created. The filled out questionnaires only need to be
scanned in (example). SDAPS will do the optical mark recognition and can create a PDF
report (example) or export the data. Optionally it is possible to manually correct the
results using a graphical user interface.