Convert CVS/SVN to a Programming Snippets Site - python

I use CVS to maintain all my Python snippets, notes, and C/C++ code. As the hosting provider also provides a public web server, I was thinking that I should automatically convert the CVS repository into a programming snippets website.
cvsweb is not what I mean.
doxygen is for a complete project and for browsing self-referencing code online. I think doxygen is more like a web-based ctags.
I tried rest2web, but it requires that I write restweb headers and that the files be .txt files, which would interfere with the programming language syntax.
An approach I have thought of is:
1) Run source-highlight to create .html pages for all the scripts.
2) Write a script to index those script .html files and create a web page (a rough sketch of this step follows the list).
3) Create the website from those pages.
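A rough sketch of step 2, assuming the highlighted pages land under an htmlout/ directory (the directory and file names are made up):

    import os

    # walk the generated .html files and collect a link for each one
    entries = []
    for root, dirs, files in os.walk("htmlout"):
        for name in sorted(files):
            if name.endswith(".html"):
                path = os.path.join(root, name)
                entries.append('<li><a href="%s">%s</a></li>' % (path, name))

    # write a single index page linking to every highlighted snippet
    with open("index.html", "w") as index:
        index.write("<html><body><ul>\n%s\n</ul></body></html>" % "\n".join(entries))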
Before proceeding, I thought I would discuss it here, in case the members have any suggestions.
What do you do when you want to maintain your snippets and notes in CVS and also auto-generate a good website from them? I like rest2web for converting notes to HTML.

Run Trac on the server linked to the (svn) repository. The Trac wiki can conveniently refer to files and changesets. You get TODO tickets, too.

enscript or pygmentize (part of pygments) can be used to convert code to HTML. You can use a custom header or footer to link to the actual code for download.
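For example, a minimal sketch with pygments (the file names are placeholders):

    from pygments import highlight
    from pygments.lexers import get_lexer_for_filename
    from pygments.formatters import HtmlFormatter

    # read a snippet and render it as a standalone HTML page with line numbers
    with open("snippet.py") as source:
        code = source.read()

    lexer = get_lexer_for_filename("snippet.py")
    formatter = HtmlFormatter(full=True, linenos=True)

    with open("snippet.py.html", "w") as out:
        out.write(highlight(code, lexer, formatter))

A link back to the raw file for download can simply be prepended or appended to the output before writing.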

I finally settled on rest2web. I had to do the following.
Use a separate Python script to recursively copy the files in the CVS checkout to a separate directory (a sketch of this step follows the list).
Add extra index.txt and template.txt files to every directory that I wanted to appear in the website.
The best thing about rest2web is that it supports Python scripting within template.txt, so I just looped over the contents and indexed them in the page.
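A minimal sketch of the copy step, assuming the only thing to skip is the CVS bookkeeping directories (both paths are hypothetical):

    import os
    import shutil

    # copy the CVS working tree into a staging directory, skipping CVS/ metadata;
    # note that copytree requires the destination to not exist yet
    src = os.path.expanduser("~/cvswork")
    dst = os.path.expanduser("~/site-src")
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("CVS"))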
There is still a lot more to do to automate the entire process, e.g. inline viewing of programs and colorization, which I think can be done with a few more trials.
I have the completed website here; it is called uthcode.

Related

Script to download file without clicking a button

The website "Download the GLEIF Golden Copy and Delta Files" has buttons that download data that I want to retrieve automatically with a Python script. Usually when I want to download a file I use mget or similar, but that will not work here (at least I don't think it will).
For some reason I cannot fathom, the producers of the data seem to want to force one to manually download the files. I really need to automate this to reduce the number of steps for my users (and frankly for me), since there are a great many files in addition to these and I want to automate as many as possible (all of them).
So my question is this: is there some kind of Python package for doing this sort of thing? If not a Python package, is there perhaps some other tool that is useful for it? I have to believe this is a common annoyance.
Yup, you can use BeautifulSoup to scrape the URLs and then download the files with requests.
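A minimal sketch, assuming the data files are linked as .zip downloads (the page URL and the .zip filter are assumptions to check against the real markup):

    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin

    PAGE = "https://www.gleif.org/en/lei-data/gleif-golden-copy/download-the-golden-copy-and-delta-files"

    soup = BeautifulSoup(requests.get(PAGE).text, "html.parser")

    # save every anchor that points at a .zip file; urljoin handles relative hrefs
    for link in soup.find_all("a", href=True):
        href = urljoin(PAGE, link["href"])
        if href.endswith(".zip"):
            name = href.rsplit("/", 1)[-1]
            with open(name, "wb") as f:
                f.write(requests.get(href).content)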

Tableau package for R

I have spent a lot of time searching for an answer to this, but have not found anything yet.
Are there any Tableau packages for R? I would like to be able to create Tableau charts (or a TDE file) as output from my R script. It seems you can run some basic R scripts from within Tableau, but that will not work for my case.
This is one of the few links I have found relevant to what I am looking for:
R-bloggers link
Would it matter if I were trying to go from Python -> Tableau?
While there are options for creating TDEs (as you've already found) and for working with Tableau Server administration, at the moment there are no open interfaces for creating Tableau workbooks from outside Tableau proper. This is currently one of my biggest annoyances with the software.
The closest you're going to get is the Python Tableau Document API (https://github.com/tableau/document-api-python). While that does not allow creating workbooks from scratch, it does have the most robust document-editing capabilities of the various options available (https://community.tableau.com/community/developers). The README on the Python Document API suggests that document creation may be coming, so it sounds like something to watch.
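To illustrate the kind of editing it does support today, a minimal sketch that repoints an existing workbook's connections (the file names and connection values are made up):

    from tableaudocumentapi import Workbook

    # open an existing workbook and repoint its data source connections
    workbook = Workbook("monthly_report.twb")
    for datasource in workbook.datasources:
        for connection in datasource.connections:
            connection.server = "prod-db.example.com"
            connection.dbname = "sales"

    workbook.save_as("monthly_report_prod.twb")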

Can I use doxygen to document a command-line program?

I contribute to a large code project that uses Doxygen to document a series of C libraries. We are also starting to use Doxygen with doxypy for the associated Python modules.
Is there an easy way to document command-line programs (in Python or C), and their command-line options, automatically using Doxygen?
In order to generate man pages, you need to set the GENERATE_MAN tag to YES in your Doxyfile.
By default, a sub-folder named man is created within the directory given by OUTPUT_DIRECTORY to contain the generated pages.
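A minimal Doxyfile excerpt (the output directory name is just an example):

    # enable man page output; pages land in OUTPUT_DIRECTORY/MAN_OUTPUT
    GENERATE_MAN     = YES
    MAN_OUTPUT       = man
    OUTPUT_DIRECTORY = docs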
By doing that, doxygen will render all the markup you added to the source code as a man page (one page for each translation unit).
At this point, you might want to exclude the parts you want to ignore (I assume you are interested in showing only how to call main) using the EXCLUDE* options.
I advise you to maintain two different Doxyfiles: one for internal usage (complete javadoc-like documentation), the other for producing the program's man page and the like.
Of course, you will not get the expected result on the first try, and you might need to play with the doxygen markup a bit.

migrating data from tomcat .dbx files

I want to migrate data from an old Tomcat/Jetty website to a new one which runs on Python & Django. Ideally I would like to populate the new website by directly reading the data from the old database and storing them in the new one.
Problem is that the database I was given comes in the form of a bunch of WEB-INF/data/*.dbx files, and I didn't find any way to read them. So, I have a few questions.
Which format do the WEB-INF/data/*.dbx files use?
Is there a Python module for directly reading the WEB-INF/data/*.dbx files?
Is there some external tool for dumping the WEB-INF/data/*.dbx files to an ASCII format that is parsable by Python?
If someone has attempted a similar data migration, how does it compare against scraping the data from the old website? (assuming that all important data can be scraped)
Thanks!
The ".dbx" suffix has been used by various programs over the years, so it could be almost anything. The only way to know what you really have is to browse the source code of the legacy Java app (or the relevant docs, or ask the author, etc.).
As for scraping, it's probably going to be a lot of pain for not much result, depending on the app.

Output PCL from Word document using Python

I'm building a web application which will include functionality that takes MS Word documents (and possibly input from a web-based rich text editor), substitutes values into the form-field placeholders in those documents, and generates a PCL document as output.
I'm developing in Python and Django on Windows, but this whole solution will need to be deployed to a web host (yet to be chosen), which in practice means that the solution will need to run on Linux.
I'm open to Linux-only solutions if that's the only way. I'm open to solutions that involve talking to a server written in another language. I am able to write C++ or Java if necessary to get this done. The final output does have to be in PCL format.
My question is: what is a good toolchain for generating PCL from Word documents using Python?
I am considering using some kind of interface to OpenOffice to open the Word documents, do the substitutions, and send the output to some kind of printer driver. Does anyone have experience with this? What libraries would you recommend?
Options for interfacing that I have identified include the following; any other suggestions would be greatly welcomed:
Ulif.openoffice: http://pypi.python.org/pypi/ulif.openoffice/0.4
Py3o.renderserver: https://bitbucket.org/faide/py3o.renderserver
OpenOffice-python: http://openoffice-python.origo.ethz.ch/
A second approach would be to use something like paradocx (https://bitbucket.org/yougov/paradocx/wiki/Home) to open the Word files, do the substitutions in Python, and then somehow interface with something that can output PCL. Again, any experience or comments on this approach would be appreciated.
I will very much appreciate any comments on tools and toolchains, and ideas or recipes that you may have.
This question covers similar ground to, but is not the same as: How to Create PCL file from MS word
Ghostscript can read PS (PostScript) or PDF and create PCL. You can use Python libraries or just subprocess....
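A minimal sketch shelling out to Ghostscript, assuming gs is on the PATH and the file names are placeholders:

    import subprocess

    # convert a PDF to monochrome PCL XL; use "pxlcolor" for color output
    subprocess.check_call([
        "gs", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=pxlmono",
        "-sOutputFile=output.pcl",
        "input.pdf",
    ])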
OK, so my final solution involved creating a Java web service to perform my transcoding.
Docx4j provides a class org.docx4j.convert.out.pdf.viaXSLFO.Conversion, which hooks into Apache FOP to convert .docx to PDF; that can be easily hacked to convert to PCL instead (because FOP can output PCL).
Spark is a lightweight Java web framework which allowed me to wrap my transcoder in a web service.
Because I also manipulate the document, I need to pass some metadata along with it, so the perfect fit is a multipart form. I decode that using Apache Commons FileUpload.
In almost all cases, I had to upgrade to the development versions of libraries to get this to work.
On the Python side I use:
requests to communicate with the web service
poster to prepare the multi-part request
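For illustration, a rough sketch of the client call; here requests builds the multipart body itself via files= rather than using poster, and the service URL and field names are made up (4567 is Spark's default port):

    import requests

    # post the document plus metadata to the hypothetical transcoding service
    with open("template.docx", "rb") as doc:
        response = requests.post(
            "http://localhost:4567/transcode",
            files={"document": ("template.docx", doc)},
            data={"output_format": "pcl"},
        )

    # the response body is the transcoded PCL document
    with open("output.pcl", "wb") as out:
        out.write(response.content)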
