Transforming DTDs with Python [closed] - python

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I'm looking for a library to help me parse and transform DTDs using Python. The only thing I have found so far is xmlproc, but that seems ancient and doesn't seem to support serialization of DTDs. There's this for Java but I'd prefer a Python solution.
Edit: by "serialization" of DTDs I mean that ideally I'd like to be able to parse the DTD to some kind of Python structure, operate on that structure and then write out the result back to a DTD.

I don't know of an end-to-end processor for DTDs, but I rarely use DTDs at all, so that's not surprising.
Amara can parse DTDs, but I don't know what level of access you get to them or whether the results can be serialized. I assume they can, but I haven't verified that. libxml2, which is available in Python through lxml, is something else to investigate, but I have even less experience with it. It seems from the libxml documentation that you would have access to the full DTD.
Another possibility is to convert the DTD to XSD with one of the many available programs, manipulate the tree with a regular XML processor, and then convert it back to a DTD. I worry about how lossy that round trip might be.
At an increasing level of difficulty, if you're going to write a parser yourself for the DTD grammar, consider PyParsing or PLY.
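Before reaching for a hand-written parser, it may be worth noting that the expat bindings in the standard library report element and attribute-list declarations. The following is only a minimal sketch of the parse, modify, and write-back cycle asked about: the helper names (parse_dtd, model_to_str, serialize_dtd) and the sample DTD are made up for illustration, the DTD text is wrapped in an internal subset as a workaround, and entities, notations, comments, and parameter entities are not handled.

```python
from xml.parsers import expat
from xml.parsers.expat import model

# Map expat quantifier codes to DTD suffixes.
QUANT = {
    model.XML_CQUANT_NONE: "",
    model.XML_CQUANT_OPT: "?",
    model.XML_CQUANT_REP: "*",
    model.XML_CQUANT_PLUS: "+",
}


def parse_dtd(dtd_text, root="root"):
    """Collect element and attribute-list declarations (hypothetical helper)."""
    elements = {}  # element name -> expat content model tuple
    attlists = []  # (element, attribute, type, default, required)

    def on_element(name, content_model):
        elements[name] = content_model

    def on_attlist(elname, attname, type_, default, required):
        attlists.append((elname, attname, type_, default, required))

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element
    parser.AttlistDeclHandler = on_attlist
    # Assumption: the declarations are valid inside an internal subset,
    # so we wrap them in a dummy document that expat can parse.
    parser.Parse("<!DOCTYPE %s [\n%s\n]>\n<%s/>" % (root, dtd_text, root), True)
    return elements, attlists


def model_to_str(content_model):
    """Turn an expat (type, quantifier, name, children) tuple back into DTD syntax."""
    ctype, quant, name, children = content_model
    if ctype == model.XML_CTYPE_EMPTY:
        return "EMPTY"
    if ctype == model.XML_CTYPE_ANY:
        return "ANY"
    if ctype == model.XML_CTYPE_NAME:
        return name + QUANT[quant]
    if ctype == model.XML_CTYPE_MIXED:
        names = ["#PCDATA"] + [child[2] for child in children]
        return "(" + "|".join(names) + ")" + QUANT[quant]
    sep = "," if ctype == model.XML_CTYPE_SEQ else "|"
    return "(" + sep.join(model_to_str(c) for c in children) + ")" + QUANT[quant]


def serialize_dtd(elements, attlists):
    """Write the collected declarations back out as DTD text."""
    lines = ["<!ELEMENT %s %s>" % (n, model_to_str(m)) for n, m in elements.items()]
    for elname, attname, type_, default, required in attlists:
        if default is None:
            decl = "#REQUIRED" if required else "#IMPLIED"
        else:
            decl = ('#FIXED "%s"' if required else '"%s"') % default
        lines.append("<!ATTLIST %s %s %s %s>" % (elname, attname, type_, decl))
    return "\n".join(lines)


SAMPLE = """<!ELEMENT note (to, body?)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ATTLIST note lang CDATA #IMPLIED>"""

elements, attlists = parse_dtd(SAMPLE)
attlists.append(("note", "id", "ID", None, 1))  # example transformation
print(serialize_dtd(elements, attlists))
```

Because the intermediate structure is just a dict and a list, any transformation is ordinary Python. Whether this is enough depends on how much of the DTD syntax your files actually use.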

You might want to consider converting your DTD to one of the XML-based formats. At that point, you can process it with ElementTree, or whatever XML toolkit you prefer.
I've had good experience with RelaxNG, which is fairly concise and straightforward. There's a list of conversion tools on its site: http://relaxng.org/#conversion
If you prefer XML Schema, here's what is available: http://www.w3.org/XML/Schema
If you're dealing with third-party documents or DTDs, this may not work for you. If it's in-house, give it a shot. XML-based schemas are much more pleasant to work with.
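To give an idea of what "process it with ElementTree" could look like once the DTD has been converted, here is a small sketch that renames an element in a RelaxNG schema. The schema fragment and the rename_element helper are invented for illustration; a real converted schema would likely be larger and use more RelaxNG constructs.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
# Serialize RelaxNG elements without a generated "ns0:" prefix.
ET.register_namespace("", RNG_NS)

# Made-up schema fragment standing in for a converted DTD.
SCHEMA = """<element name="note" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="to"><text/></element>
  <optional><element name="body"><text/></element></optional>
</element>"""


def rename_element(root, old, new):
    """Rename every <element name="old"> definition; return how many changed."""
    changed = 0
    for el in root.iter("{%s}element" % RNG_NS):
        if el.get("name") == old:
            el.set("name", new)
            changed += 1
    return changed


root = ET.fromstring(SCHEMA)
renamed = rename_element(root, "body", "content")
result = ET.tostring(root, encoding="unicode")
print(result)
```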

Related

Audio Manipulation [closed]

Closed 8 years ago.
I'm looking for a Python library that will allow me to record, manipulate, and merge audio files. Most of the ones I've seen don't support Windows and/or are outdated. Does anyone have any suggestions for libraries, or for how these functions could be implemented with the Python standard library?
Recording and Manipulating are generally different problems. For both of these, I stick with .wav file formats since (at least in their simpler forms) they are basically just the raw data with a minimal header and are easy to work with.
Recording: I use pyaudio, which provides bindings to the portaudio library.
Manipulation: For simple things I use audioop, which is included in the basic Python installation, and for more complex things I go straight to scipy (which can read many .wav files with scipy.io.wavfile.read) and then manipulate the data like any other time-series data. scipy is powerful and fast, but it doesn't offer many audio-specific tools, nor does it use audio-specific terminology.
There are other, less well-established options, such as Snack, Audiere, and AudioLazy. These are tools I've heard of but never used, and I don't know which are still available or how actively they are developed.
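For the merging part of the question, the standard-library wave module may be enough when the files are plain PCM .wav files. The sketch below is only an illustration: make_tone and concat_wavs are hypothetical helpers, the tone generator exists just so the example has input, and the code assumes all inputs share the same channel count, sample width, and sample rate. Note that audioop was removed from the standard library in Python 3.13, so this sticks to wave and struct.

```python
import io
import math
import struct
import wave


def make_tone(freq, seconds, rate=8000):
    """Return 16-bit mono WAV bytes holding a sine tone (test input only)."""
    n = round(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def concat_wavs(wav_blobs):
    """Join WAV byte strings end to end; all must share the same format."""
    if not wav_blobs:
        raise ValueError("nothing to concatenate")
    out = io.BytesIO()
    fmt = None
    with wave.open(out, "wb") as w:
        for blob in wav_blobs:
            with wave.open(io.BytesIO(blob), "rb") as r:
                this_fmt = (r.getnchannels(), r.getsampwidth(), r.getframerate())
                if fmt is None:
                    fmt = this_fmt
                    w.setnchannels(fmt[0])
                    w.setsampwidth(fmt[1])
                    w.setframerate(fmt[2])
                elif this_fmt != fmt:
                    raise ValueError("WAV formats differ: %r vs %r" % (fmt, this_fmt))
                w.writeframes(r.readframes(r.getnframes()))
    return out.getvalue()


merged = concat_wavs([make_tone(440, 0.1), make_tone(880, 0.2)])
with wave.open(io.BytesIO(merged), "rb") as r:
    total_frames = r.getnframes()
print(total_frames)
```

With real files you would pass file names to wave.open instead of BytesIO objects. Mixing (overlaying) two signals rather than concatenating them needs sample-level arithmetic, which is where scipy becomes convenient.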

Looking for example or documentation for wikidump python lib [closed]

Closed 8 years ago.
I stumbled upon the wikidump python library, which I think suits me just fine.
I could get by by looking at the source code, but I'm new at python and I don't want to write BS code as the project I need it for is kind of important to me.
I got the 'wiki-SPECIFICDATE-pages-articles.xml.bz2' file and I would need to use that as my source for single article fetching. Can anyone give me some pointers as to properly achieve this or, even better, point at some documentation? I couldn't find any!
(p.s. if you got any better and properly doc'd lib, please tell me)
Not sure if I understand the question, but if you have the Wikipedia dump and you need to parse the wikicode, I would suggest the mwparserfromhell library.
Another powerful framework is Pywikibot, the long-standing framework for bot users on Wikipedia (thus, it has many scripts dedicated to writing pages rather than reading and parsing articles). It has a lot of documentation (though it is sometimes obsolete), and it uses the MediaWiki API.
You can use them both, of course: PWB for fetching articles and mwparserfromhell for parsing.
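If the goal is only to pull one article out of the local pages-articles dump, a streaming pass with the standard library may already be enough, and the returned wikitext can then be handed to a parser such as mwparserfromhell. This is a rough sketch, not a documented recipe: find_article is a hypothetical helper, and it assumes the usual dump layout of page elements containing a title and a revision/text element. The tiny in-memory sample stands in for the real .xml.bz2 file.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, which differs between dump versions."""
    return tag.rsplit("}", 1)[-1]


def find_article(stream, wanted_title):
    """Return the wikitext of the first page with this title, or None."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = text = None
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text
        if title == wanted_title:
            return text or ""
        elem.clear()  # drop the finished page's contents to limit memory use
    return None


# Made-up miniature dump, compressed the same way as the real file.
SAMPLE = (
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    b"<page><title>Foo</title><revision><text>foo text</text></revision></page>"
    b"<page><title>Bar</title><revision><text>[[bar]] text</text></revision></page>"
    b"</mediawiki>"
)

with bz2.open(io.BytesIO(bz2.compress(SAMPLE)), "rb") as dump:
    article = find_article(dump, "Bar")
print(article)
```

For the real dump, bz2.open can be given the file name directly. A full scan of a large dump is slow, so for repeated lookups an index would be worth building.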

python twitter libraries - which one to use? [closed]

Closed 8 years ago.
I want to extract information from Twitter using Python. There is a page with several twitter libraries:
https://dev.twitter.com/docs/twitter-libraries
Now I am not sure which one to use. Does anyone have experience with them? I tried Twython and it is easy to handle, but are the others worth a try too?
Tweepy is worth a go, as it quite conveniently wraps the User, Tweet, Status, etc. into easily accessible Python objects. The other one I've used is plain python-twitter, which does make interfacing easier, but is closer to bare-bones JSON queries, and you have to remember the correct URLs to query. IIRC, both support standard interactions and the ability to use the tracking and searching engines.
Have a look at http://pypi.python.org/pypi?%3Aaction=search&term=twitter&submit=search for other options, but it really depends on what you want to do and how. I've had success with the aforementioned, but I can't recommend any ideal Python Twitter library.
I have spent the past few weeks working with Twitter-Python -- the one hosted through Google. I really like the library, although I have not used tweepy or twython. I am pretty new to Python and I have found Twitter-Python to be very intuitive and easy to use -- the library has tons of methods to return pretty much any data point you want. It can be a bit daunting since it's a lot of code, but once you get familiar with the library I think it's really simple to use. As the other users have noted, though, Twitter-Python doesn't have a streaming API (as far as I know), so if you need that, use tweepy.
Depends on what exactly you want to do. Twython is good and is regularly updated as well. Besides that, Python Tweeter should take care of most of the stuff you would want to achieve. Plus, it's easy and fun to use.
If you need some sort of streaming API support, Tweepy is what you should go for.

Python test "framework" for testing input files and their outputs [closed]

Closed 6 years ago.
I'm writing end-to-end tests for my tool, which is written in Python. The tool reads a file as input. I want to test its exit code, and its output.
This is a fairly common idiom, and I've seen it done in several ways. In the PHP project, each test is a file, and has lines like: INPUT:, EXPECTED:, EXPECTED_REGEX:, etc. In my own phc project, each file is a normal source file, but with a comment added to the top, which includes keywords like EXPECTED. I think I had copied that off gcc which uses a much more complex tool written in tcl.
Are there frameworks, libraries, etc, that do this in Python? It should:
read the source file
parse special keywords (or similar) corresponding to expected output, exit code, words/regexes it expects to find or not find,
check that the output is correct.
While it doesn't seem hard in theory, I recall lots of edge-cases (esp involving escaping) when implementing this before, and would rather not reinvent the wheel.
Robot Framework might be helpful. It is a keyword-driven functional testing tool implemented in Python, and it can be extended with Python or Java.
see: http://robotframework.googlecode.com/svn/tags/robotframework-2.5.4/doc/userguide/RobotFrameworkUserGuide.html
There are a number of built-in libraries that you might be able to apply to your problem, including an OperatingSystem library for working with files etc. and a String library for working with strings:
http://robotframework.googlecode.com/svn/tags/robotframework-2.5.4/doc/userguide/RobotFrameworkUserGuide.html#standard-libraries
There is also the ScriptTest library by Ian Bicking: http://pythonpaste.org/scripttest/
Since the implementation of file I/O is system dependent, why not mock out the file reading and writing using StringIO:
http://docs.python.org/library/stringio.html
and then test the bulk of the logic (reading from a file, doing some stuff, writing to a file) in python?
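As a rough illustration of that idea, the sketch below feeds an in-memory file to a made-up function and inspects what it wrote. count_lines is purely hypothetical and stands in for whatever the tool's core logic is; in Python 3 the StringIO class lives in the io module.

```python
import io


def count_lines(infile, outfile):
    """Hypothetical tool logic: report the number of non-blank input lines."""
    n = sum(1 for line in infile if line.strip())
    outfile.write("%d\n" % n)
    return 0  # exit status the command-line wrapper would use


# In-memory stand-ins for the real input and output files.
fake_in = io.StringIO("first\n\nsecond\n")
fake_out = io.StringIO()
status = count_lines(fake_in, fake_out)
print(status, repr(fake_out.getvalue()))
```

This only works if the tool's logic accepts file objects rather than opening paths itself, which is a design choice worth making for testability.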
Then, perhaps you could have one end-to-end test for basic sanity: a separate Python file that calls the script as another process, using the commands module or something similar:
http://docs.python.org/library/commands.html
Using that you could get both the output, and the status.
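The commands module was removed in Python 3, where subprocess is the usual replacement. Here is a minimal sketch of an end-to-end check driven by keywords in the input file, along the lines described in the question. The EXPECTED and EXPECTED_REGEX keywords come from the question; the "#" comment prefix, the EXIT keyword, the helper names, and passing the file on stdin are all assumptions made for the example, and the inline child script merely stands in for the real tool.

```python
import re
import subprocess
import sys


def parse_expectations(source_text):
    """Collect '# EXPECTED:', '# EXPECTED_REGEX:' and '# EXIT:' header comments."""
    exp = {"EXPECTED": [], "EXPECTED_REGEX": [], "EXIT": 0}
    for line in source_text.splitlines():
        m = re.match(r"#\s*(EXPECTED_REGEX|EXPECTED|EXIT):\s*(.*)$", line)
        if not m:
            continue
        key, value = m.groups()
        if key == "EXIT":
            exp["EXIT"] = int(value)
        else:
            exp[key].append(value)
    return exp


def run_and_check(cmd, source_text):
    """Run cmd with source_text on stdin; return a list of failure messages."""
    exp = parse_expectations(source_text)
    proc = subprocess.run(cmd, input=source_text, capture_output=True, text=True)
    failures = []
    if proc.returncode != exp["EXIT"]:
        failures.append("exit code %d, expected %d" % (proc.returncode, exp["EXIT"]))
    for needle in exp["EXPECTED"]:
        if needle not in proc.stdout:
            failures.append("missing output: %r" % needle)
    for pattern in exp["EXPECTED_REGEX"]:
        if not re.search(pattern, proc.stdout):
            failures.append("no match for regex: %r" % pattern)
    return failures


# Stand-in for the real tool: reads stdin, prints a line, exits with status 3.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdin.read(); print('hello 42'); sys.exit(3)"]
SOURCE = "# EXPECTED: hello\n# EXPECTED_REGEX: \\d+\n# EXIT: 3\nrest of the input\n"

failures = run_and_check(CHILD, SOURCE)
print(failures)
```

Each test file can then be fed through run_and_check from a unittest or pytest test, with an empty list meaning the file passed.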

What is a good configuration file library for C that's not XML (preferably has Python bindings)? [closed]

Closed 5 years ago.
I am looking for a good config file library for C that is not XML. Ideally, I would like one that also has Python bindings. The best option I have come up with is to use a JSON library in both C and Python. What would you recommend, or what method of reading/writing configuration settings do you prefer?
YAML :)
If you're not married to Python, try Lua. It was originally designed for configuration.
You could use a pure python solution like ConfigObj and then simply use the CPython API to query for settings. This assumes that your application embeds Python. If it doesn't, and if you are shipping Python anyway, it might make sense to just embed it. Your C .exe won't get that much bigger if it's a dynamic link, and you will have all the flexibility of Python at your disposal.
Despite being hated by techies and disowned by Microsoft, INI files are actually quite popular with users, as they are easy to understand and edit. They are also very simple to write parsers for, should your libraries not already support them.
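On the Python side, INI files need no extra dependency, since the standard library ships configparser. A small sketch, with section and key names invented for the example (the C side would still need its own parser or library):

```python
import configparser

# Made-up settings; a real application would call config.read("settings.ini").
INI_TEXT = """
[server]
host = localhost
port = 8080
debug = no
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

host = config.get("server", "host")
port = config.getint("server", "port")          # converted to int
debug = config.getboolean("server", "debug")    # accepts yes/no, on/off, true/false
print(host, port, debug)
```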
