Python library for accessing local wikipedia? - python

I am trying to do some research on the wikipedia data, I am good at Python.
I came across this library, seems nice: https://pypi.python.org/pypi/wikipedia/
I don't want to hit wikipedia directly as this is slow, and also I am trying to access a lot of data and might run into their API limits.
Can I somehow hack this to make it access a local instance of wikipedia data. I know I can run a whole wikipedia server and try to do that, but that seems a round about way.
Is there a way to just point to the folder and get this library to work as it does. Or are you aware of any other libraries that do this?
thank you.

I figured out what I need. I think I shouldn't be searching for API, what I am looking for is a parser. Here are a couple options I have narrowed down so far. Both seem like solid starting points.
wikidump:
https://pypi.python.org/pypi/wikidump/0.1.2
mwlib:
https://pypi.python.org/pypi/mwlib/0.15.14
Update: While these are good parsers for wikipedia data, I found them too limiting in one way or the other, not to mention the lack of documentation. So I eventually went with good old python ElementTree and directly work with the XML.

Related

How to implement my own oauth solution? How would I do it in python?

I've read through multiple articles online, but I can't seem to find any way to properly implement my own oauth solution in python.
I'm using the tornado framework for python and I ended up converting python-oauth2's tornado solution to their most modern API with the hopes of understanding what makes up oauth2 work. I even made a contribution to the github. The biggest problem is that even after reading a large portion of the original oauth documentation, I don't understand everything. It leaves me confused when trying to make a simple oauth log-in.
A place I've looked are:
http://www.slideshare.net/leahculver/implementing-oauth -- Good slides, but since the speaker explained a lot in person of it I'm lost on some generic parts.
If you want to skip the reading, allow me to summarize:
I need a simple, yet well thought out explanation of how O-Auth 2.0 works, and how I could implement my own O-Auth 2.0.
I would like to know a good way to use python to do it.

How does one use Python to access and take data from a DB?

I am specifically referring to Aroon Swartz case, he did a program to access academical DB's such as JSTOR, IEEE, ACM, elsevier, etc. efficiently, any idea of how he did that. I mean what python libraries, or a general algorithm or an explanation of how does it work, or a reference to go deeper on that. Please, don't be shy using advanced concepts.
Thank you very much.
Aroon Swartz didn't really access to JSTOR's DB. He writed a script which find URL of DB's ressources.
You can make that, with urllib2 and pygoogle. Open URL and parse it find what you want.

Posting data to another Site with Django

I would appreciate some help here.
Google checkout has many ways to send it checkout data. I am using the XML server-to-server.
I have everything ready and now I want to throw some xml at google. I have been doing some reading and I know of a couple of ways to do this, one with urllib, another with pyCurl, but I am using django over here and I searched the Django api for some way to POST data to another site and I havent fallen upon anything. I really would like to use the django way, if there is one because I feel it would be more fluid and right, but if you all don't know of any way I will probably use urllib.
urllib2 is the appropriate way to post data if you're looking for python standard library. Django doesn't provide a specific method to do this (as well as it shouldn't). Django goes out of it's way to not simply reinvent tools that already exist in the the standard library (except email...), so you should never really fear using something out of the python standard library.
requests is also great, but not standard library. Nothing wrong with that though.

Does one often use libraries outside the standard ones?

I am trying to learn Python and referencing the documentation for the standard Python library from the Python website, and I was wondering if this was really the only library and documentation I will need or is there more? I do not plan to program advanced 3d graphics or anything advanced at the moment.
Edit:
Thanks very much for the responses, they were very useful. My problem is where to start on a script I have been thinking of. I want to write a script that converts images into a web format but I am not completely sure where to begin. Thanks for any more help you can provide.
For the basics, yes, the standard Python library is probably all you'll need. But as you continue programming in Python, eventually you will need some other library for some task -- for instance, I recently needed to generate a tone at a specific, but differing, frequency for an application, and pyAudiere did the job just right.
A lot of the other libraries out there generate their documentation differently from the core Python style -- it's just visually different, the content is the same. Some only have docstrings, and you'll be best off reading them in a console, perhaps.
Regardless of how the other documentation is generated, get used to looking through the Python APIs to find the functions/classes/methods you need. When the time comes for you to use non-core libraries, you'll know what you want to do, but you'll have to find how to do it.
For the future, it wouldn't hurt to be familiar with C, either. There's a number of Python libraries that are actually just wrappers around C libraries, and the documentation for the Python libraries is just the same as the documentation for the C libraries. PyOpenGL comes to mind, but it's been a while since I've personally used it.
As others have said, it depends on what you're into. The package index at http://pypi.python.org/pypi/ has categories and summaries that are helpful in seeing what other libraries are available for different purposes. (Select "Browse packages" on the left to see the categories.)
One very common library, that should also fit your current needs, is the Python Image Library (PIL).
Note: the latest version is still in beta, and available only at Effbot site.
If you're just beginning, all you'll need to know is the stuff you can get from the Python website. Failing that a quick Google is the fastest way to get (most) Python answers these days.
As you develop your skills and become more advanced, you'll start looking for more exciting things to do, at which point you'll naturally start coming across other libraries (for example, pygame) that you can use for your more advanced projects.
It's very hard to answer this without knowing what you're planning on using Python for. I recommend Dive Into Python as a useful resource for learning Python.
In terms of popular third party frameworks, for web applications there's the Django framework and associated documentation, network stuff there's Twisted ... the list goes on. It really depends on what you're hoping to do!
Assuming that the standard library doesn't provide what we need and we don't have the time, or the knowledge, to implement the code we reuse 3rd party libraries.
This is a common attitude regardless of the programming language.
If there's a chance that someone else ever wanted to do what you want to do, there's a chance that someone created a library for it. A few minutes Googling something like "python image library" will find you what you need, or let you know that someone hasn't created a library for your purposes.

Are there any examples on python-purple floating around?

I want to learn it but I have no idea where to start. Everything out there suggests reading the libpurple source but I don't think I understand enough c to really get a grasp of it.
There isn't much about it yet... the intro, the howto, and the sources (here browsing them online but of course you can git clone them) are about it. In particular, the tiny example client you can get from here does have some miniscule example of use of purple's facilities (definitely not enough, but maybe it can get you started with the help of some 'dir', 'help' and the like...?)
Not sure how much help this will be but based on information from here, it seems like you just install python-purple and import and call the functions as normal Python functions.
Can't help you with a concrete example as I decided to use something else. However, one of the first things I wanted to do after I cloned the repo was remove the ecore dependency. Here's a patch submitted to the mailing list to do just that: https://garage.maemo.org/pipermail/python-purple-devel/2009-March/000000.html
Incidentally, if you're looking for AIM take a look at twisted.words. For Yahoo, trying getting the source for curphoo or zinc (both are console YMSG clients). For GTalk/Jabber, I've had good experiences with xmpppy.

Categories