I'm searching for an API or a program (preferably Python and open source) that lets me download the first n pictures of a Google image search for, say, bicycles. It would also be helpful if it could download the first n .pdf files from a normal search. Since not all pictures and .pdf files can be found through Google, and since there are many other search engines, a program that could also scrape results from Yahoo or Bing would be very convenient. Are there any such programs, or is there an API from Google that lets me do more than 100 searches a day?
Edit: people passing by may want to look at my attempt at programming such a scraper here
According to this post, all Google Search APIs have been deprecated.
However, GoogleScraper, an open-source library, can help you achieve what you intend.
If you want to go barebones and implement this yourself, BeautifulSoup is a very nice library to work with.
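If you go the BeautifulSoup route, a minimal sketch might look like the following. Be aware that Google's result markup changes often and scraping it may violate the Terms of Service; the tbm=isch parameter selects image search, but the User-Agent header and the <img> filtering are assumptions you will likely need to adapt:

import requests
from bs4 import BeautifulSoup

def first_n_image_urls(query, n=10):
    # tbm=isch switches Google Search to image results
    resp = requests.get(
        "https://www.google.com/search",
        params={"q": query, "tbm": "isch"},
        headers={"User-Agent": "Mozilla/5.0"},  # plain clients are often blocked
        timeout=10,
    )
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # Keep only absolute image URLs; some thumbnails are data: URIs instead
    urls = [img["src"] for img in soup.find_all("img")
            if img.get("src", "").startswith("http")]
    return urls[:n]

From there, fetching each URL with requests.get and writing the response bytes to disk gives you the first n pictures.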
I stumbled upon the wikidump Python library, which I think suits me just fine.
I could probably get by by reading the source code, but I'm new to Python and I don't want to write bad code, as the project I need it for is quite important to me.
I got the 'wiki-SPECIFICDATE-pages-articles.xml.bz2' file and I would need to use that as my source for fetching single articles. Can anyone give me some pointers on how to achieve this properly or, even better, point me at some documentation? I couldn't find any!
(P.S. If you know of a better, properly documented library, please tell me.)
Not sure if I understand the question, but if you have the Wikipedia dump and you need to parse the wikicode, I would suggest the mwparserfromhell library.
Another powerful framework is Pywikibot, which is the historic framework for bot users on Wikipedia (thus, it has many scripts dedicated to writing pages rather than reading and parsing articles). It has a lot of documentation (though sometimes obsolete) and it uses the MediaWiki API.
You can use them both, of course: Pywikibot for fetching articles and mwparserfromhell for parsing; a sketch of reading articles straight from your dump follows below.
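Since you already have the pages-articles dump, you can also stream it with the standard library and hand each revision's text to mwparserfromhell. A minimal sketch, assuming MediaWiki export format 0.10 (check the xmlns on your dump's root element and adjust the namespace constant if it differs):

import bz2
import xml.etree.ElementTree as ET
import mwparserfromhell

NS = "{http://www.mediawiki.org/xml/export-0.10/}"

def iter_articles(dump_path):
    # Stream the dump page by page so memory usage stays flat
    with bz2.open(dump_path, "rb") as f:
        for _, elem in ET.iterparse(f):
            if elem.tag == NS + "page":
                title = elem.findtext(NS + "title")
                text = elem.findtext(NS + "revision/" + NS + "text") or ""
                yield title, mwparserfromhell.parse(text)
                elem.clear()  # discard the processed subtree

for title, code in iter_articles("wiki-SPECIFICDATE-pages-articles.xml.bz2"):
    print(title, code.strip_code()[:80])  # strip_code() renders plain text
    break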
Real-time API browser websites like ruby-docs.com and jqapi.com are very useful. Is there any similar website for Python?
Update:
By real-time I mean instant search. docs.python.org is well written but a little hard to search (compared with ruby-docs.com and jqapi.com).
It's not clear what you mean by a real-time API in this respect. A Python API?
The documentation at http://docs.python.org is very useful and complete, supports multiple versions of the Python language (starting with 2.6), and has search.
The search there is not as interactive as, e.g., the one on ruby-docs.com.
I use docs.python.org quite often and personally do not miss that interactivity, as my Python IDE provides better interactive information than a website can.
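As an aside, Python also ships offline documentation tools in the standard library, so you can get instant lookups locally without any website:

# Instant doc lookup from any Python REPL
help(str.split)  # prints the signature and docstring for str.split

import pydoc
print(pydoc.render_doc(list.sort))  # the same text the pydoc tool would show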
Is there any API for Spokeo? I wanted to get results in JSON or XML format and I tried to find an API for it but couldn't. Has anyone tried scraping Spokeo, with or without an API? I'm sure we can scrape it in a general way, but I don't know how to proceed when search results come up with more than one location area. Thanks.
According to Spokeo's terms of use, scrapers are explicitly prohibited, as are any "derivative works" - even if all such works do is frame content from their site.
If you publish this in a publicly available application, be prepared for some flak.
I think an easier answer would be to work with the FullContact Person API.
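A hedged sketch of what a FullContact lookup might look like with requests; the endpoint, parameter, and header names below are assumptions based on the v2 API and should be checked against FullContact's current docs:

import requests

API_KEY = "YOUR_FULLCONTACT_API_KEY"  # placeholder, obtain your own key

resp = requests.get(
    "https://api.fullcontact.com/v2/person.json",  # assumed v2 endpoint
    params={"email": "someone@example.com"},
    headers={"X-FullContact-APIKey": API_KEY},     # assumed auth header
    timeout=10,
)
resp.raise_for_status()
person = resp.json()  # JSON, as the question asks for
print(person.get("contactInfo", {}).get("fullName"))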
I am using Python and the gdata library to parse the info of a YouTube video.
My code is this:

import gdata.youtube.service

yt_service = gdata.youtube.service.YouTubeService()
entry = yt_service.GetYouTubeVideoEntry(video_id='someid')

but in entry.rating or entry.statistics there are no likes/dislikes.
Where can I get that info from?
Since I use Python 3 and the gdata library doesn't support it, I couldn't reproduce your results.
But as far as I know, entry.rating returns XML with the whole statistics content of the video.
For a more specific result, you should try entry.rating.average or entry.rating.num_raters.
Looking at the source of the gdata library, it doesn't seem to support YouTube's like/dislike GData <yt:rating> element, only the generic <gd:rating> element.
If you are able to somehow access the underlying XML element through the library (I haven't used it myself), you should be able to get your hands on the YouTube rating element (the qualified name should be {http://gdata.youtube.com/schemas/2007}rating, if that helps :) ).
Even better, if you are able to patch the library to natively support that element, I'm sure the authors would appreciate a patch. :)
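A speculative sketch of digging that element out: gdata keeps unrecognized XML in each entry's extension elements, and its atom objects expose a FindExtensions helper. Treat the numLikes/numDislikes attribute names as assumptions to verify against your gdata version:

import gdata.youtube.service

YT_NS = 'http://gdata.youtube.com/schemas/2007'

yt_service = gdata.youtube.service.YouTubeService()
entry = yt_service.GetYouTubeVideoEntry(video_id='someid')

# yt:rating is not modelled by the library, so look for it among the
# unparsed extension elements attached to the entry (an assumption)
for ext in entry.FindExtensions(tag='rating', namespace=YT_NS):
    print(ext.attributes.get('numLikes'), ext.attributes.get('numDislikes'))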
I'm just starting with mwclient. I'm going to create bots to query our MediaWiki database and make small revisions.
But I cannot find a simple list of Python commands anywhere, e.g. how to get the age of a page, the contents of a category, the contents of a page, etc.
Does anyone know a good starter resource?
The official docs at https://github.com/mwclient/mwclient/wiki have some introductory tutorials. I'm in charge of documentation for mwclient but haven't had enough time to really expand it - I could use help from anyone who is willing.
One of my colleagues just sent me a link to the MediaWiki API wiki page.
I currently use Python + urllib for API queries, and mwclient whenever I need to edit or create a page; a short sketch of the basics follows below.
A useful place to get started with mwclient (read/edit/create a page):
http://brianna.modernthings.org/article/134/write-api-enabled-on-wikimedia-sites
The Bot Manual also has tons of good info and links, e.g. on creating a bot.
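A minimal sketch of those mwclient basics; the host, page title, and category name are placeholders, and page.save's exact signature has shifted between mwclient versions, so verify it against the docs:

import mwclient

site = mwclient.Site('en.wikipedia.org')         # your wiki host here
page = site.pages['Sandbox']                     # placeholder title

print(page.text()[:200])                         # read the page's wikitext
print(page.touched)                              # last-modified time, for page age

for member in site.categories['Living people']:  # placeholder category
    print(member.name)                           # iterate category contents

# Editing usually requires site.login('user', 'password') first
# page.save(page.text() + '\ntest', summary='small revision')  # assumed signature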