Does the Stack Exchange Python API provide advanced filtering support?
For example:
Return all the questions under tag python and javascript with more than 50 upvotes.
Return all the questions that have some substring matched in "title" or in "content".
Include/Exclude filters on different properties.
A reference to the official documentation would be really appreciated.
As the official API docs show, the API does not support complex queries well directly, but the /search/advanced route does relay much of the power of the web site's search feature.
So:
"Return all the questions under tag python and javascript with more than 50 upvotes."
Use the /search/advanced route.
Pass python;javascript in the tagged parameter.
Pass score:50 in the q parameter.
Live example.
In that library, the equivalent call should be something like:
.fetch('search/advanced', tagged='python;javascript', q='score:50')
For that particular query, this would probably also work:
.fetch('questions', tagged='python;javascript', min='50', sort='votes')
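If you want to see the raw HTTP call behind those wrappers, here is a minimal sketch using the requests library against API v2.2 (the site, sort, and order values are just illustrative):
import requests

resp = requests.get(
    'https://api.stackexchange.com/2.2/search/advanced',
    params={
        'site': 'stackoverflow',        # required by the API
        'tagged': 'python;javascript',  # both tags must be present
        'q': 'score:50',                # site-search syntax: minimum score of 50
        'sort': 'votes',
        'order': 'desc',
    },
)
for item in resp.json().get('items', []):
    print(item['score'], item['title'])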
"Return all the questions that have some substring matched in "title" or in "content"."
Put the word in the q parameter. For example:
/search/advanced?q=flask score:50&tagged=javascript
Compare this to the use of the title parameter, which uses AND logic:
/search/advanced?q=score:50&title=flask&tagged=javascript
"Include/Exclude filters on different properties."
That is rather vague. If you mean that you want to exclude questions that have a term, then...
/search/advanced provides the nottagged parameter.
The q parameter will take minus (-) terms just like the site search. For example:
/search/advanced?q=-flask score:50&tagged=python;javascript
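In the same raw-request style as the earlier sketch (the excluded tag here is just illustrative), those two exclusion variants would look roughly like:
import requests

# Exclude a tag with nottagged
requests.get('https://api.stackexchange.com/2.2/search/advanced',
             params={'site': 'stackoverflow',
                     'tagged': 'python;javascript',
                     'nottagged': 'django'})  # illustrative tag to exclude

# Exclude a term with a leading minus in q, as in the site search
requests.get('https://api.stackexchange.com/2.2/search/advanced',
             params={'site': 'stackoverflow',
                     'tagged': 'python;javascript',
                     'q': '-flask score:50'})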
Notes:
The q parameter accepts many of the question-related parameters of the site's web search.
The OP states he is using this library, which has broad support for
the Stack Exchange API (version 2.2).
See the customary use of the term "filtering".
Related
I've looked around for a little while now and can't seem to find anything that even touches on the differences. As the title states, I'm trying to find out what difference it actually makes between getting your data via URL path parameters like /content/7 (using regex in your urls.py) and getting it from query params like /content?num=7 (using request.GET.get()).
What are the pros and cons of each, and are there any scenarios where one would clearly be a better choice than the other?
Also, from what I can tell, Django's preferred method seems to be using URL path params with regex. Is there any reason for this, other than potentially cleaner URLs? Any additional information pertinent to the topic is welcome.
This depends on what architectural pattern you would like to adhere to. For example, according to the REST architectural pattern (which is arguably the most common), you want to design URLs such that, without query params, they point to "resources", which roughly correspond to nouns in your application, and HTTP verbs then correspond to actions you can perform on those resources.
If, for instance, your application has users, you would want to design URLs like this:
GET /users/ # gets all users
POST /users/ # creates a new user
GET /users/<id>/ # gets a user with that id. Notice this url still points to a user resource
PUT /users/<id> # updates an existing user's information
DELETE /users/<id> # deletes a user
You could then use query params to filter the set of users at a resource. For example, to get only active users, your URL would look something like
/users?active=true
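In Django terms, a minimal sketch of that split might look like this (module and view names are placeholders; the regex-based url() from older Django versions works the same way):
# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('users/', views.user_list),                   # collection; filtered via query params
    path('users/<int:user_id>/', views.user_detail),   # path param identifies one resource
]

# views.py
from django.http import JsonResponse

def user_detail(request, user_id):
    # user_id was captured from the URL path, e.g. /users/7/
    return JsonResponse({'id': user_id})

def user_list(request):
    # optional filter taken from the query string, e.g. /users/?active=true
    active = request.GET.get('active')
    return JsonResponse({'active_filter': active})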
So to summarize, query params vs. path params depends on your architectural preference.
A more detailed explanation of REST: http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
Roy Fielding's version if you want to get really academic: http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
I previously asked a question here on Stack Overflow about how to limit results using temporary (JavaScript) functions. But, since I got no answer, I have to switch to predefined views. Scanning several pages on Google, the only example I found is this one:
def fun(doc):
    if "name" in doc:
        yield doc['name'], None
But, unfortunately, the demonstration of this view is not accompanied by an example of its usage. So, how does one actually use views to query results in CouchDB, and, no less important, how does one limit the results? In the SQL world I would formulate my query like SELECT * FROM TABLE_NAME WHERE FIELD1 = "VALUE" LIMIT 1. How does one implement something similar in CouchDB?
PS. I know documentation exists and that somewhere in it there may be an answer, but it's hard to find. Besides, I feel there is a lack of tiny, simple examples like SELECT ... WHERE ... LIMIT ....
EDIT
This documentation in section 2.3 also gives an example of a view, but does not give an example of its usage. So, is there anybody in the world who actually knows how to use views in CouchDB? Merely knowing that views exist is not useful at all.
You also have ?limit= available as a query parameter.
There are a bunch of other useful parameters: http://docs.couchdb.org/en/1.6.1/api/ddoc/views.html
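For completeness, here is a minimal sketch of querying such a view from Python with the couchdb package and limiting the results (the design-document and view names are hypothetical):
import couchdb

server = couchdb.Server('http://localhost:5984/')
db = server['mydb']

# Roughly: SELECT * FROM mydb WHERE name = 'VALUE' LIMIT 1
for row in db.view('mydesign/by_name', key='VALUE', limit=1, include_docs=True):
    print(row.id, row.doc)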
I would like to develop a system in Python to get the position and number of results for keywords in Google.
I tried using the Google API; however, I found that with a custom search engine I get different results, with fewer pages, compared to "official" Google. I've also tried a variety of ready-made modules (e.g. xgoogle, pattern, etc.).
I tried scraping, but I do not think it is the right way.
My question is: what is the best approach to get the results I need without scraping? Is it not possible to get them?
Always use web services APIs. If a site does not have one, and you are not the site's owner, then that is an indication that they do not want you to use automatic tools to fetch their data.
Additionally, fetching a keyword rank is dubious at best. Google adjusts its rank on many factors, including your search history, your location, your locale, and any number of factors that are trade secrets. Using the API will be the most generic results, even if they don't match the ones you get when you search from Firefox or Chrome.
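For example, a minimal sketch against the Custom Search JSON API (the key, engine id, keyword, and domain are placeholders) to find the position of a domain for a keyword:
import requests

resp = requests.get(
    'https://www.googleapis.com/customsearch/v1',
    params={'key': 'YOUR_API_KEY', 'cx': 'YOUR_ENGINE_ID', 'q': 'some keyword'},
)
for position, item in enumerate(resp.json().get('items', []), start=1):
    if 'example.com' in item['link']:
        print('example.com found at position', position)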
TL;DR: your best bets are the methods you already don't like.
I'm using django-haystack at the moment, with Apache Solr as the backend. The problem is I cannot get the app to perform the search functionality I'm looking for:
Searching for sub-parts of a word
e.g. searching for "buntu" does not give me "ubuntu"
Searching for similar words
e.g. searching for "ubantu" should give "ubuntu"
Any help would be very much appreciated.
This is really about how you pass the query back to Haystack (and therefore to Solr). You can do a 'fuzzy' search in Solr/Lucene by using a ~ after the word:
ubuntu~
would return both buntu and ubantu. See the Lucene documentation on this.
How you pass this through via Haystack depends on how you're using it at the moment. Assuming you're using the default SearchForm, the best thing would be to either override the form's clean_q method to add the tilde to the end of every word in the search query, or override the search method to do the same thing there before passing the query to the SearchQuerySet.
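A minimal sketch of the clean_q approach, assuming that default SearchForm with its single q field:
from haystack.forms import SearchForm

class FuzzySearchForm(SearchForm):
    def clean_q(self):
        q = self.cleaned_data['q']
        # Append ~ to each term so Solr/Lucene performs a fuzzy match
        return ' '.join(term + '~' for term in q.split())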
Is it built into Sphinx?
It looks like Sphinx contains its own search engine for the English language. See http://sphinx.pocoo.org/_static/searchtools.js and searchindex.js/.json (the Sphinx docs index is 36 KB, the Python docs index 857 KB, and the Grok docs index 37 KB).
The index is precomputed when the docs are generated.
When one searches, a static page is loaded, and then _static/searchtools.js extracts the search terms from the query string, normalizes them (case, stemming, etc.), and looks them up in searchindex.js as it is loaded.
The first search attempt takes a rather long time; consecutive searches are much faster because the index is cached in your browser.
The Sphinx search engine is built in JavaScript. It uses jQuery and a (sometimes very big) JavaScript file containing the search terms.
Yes. Sphinx itself is not built-in, however; the search widget is part of Sphinx. What context did you mean by "built-in"?
On the page itself: http://docs.python.org/about.html
http://sphinx.pocoo.org/