In bash I can retrieve a json blob of CouchDB replication status docs easily:
curl -X GET -u admin:password localhost:5984/_scheduler/docs/_replicator
Is it possible to retrieve the same information in Python using the couchdb library? I've tried this:
couch_db.view("_scheduler/docs/_replicator", include_docs=True)
... but this returns a ('not_found', 'missing') error.
It turns out that I was making this more complicated than required. Using the Python requests library makes this task trivial:
requests.get("http://localhost:5984/_scheduler/docs/_replicator", auth=("admin", "password"))
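Spelled out a little more fully (requests needs the full URL including the http:// scheme, and the response body is JSON that can be decoded directly):
import requests

# hit the scheduler endpoint directly; no design-document view is involved
resp = requests.get(
    "http://localhost:5984/_scheduler/docs/_replicator",
    auth=("admin", "password"),
)
replication_status = resp.json()
print(replication_status)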
Related
I need to create a bash (or Python) script which gives me the availability status of multiple databases that sit on different servers. I found that I can get the status using this URL: "http://marklogic:8002/manage/v2/database/$DBNAME/?view=status". But I have about twenty different DBs. Opening that link returns an XML document with the database details. Can you please advise how I can loop over all the links and grep only the status row? Or, if you have any other idea, please advise.
It might be worth looking into the MarkLogic Python API project on Github:
https://github.com/marklogic/python_api
HTH!
You can keep the db names in a file and then use a for loop around it:
for a in $(cat dbname.txt)
do
    status=$(wget -qO- "http://marklogic:8002/manage/v2/database/${a}/?view=status")
    echo "$a, $status"
done
yep, I did it through curl --anyauth --user user:pass "http://marklogic:8002/manage/v2/database/${a}/?view=status"
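If you would rather do it in Python, here is a rough sketch using requests and the standard-library XML parser; the file name, credentials, and the exact element holding the status are assumptions you will need to adapt to your setup:
import requests
from requests.auth import HTTPDigestAuth
import xml.etree.ElementTree as ET

# dbname.txt is assumed to hold one database name per line
with open("dbname.txt") as f:
    db_names = [line.strip() for line in f if line.strip()]

for name in db_names:
    url = "http://marklogic:8002/manage/v2/database/%s/?view=status" % name
    # the Management API usually expects digest auth
    resp = requests.get(url, auth=HTTPDigestAuth("user", "pass"))
    root = ET.fromstring(resp.content)
    # pick out whichever elements look like the status row you care about
    status = [(el.tag, el.text) for el in root.iter() if "status" in el.tag.lower()]
    print(name, status)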
I am a newbie with the real-time distributed search engine Elasticsearch, but I would like to ask a technical question.
I have written a Python crawler module that parses a web page and creates JSON objects from the page's native information. The next step for my crawler module is to store that information in Elasticsearch.
The real question is the following.
Which approach is better for my case: the Elasticsearch RESTful API, or the Python client for Elasticsearch (elasticsearch-py)?
If you already have Python code, then the most natural way for you would be to use the elasticsearch-py client.
After installing the elasticsearch-py library via pip install elasticsearch, here is a simple code example to get you going:
# import the elasticsearch library
from elasticsearch import Elasticsearch
# get your JSON data
json_page = {...}
# create a new client to connect to ES running on localhost:9200
es = Elasticsearch()
# index your JSON data
es.index(index="webpages", doc_type="webpage", id=1, body=json_page)
You may also try elasticsearch_dsl, which is a high-level wrapper around elasticsearch-py.
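For example, a minimal sketch with elasticsearch_dsl (assuming elasticsearch-dsl 6 or newer and ES running on localhost:9200; the WebPage class and its fields are just placeholders for whatever your crawler extracts):
from elasticsearch_dsl import Document, Text, connections

# register a default connection for the library to use
connections.create_connection(hosts=["localhost"])

class WebPage(Document):
    url = Text()
    body = Text()

    class Index:
        name = "webpages"

WebPage.init()  # create the index and mapping if they don't exist yet
WebPage(url="http://example.com", body="...").save()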
I have a Flask server up and running on pythonanywhere, and I am trying to write a Python script that I can run locally to trigger a particular response - let's say the server time, for the sake of this discussion.
There are tonnes and tonnes of documentation on how to write the Flask server side of this process, but next to none on how to write something that can trigger the Flask app to run.
I have tried sending XML in the form of a simple curl command e.g.
curl -X POST -d '<From>Jack</From><Body>Hello, it worked!</Body>' URL
But this doesn't seem to work (I get errors about referral headers).
Could someone let me know the correct way to compose some XML which can be sent to a listening Flask server?
Thanks,
Jack
First, I would add -H "Content-Type: text/xml" to the headers in the cURL call so the server knows what to expect. It would be helpful if you posted the server code (not necessarily everything, but at least the part that's failing).
To debug this I would use:
@app.before_request
def before_request():
    if True:
        print "HEADERS", request.headers
        print "REQ_path", request.path
        print "ARGS", request.args
        print "DATA", request.data
        print "FORM", request.form
It's a bit rough, but helps to see what's going on at each request. Turn it on and off using the if statement as needed while debugging.
Running your request without the xml header in the cURL call sends the data to the request.form dictionary. Adding the xml header definition results in the data appearing in request.data. Without knowing where your server fails, the above should give you at least a hint on how to proceed.
EDIT, referring to the comment below:
I would use the excellent xmltodict library. Use this to test:
import xmltodict
@app.before_request
def before_request():
    print xmltodict.parse(request.data)['xml']['From']
with this cURL call:
curl -X POST -d '<xml><From>Jack</From><Body>Hello, it worked!</Body></xml>' localhost:5000 -H "Content-Type: text/xml"
'Jack' prints out without issues.
Note that this call has been modified from your question: the 'xml' tag has been added, since XML requires a root node (it's called an XML tree for a reason). Without this tag you'll get a parsing error from xmltodict (or any other parser you choose).
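To trigger this from a Python script instead of cURL, the requests library can send the same payload; a minimal sketch, with the URL as a placeholder for your pythonanywhere address:
import requests

xml_payload = "<xml><From>Jack</From><Body>Hello, it worked!</Body></xml>"

# an explicit Content-Type makes Flask expose the body via request.data
resp = requests.post(
    "http://localhost:5000/",  # replace with your pythonanywhere URL
    data=xml_payload,
    headers={"Content-Type": "text/xml"},
)
print(resp.status_code, resp.text)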
I'm looking to push JSON packets to Xively using this JSON string:
jstr = '''{ "version":"1.0.0", "datastreams":[{"id":"one","current_value":"100.00"},{"id":"two","current_value":"022.02"},{"id":"three","current_value":"033.03"},{"id":"four","current_value":"044.04"}] }'''
and running it by calling this:
cmd = "curl --request PUT %s --header %s --verbose %s" % (jstr,apiKey,feedId)
I'm doing it this way so I can manipulate the JSON between transmissions (I change it to a dict and back).
It's throwing an error saying that there is no data sent. I'm new to curl, Xively and Python, so it's really confusing me. Any help would be greatly appreciated.
The best way to achieve this would be to use the official Python module provided by Xively; a short sketch follows after the list of reasons below.
Here are a few reasons for not doing it the way you just described:
the official library provides a nice and simple API
you don't need to care what the data format actually is
calling the curl command to make an HTTP request every time is inefficient, as it takes time for the OS to spawn a new process
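As a rough sketch of what that looks like (based on the pattern in the Xively tutorials; treat the exact class and method names as assumptions to check against the version of the library you install):
import xively

# placeholders for your own credentials and feed
api = xively.XivelyAPIClient("YOUR_API_KEY")
feed = api.feeds.get(YOUR_FEED_ID)

# update datastream values directly instead of hand-building the JSON string
feed.datastreams = [
    xively.Datastream(id="one", current_value=100.00),
    xively.Datastream(id="two", current_value=22.02),
]
feed.update()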
Let's say I have some data in a CouchDB database. The overall size is about 100K docs.
I have a _design doc which stores a 'get all entities' view.
Assuming the requests are done on a local machine against a local database:
1. via curl: curl -X GET http://127.0.0.1/mydb/_design/myexample/_view/all
2. via Couchdbkit: entities = Entity.view('mydb/all')
Does option 1 have to perform any additional work compared to option 2 (JSON encoding/decoding, HTTP request parsing, etc.), and how can that affect the performance of querying 'all' entities from the database?
I guess that directly querying the database (option 2) should be faster than wrapping request/response into JSON, but I am not sure about that.
Under the API covers, Couchdbkit uses the restkit package, which is a REST library.
In other words, Couchdbkit is a pythonic API to the CouchDB REST API, and will do the same amount of work as using the REST API yourself.
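To make that concrete, here is roughly what the two options boil down to; the raw REST variant below uses requests (assuming the default port 5984), and the Couchdbkit call is the one from the question:
import requests

# option 1: raw REST call, with the HTTP round trip and JSON decoding done explicitly
resp = requests.get("http://127.0.0.1:5984/mydb/_design/myexample/_view/all")
rows = resp.json()["rows"]

# option 2: entities = Entity.view('mydb/all')
# Under the hood Couchdbkit (via restkit) issues the same GET and decodes the
# same JSON, then wraps each row in an Entity object for you.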