I have a custom HTTP method/verb (let's say LISTEN) that allows me to listen for updates on a resource stored on a remote server. The API available for this has a blocking call which keeps my client code listening for updates until I interrupt the execution of that call. Just to provide an example, if I were to perform a curl as follows:
curl -X LISTEN http://<IP-Address>:<Port>/resource
Executing this creates a blocking call that provides me updates on the resource whenever a new value is pushed to the server (similar to a pub-sub model). The response would look similar to this:
{"data":"value update 1","id":"id resource"}
{"data":"value update 2","id":"id resource"}
(...)
If I were to write code to handle this in Python, how do I call my URL using this custom verb and handle the blocking call/callback while ensuring that this does not block the execution of the rest of my code?
If you're using the Python requests library with a custom HTTP verb and need to read streamed content, you can do something like this:
import json
import requests  # sudo pip3 install requests

url = "http://........."

r = requests.request('LISTEN', url, stream=True)
for line in r.iter_lines():
    # filter out keep-alive new lines
    if line:
        decoded_line = line.decode('utf-8')
        print(json.loads(decoded_line))
Note: by default all requests calls are blocking, so you need to run this code in a separate thread/process to avoid stalling the rest of your program.
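For example, here is a minimal sketch that keeps the listener in a daemon thread (the url placeholder and the listen_for_updates name are just illustrations):

import json
import threading

import requests

url = "http://........."

def listen_for_updates():
    # the blocking streaming call runs entirely inside this thread
    r = requests.request('LISTEN', url, stream=True)
    for line in r.iter_lines():
        if line:
            print(json.loads(line.decode('utf-8')))

# daemon=True so the listener does not keep the process alive on exit
listener = threading.Thread(target=listen_for_updates, daemon=True)
listener.start()

# ...the rest of your code keeps running here...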
...while ensuring that this does not block the execution of the rest of my code
Since you provided no details about your application, I will try to list some general thoughts on the question.
Your task can be solved in many ways; the solution depends on your app architecture.
If this is a web server, you can take a look at tornado (see streaming callback) or the aiohttp streaming examples.
On the other hand, you can run the code above in a separate process and communicate with other applications/services using RabbitMQ, for example (or another IPC mechanism), roughly as sketched below.
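A rough sketch of that idea using a multiprocessing.Queue as the IPC mechanism instead of RabbitMQ (the listener function and url placeholder are the same illustrations as above):

import json
import multiprocessing

import requests

url = "http://........."

def listener(queue):
    # the blocking streaming call lives in its own process
    r = requests.request('LISTEN', url, stream=True)
    for line in r.iter_lines():
        if line:
            queue.put(json.loads(line.decode('utf-8')))

if __name__ == '__main__':
    updates = multiprocessing.Queue()
    multiprocessing.Process(target=listener, args=(updates,), daemon=True).start()
    # the main process (or another service) consumes updates whenever it wants
    while True:
        print(updates.get())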
I'm very new to web dev, and I'm trying to build a simple web interface with Ajax calls to refresh data, and TurboGears2 as the backend.
My Ajax calls are working fine and make periodic calls to my TurboGears2 server; however, these calls take time to complete (some requests make the server run remote SSH calls on other machines, which take up to 3-4 seconds to complete).
My problem is that TurboGears waits for each request to complete before handling the next one, so all my concurrent Ajax calls are being queued instead of being processed in parallel.
Refreshing N values takes 3*N seconds when it could take just 3 seconds with concurrency.
Any idea how to fix this?
Here is my current server-side code (method get_load is the one called with Ajax):
import subprocess

from tg import expose, TGController


class RootController(TGController):
    @expose()
    def index(self):
        with open("index.html") as data:
            index = data.read()
        return index

    @expose()
    def get_load(self, ip):
        command = "bash get_cpu_load.sh"
        # capture stdout so communicate() returns the remote script's output
        request = subprocess.Popen(
            ["ssh", "-o", "ConnectTimeout=2", ip, command],
            stdout=subprocess.PIPE)
        load = str(request.communicate()[0])
        return load
Your problem is probably caused by the fact that you are serving requests with the Gearbox wsgiref server. By default the wsgiref server is single threaded and so can serve a single request at a time. That can be changed by providing the wsgiref.threaded = true configuration option in your development.ini server section (the same one where the IP address and port are specified). See https://github.com/TurboGears/gearbox#gearbox-http-servers and http://turbogears.readthedocs.io/en/latest/turbogears/gearbox.html#changing-http-server for additional details.
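For example, the relevant part of development.ini might look like this (assuming the common [server:main] section name; the host and port values are just placeholders):

[server:main]
host = 127.0.0.1
port = 8080
wsgiref.threaded = true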
Note that wsgiref is the development server for TurboGears, and using it in production is usually discouraged. You should consider something like waitress, chaussette or mod_wsgi when deploying your application; see http://turbogears.readthedocs.io/en/latest/cookbook/deploy/index.html?highlight=deploy
I have used Python's requests module to do a POST call (within a loop) to a single URL, with a varying set of data in each iteration. I have already used session reuse so that the underlying TCP connection is reused across the calls made during the loop's iterations.
However, I want to further speed up my processing by:
1. caching the URL and the authentication values (user id and password), as they remain the same in each call
2. spawning multiple sub-processes which could each take a certain number of calls and process them as a group, allowing these smaller batches to run in parallel
Please note that I pass my authentication as headers in base64 format, and pseudo code of my POST call would typically look like this:
s = requests.Session()
url = 'https://example.net/'

for record in data_records:  # loop through data records
    headers = {'authorization': authbase64string, 'other': 'headers'}
    data = "data for this loop"
    # POST call
    r = s.post(url, data=data, headers=headers)
    response = r.json()
# end of loop and program
Please review the scenario and suggest any techniques/tips which might be of help.
Thanks in advance,
Abhishek
You can:
do it as you described (if you want to make it faster you can run it using multiprocessing), and e.g. attach the headers to the session rather than to each request, as in the sketch after this list.
modify the target server so that it accepts one POST request carrying multiple data records (so you limit the time spent on connecting, etc.)
do some optimizations on the server side so it replies faster (or just stores the requests and sends you the response later using some callback)
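A minimal sketch of the first option, combining session-level headers with a process pool (the data_records contents, the batch count and the header value are assumptions for illustration):

from concurrent.futures import ProcessPoolExecutor

import requests

url = 'https://example.net/'

def post_batch(records):
    # each worker process gets its own session; the shared headers are set once
    s = requests.Session()
    s.headers.update({'authorization': 'authbase64string'})
    return [s.post(url, data=record).json() for record in records]

if __name__ == '__main__':
    data_records = ["data for this loop"] * 8          # placeholder records
    batches = [data_records[i::4] for i in range(4)]   # split into 4 groups
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(post_batch, batches))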
It would be much easier if you described the use case :)
I am fighting with Tornado and the official Python oauth2client and gcloud... modules.
These modules accept an alternate HTTP client passed with http=, as long as it has a method called request, which these libraries call whenever an HTTP request must be sent to Google and/or the access tokens must be renewed using the refresh tokens.
I have created a simple class which has a self.client = AsyncHttpClient()
Then in its request method, returns self.client.fetch(...)
My goal is to be able to yield any of these libraries' calls, so that Tornado will execute them asynchronously.
The thing is that they are highly dependent on what the default client (set to httplib2.Http()) returns: (response, content).
I am really stuck and cannot find a clean way of making this async.
If anyone already found a way, please help.
Thank you in advance
These libraries do not support asynchronous operation, and the porting process is not always easy.
oauth2client
Depending on what you want to do maybe Tornado's GoogleOAuth2Mixin or tornado-alf will be enough.
gcloud
Since I am not aware of any Tornado/asyncio implementation of gcloud-python, you could:
write it yourself. Again, it is not a simple transport change of Connection.http or request; all the code around it must be able to use/yield futures/coroutines.
wrap it in a ThreadPoolExecutor (as @Apero mentioned), as in the sketch after this list. This is a high-level API, so any nested API calls within that yield will be executed in the same thread (not using the pool). It could work well.
run it as an external app (with ProcessPoolExecutor or Popen).
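A minimal sketch of the ThreadPoolExecutor option (blocking_gcloud_call is a stand-in for whatever synchronous gcloud/oauth2client call you need):

from concurrent.futures import ThreadPoolExecutor

from tornado import gen

executor = ThreadPoolExecutor(max_workers=4)

def blocking_gcloud_call():
    # placeholder for a synchronous gcloud / oauth2client call
    return "result"

@gen.coroutine
def fetch_async():
    # the blocking call runs in a worker thread; the coroutine yields its future
    result = yield executor.submit(blocking_gcloud_call)
    raise gen.Return(result)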
When I had a similar problem with AWS a couple of years ago, I ended up executing the CLI asynchronously (Tornado + subprocess.Popen + some CLI (awscli, or boto based)) and handling simple cases (like S3 and basic EC2 operations) with a plain AsyncHTTPClient.
I am using cherrypy for a web server which is able to stream the output of some methods.
The server uses yield to send lines of data, and the client uses the onprogress event of the $.ajax method.
But enabling CherryPy's 'tools.gzip' config causes the output not to reach the client incrementally. In fact, the client's onprogress event is not called until the server method has finished completely. It seems the CherryPy compression tool is not able to compress the output in streaming mode (it can only compress the output once it has received all of it).
Now my first question is how to fix this problem. If it is not fixable, my second question is how to disable CherryPy compression for a specific method.
You have to enable streaming of the response.
Set the following configuration:
{'response.stream': True}
The gzip tool inspects the current request, looks for the stream setting, and responds accordingly.
For more information: http://docs.cherrypy.org/en/latest/advanced.html#streaming-the-response-body
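A minimal sketch covering both questions (the handler name is just an illustration): response.stream turns streaming on for this handler, and tools.gzip.on turns compression off for it only:

import cherrypy

class Root(object):

    @cherrypy.expose
    def stream_data(self):
        # with response.stream enabled, each yielded chunk is sent as it is produced
        for i in range(10):
            yield "line %d\n" % i
    # per-handler config: stream this response and skip gzip just for this method
    stream_data._cp_config = {'response.stream': True, 'tools.gzip.on': False}

if __name__ == '__main__':
    cherrypy.quickstart(Root())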
I just checked my webspace and its signature says: Apache/2.2.9 (Debian) mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9 OpenSSL/0.9.8g
This gives me hope that Python is somehow supported. Why is Python listed twice? mod_python/3.3.1 AND Python/2.5.2???
There is a cgi-bin folder on my webspace.
What I want to do: I need to do a cross-site call to get some text data from a server. The text data is not JSON, but I guess I should convert it to JSON (or is there an option to do cross-site calls without JSON?)
The Python script gets the request for some JSONP. Depending on the request (I guess I should somehow parse the URL), the Python script is to load the requested text-data file from the webserver, wrap it in some JSON and return it.
Can somebody tell me how I do these three steps with python on my webspace?
First off, the signature isn't listing Python twice. It's listing first the version of mod_python, which is an Apache web server plugin, and then the version of the Python interpreter on the system.
python cgi module - This is really an inefficient approach to writing Python server code, but here it is. Ultimately you should consider one of the many excellent Python web frameworks out there. But, using the cgi module, your response would always start with this:
print 'Content-Type: application/json\n\n'
Your python script would run on the server from an HTTP request. In that script you would check the request and determine the data you will want to serve from either the URL value or the query string.
At the very least you would just wrap your return value in a basic JSON data structure. The text data itself can just be a string:
import json
text_data = "FOO"
json_data = json.dumps({'text': text_data})
print json_data
# {"text": "FOO"}
For the JSONP aspect, you would usually check the query string to see if the request contains a specific name for the callback function the client wants, or just default to 'callback'
print "callback(%s);" % json_data
# callback({"text": "FOO"});
Returning that would be a JSONP type response, because when the client receives it, the callback is executed for the client.
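Putting the steps together, a minimal CGI sketch might look like this (the 'callback' and 'file' query parameter names, and the idea of reading the file from next to the script, are assumptions for illustration):

#!/usr/bin/env python
# minimal CGI JSONP sketch (Python 2, to match the server above)
import cgi
import json

form = cgi.FieldStorage()
callback = form.getfirst('callback', 'callback')  # assumed query parameter
filename = form.getfirst('file', 'default.txt')   # assumed query parameter
# note: a real script must validate 'filename' to prevent path traversal

with open(filename) as f:
    text_data = f.read()

json_data = json.dumps({'text': text_data})

print 'Content-Type: application/json\n'
print "%s(%s);" % (callback, json_data)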
And to conclude, let me add that you should be aware that Python CGI scripts need to start a brand new Python interpreter process for every single request (even repeat requests from the same client). This can easily overwhelm a server under increased load. For this reason, people usually go the WSGI route (mod_wsgi in Apache). WSGI allows a persistent application to keep running and handle ongoing requests.
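For comparison, the same JSONP response as a minimal WSGI application (the entry point must be named application for mod_wsgi; the query parameter handling reuses the same assumptions as the CGI sketch):

# minimal WSGI sketch of the same JSONP response
import json
from cgi import parse_qs  # available on Python 2.5; newer versions offer urlparse/urllib.parse

def application(environ, start_response):
    params = parse_qs(environ.get('QUERY_STRING', ''))
    callback = params.get('callback', ['callback'])[0]
    body = "%s(%s);" % (callback, json.dumps({'text': 'FOO'}))
    start_response('200 OK', [('Content-Type', 'application/json')])
    return [body]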