Python - best way to wait until a remote system is booted - python

I am using wake on lan to start a certain server in a python script.
The server is online when I can do a successfull API request, such as:
return requests.get(
url + path,
auth=('user', user_password),
headers={'Content-Type':'application/json'},
verify=False,
timeout=0.05
).json()
What is the best method to wait for the server bootup process (until it is reachable via API) without spamming the network with requests in a loop?

I believe you're very close. Why not put that request in while a d try except blocks?
while True:
try:
return requests.head(...)
except requests.exceptions.ConnectionError:
time.sleep(0.5)

Your two choices are to poll the remote service until it responds or configure the service to in some way notify you that it's up.
There's really nothing wrong with sending requests in a loop, as long as you don't do so unnecessarily often. If the service takes ~10 seconds to come up, checking once a second would be reasonable. If it takes ~10 minutes, every 30 seconds or so would probably be fine.
The alternative - some sort of push notification - is more elegant, but it requires you having some other service up and running already, listening for the notification. For example you could start a simple webserver locally before restarting the remote service and have the remote service make a request against your server when it's ready to start handling requests.
Generally speaking I would start with the polling approach since it's easier and involves fewer moving parts. Just be sure you design your polling in a fault-tolerant way; in particular be sure to specify a maximum time to wait or number of polling attempts to make before giving up. Otherwise your script will just hang if the remote service never comes up.

Related

Flask-Sockets keepalive

I recently started using flask-sockets in my flask application with native WebSocket API as client. I would like to know if there is proper way to send ping requests at certain intervals from the server as keepalive?
When going through the geventwebsocket library, I noticed the definition handle_ping(...), but it's never called. Is there a way to determine a ping interval on WS?
I see my sockets dying after a minute and a half inconsistently sometimes.
#socket_blueprint.route('/ws', defaults={'name':''})
def echo_socket(ws):
while not ws.closed:
ws_list.append(
msg = ws.receive()
ws.send(msg)
I could probably spin up a separate thread and send ping opcodes manually every 30 seconds to the clients if I keep them in a list, but I feel like there'd be a better way to handle that..
In service, create a thread in this thread send some data(any data) to client. If client already disconnected,after 15s the server will receive closed.
I haven't find any method about ping in gevent websocket or flask-sockets. So take this method.

Which web servers are compatible with gevent and how do the two relate?

I'm looking to start a web project using Flask and its SocketIO plugin, which depends on gevent (something something greenlets), but I don't understand how gevent relates to the webserver. Does using gevent restrict my server choice at all? How does it relate to the different levels of web servers that we have in python (e.g. Nginx/Apache, Gunicorn)?
Thanks for the insight.
First, lets clarify what we are talking about:
gevent is a library to allow the programming of event loops easily. It is a way to immediately return responses without "blocking" the requester.
socket.io is a javascript library create clients that can maintain permanent connections to servers, which send events. Then, the library can react to these events.
greenlet think of this a thread. A way to launch multiple workers that do some tasks.
A highly simplified overview of the entire process follows:
Imagine you are creating a chat client.
You need a way to notify the user's screens when anyone types a message. For this to happen, you need someway to tell all the users when a new message is there to be displayed. That's what socket.io does. You can think of it like a radio that is tuned to a particular frequency. Whenever someone transmits on this frequency, the code does something. In the case of the chat program, it adds the message to the chat box window.
Of course, if you have a radio tuned to a frequency (your client), then you need a radio station/dj to transmit on this frequency. Here is where your flask code comes in. It will create "rooms" and then transmit messages. The clients listen for these messages.
You can also write the server-side ("radio station") code in socket.io using node, but that is out of scope here.
The problem here is that traditionally - a web server works like this:
A user types an address into a browser, and hits enter (or go).
The browser reads the web address, and then using the DNS system, finds the IP address of the server.
It creates a connection to the server, and then sends a request.
The webserver accepts the request.
It does some work, or launches some process (depending on the type of request).
It prepares (or receives) a response from the process.
It sends the response to the client.
It closes the connection.
Between 3 and 8, the client (the browser) is waiting for a response - it is blocked from doing anything else. So if there is a problem somewhere, like say, some server side script is taking too long to process the request, the browser stays stuck on the white page with the loading icon spinning. It can't do anything until the entire process completes. This is just how the web was designed to work.
This kind of 'blocking' architecture works well for 1-to-1 communication. However, for multiple people to keep updated, this blocking doesn't work.
The event libraries (gevent) help with this because they accept and will not block the client; they immediately send a response and when the process is complete.
Your application, however, still needs to notify the client. However, as the connection is closed - you don't have a way to contact the client back.
In order to notify the client and to make sure the client doesn't need to "refresh", a permanent connection should be open - that's what socket.io does. It opens a permanent connection, and is always listening for messages.
So work request comes in from one end - is accepted.
The work is executed and a response is generated by something else (it could be a the same program or another program).
Then, a notification is sent "hey, I'm done with your request - here is the response".
The person from step 1, listens for this message and then does something.
Underneath is all is WebSocket a new full-duplex protocol that enables all this radio/dj functionality.
Things common between WebSockets and HTTP:
Work on the same port (80)
WebSocket requests start off as HTTP requests for the handshake (an upgrade header), but then shift over to the WebSocket protocol - at which point the connection is handed off to a websocket-compatible server.
All your traditional web server has to do is listen for this handshake request, acknowledge it, and then pass the request on to a websocket-compatible server - just like any other normal proxy request.
For Apache, you can use mod_proxy_wstunnel
For nginx versions 1.3+ have websocket support built-in

make HTTP request from python and wait a long time for a response

I'm using Python to to access a REST API that sometimes takes a long time to run (more than 5 minutes). I'm using pyelasticsearch to make the request, and tried setting the timeout to 10 minutes like this:
es = ElasticSearch(config["es_server_url"], timeout=600)
results = es.send_request("POST",
[config["es_index"], "_search_with_clusters" ],
cluster_query)
but it times out after 5 minutes (not 10) with requests.exceptions.ConnectionError (Caused by <class 'socket.error'>: [Errno 104] Connection reset by peer)
I tried setting the socket timeout and using requests directly like this:
socket.setdefaulttimeout(600)
try:
r = requests.post(url, data=post, timeout=600)
except:
print "timed out"
and it times out after approximately 5 minutes every time.
How can I make my script wait longer until the request returns?
The err "Connection reset by peer", aka ECONNRESET, means that the server—or some router or proxy between you and the server—closed the connection forcibly.
So, specifying a longer timeout on your end isn't going to make any difference. You need to figure out who's closing the connection and configure it to wait longer.
Plausible places to look are the server application itself, whatever server program drives that application (e.g., if you're using Apache with mod_wsgi, Apache), a load-balancing router or front-end server or reverse proxy in front of that server, or a web proxy in front of your client.
Once you figure out where the problem is, if it's something you can't fix yourself, you may be able to fix it by trickling from the server to the client—have it send something useless but harmless (an HTTP 100, an extra header, some body text that your client knows how to skip over, whatever) every 120 seconds. This may or may not work, depending on what component is hanging up.

Python Requests Not Cleaning up Connections and Causing Port Overflow?

I'm doing something fairly outside of my comfort zone here, so hopefully I'm just doing something stupid.
I have an Amazon EC2 instance which I'm using to run a specialized database, which is controlled through a webapp inside of Tomcat that provides a REST API. On the same server, I'm running a Python script that uses the Requests library to make hundreds of thousands of simple queries to the database (I don't think it's possible to consolidate the queries, though I am going to try that next.)
The problem: after running the script for a bit, I suddenly get a broken pipe error on my SSH terminal. When I try to log back in with SSH, I keep getting "operation timed out" errors. So I can't even log back in to terminate the Python process and instead have to reboot the EC2 instance (which is a huge pain, especially since I'm using ephemeral storage)
My theory is that each time requests makes a REST call, it activates a pair of ports between Python and Tomcat, but that it never closes the ports when it's done. So python keeps trying to grab more and more ports and eventually either somehow grabs away and locks the SSH port (booting me off), or it just uses all the ports and that causes the network system to crap out somehow (as I said, I'm out of my depth.)
I also tried using httplib2, and was getting a similar problem.
Any ideas? If my port theory is correct, is there a way to force requests to surrender the port when it's done? Or otherwise is there at least a way to tell Ubuntu to keep the SSH port off-limits so that I can at least log back in and terminate the process?
Or is there some sort of best practice to using Python to make lots and lots of very simple REST calls?
Edit:
Solved...do:
s = requests.session()
s.config['keep_alive'] = False
Before making the request to force Requests to release connections when it's done.
My speculation:
https://github.com/kennethreitz/requests/blob/develop/requests/models.py#L539 sets conn to connectionpool.connection_from_url(url)
That leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L562, which leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L167.
This eventually leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L185:
def _new_conn(self):
"""
Return a fresh :class:`httplib.HTTPConnection`.
"""
self.num_connections += 1
log.info("Starting new HTTP connection (%d): %s" %
(self.num_connections, self.host))
return HTTPConnection(host=self.host, port=self.port)
I would suggest hooking a handler up to that logger, and listening for lines that match that one. That would let you see how many connections are being created.
Figured it out...Requests has a default 'Keep Alive' policy on connections which you have to explicitly override by doing
s = requests.session()
s.config['keep_alive'] = False
before you make a request.
From the doc:
"""
Keep-Alive
Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!
Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set prefetch to True or read the content property of the Response object.
If you’d like to disable keep-alive, you can simply set the keep_alive configuration to False:
s = requests.session()
s.config['keep_alive'] = False
"""
There may be a subtle bug in Requests here because I WAS reading the .text and .content properties and it was still not releasing the connections. But explicitly passing 'keep alive' as false fixed the problem.

How can i ignore server response to save bandwidth?

I am using a server to send some piece of information to another server every second. The problem is that the other server response is few kilobytes and this consumes the bandwidth on the first server ( about 2 GB in an hour ). I would like to send the request and ignore the return ( not even receive it to save bandwidth ) ..
I use a small python script for this task using (urllib). I don't mind using any other tool or even any other language if this is going to make the request only.
A 5K reply is small stuff and is probably below the standard TCP window size of your OS. This means that even if you close your network connection just after sending the request and checking just the very first bytes of the reply (to be sure that request has been really received) probably the server already sent you the whole answer and the packets are already on the wire or on your computer.
If you cannot control (i.e. trim down) what is the server reply for your notification the only alternative I can think to is to add another server on the remote machine waiting for a simple command and doing the real request locally and just sending back to you the result code. This can be done very easily may be even just with bash/perl/python using for example netcat/wget locally.
By the way there is something strange in your math as Glenn Maynard correctly wrote in a comment.
For HTTP, you can send a HEAD request instead of GET or POST:
import urllib2
request = urllib2.Request('https://stackoverflow.com/q/5049244/')
request.get_method = lambda: 'HEAD' # override get_method
response = urllib2.urlopen(request) # make request
print response.code, response.url
Output
200 https://stackoverflow.com/questions/5049244/how-can-i-ignore-server-response-t
o-save-bandwidth
See How do you send a HEAD HTTP request in Python?
Sorry but this does not make much sense and is likely a violation of the HTTP protocol. I consider such an idea as weird and broken-by-design. Either make the remote server shut up or configure your application or whatever is running on the remote server on a different protocol level using a smarter protocol with less bandwidth usage. Everything else is hard being considered as nonsense.

Categories