Prevent SFTP/SSH session timeout with paramiko - python

I'm using paramiko to connect to an SFTP server on which I have to download and process some files.
The server has a timeout set to 5 minutes, but some days the processing of the files takes longer than that. So, when I then want to change the working directory on the server to process some other files (sftp.chdir(target_dir)), I get an exception that the connection has timed out:
File "build\bdist.win32\egg\paramiko\sftp.py", line 138, in _write_all
    raise EOFError()
To counter this I thought that activating the keep alive would be the best option so I used the "set_keepalive" on the transport to set it to 30 seconds:
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(ssh_server, port=ssh_port, username=ssh_user, password=password)
transport = ssh.get_transport()
transport.set_keepalive(30)
sftp = transport.open_sftp_client()
But it has absolutely no effect. I don't know whether I'm misunderstanding the concept of set_keepalive here, or whether the server (to which I have no access) simply ignores the keepalive packets.
Isn't this the right way to counter this problem or should I try a different approach? I don't like the idea of "manually" sending some ls command to the server to keep the session alive.

If the server is timing you out for inactivity, there's not much you can do from the client side (other than perhaps sending a simple command every now and again to keep your session from timing out).
Have you considered breaking apart your download and processing steps, so that you can download everything you need to start with, then process it either asynchronously, or after all downloads have completed?
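A minimal sketch of that download-first approach, reusing the connection variables from the question (ssh_server, ssh_user, target_dir); the local staging directory and the process() step are placeholders:
import os
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(ssh_server, port=ssh_port, username=ssh_user, password=password)
sftp = ssh.open_sftp()
sftp.chdir(target_dir)

local_dir = "/tmp/sftp_staging"  # placeholder staging directory
os.makedirs(local_dir, exist_ok=True)

# Phase 1: download everything while the session is still fresh
local_files = []
for name in sftp.listdir():
    local_path = os.path.join(local_dir, name)
    sftp.get(name, local_path)
    local_files.append(local_path)

sftp.close()
ssh.close()

# Phase 2: process offline, with no SFTP session left to time out
for path in local_files:
    process(path)  # placeholder for the actual processing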


Why is there a discrepancy between python sockets and tcp ping for the same IP:port destination?

My setup:
I am using an IP and port provided by portmap.io to allow me to perform port forwarding.
I have OpenVPN installed (as required by portmap.io), and I run a ready-made config file when I want to operate my project.
My main effort involves sending messages between a client and a server using sockets in Python.
I have installed a tool called tcping, which basically allows me to ping an IP:port over a TCP connection.
Results I'm getting:
When I try to "ping" said IP, the average RTT ends up being around 30ms consistently.
I then use the same IP for socket programming in Python, with a server script running on my machine and a client script on another machine connecting to that IP. When I send a small message like "Hello" over the socket, it takes a significantly longer, and inconsistent, amount of time to get across: sometimes 1 second, sometimes 400 ms...
What is the reason for this discrepancy?
tcping just measures the time needed to establish the TCP connection. Connection establishment is usually handled entirely in the OS kernel, so there is not even a switch to user space involved.
Even a small data exchange at the application level is significantly more expensive. First the TCP handshake must complete; usually only then does the client start sending the payload. The payload has to be delivered to the other side and placed in the socket's read buffer, the user-space application must be scheduled to run, read the data from the buffer and process it, then create a response and hand it to the peer's OS kernel, which delivers the response back to the local system, and so on, until the local application finally receives the response and stops timing how long all of this took.
Given how far the measured time is from the pure RTT, though, I would assume that the server system either has low performance or high load, or that the application is badly written.
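A small sketch of that difference, assuming a plain TCP server that sends some reply is listening on a hypothetical HOST:PORT; the first measurement approximates what tcping reports, the second includes the full application round trip:
import socket
import time

HOST, PORT = "203.0.113.10", 5000  # hypothetical address and port

# Time just the TCP handshake (roughly what tcping measures)
t0 = time.monotonic()
sock = socket.create_connection((HOST, PORT), timeout=5)
connect_ms = (time.monotonic() - t0) * 1000

# Time a full application round trip: send the payload and wait for a reply
t1 = time.monotonic()
sock.sendall(b"Hello")
reply = sock.recv(1024)  # assumes the server sends something back
roundtrip_ms = (time.monotonic() - t1) * 1000

sock.close()
print("connect: %.1f ms, request/response: %.1f ms" % (connect_ms, roundtrip_ms))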

How to know if the remote tcp device is powered off

In my Go code, I am establishing a TCP connection as below:
conn, err1 := net.Dial("tcp", <remote_address>)
if err1 == nil {
    buf := make([]byte, 256)
    _, err := conn.Read(buf) // data read from the peer ends up in buf
    if err == io.EOF {
        // remote connection close is handled here
        fmt.Println("connection got reset by peer")
        panic(err)
    }
}
To simulate the other end, I am running a Python script on a different computer, which opens a socket and sends some random data to the socket the above code is listening on. Now my problem is: when I kill this Python script by pressing Ctrl+C, the remote connection close event is recognized fine by the above code and I get a chance to handle it.
However, if I simply turn off the remote computer (where the Python script is running), my code doesn't get notified at all.
In my case, the connection should always stay open and be able to send data at arbitrary times, and my Go code should only be notified when the remote machine is powered off.
Can someone help me with this scenario: how would I get notified when the remote machine hosting the socket is powered off? How would my Go code get that trigger remotely?
PS - This seems to be a pretty common problem in real deployments, though not in a testing environment.
There is no way to determine the difference between a host that is powered off and a connection that has been broken, so you treat them the same way.
You can send a heartbeat message on your own, and close the connection when you reach some timeout period between heartbeat packets. The timeout can either be set manually by timing the packets, or you can use SetReadDeadline before each read to terminate the connection immediately when the deadline is reached.
You can also use TCP Keepalive to do this for you, using TCPConn.SetKeepAlive to enable it and TCPConn.SetKeepAlivePeriod to set the interval between keepalive packets. The time it takes to actually close the connection will be system dependent.
You should also set a timeout when dialing, since connecting to a down host isn't guaranteed to return an ICMP Host Unreachable response. You can use DialTimeout, a net.Dialer with the Timeout parameter set, or Dialer.DialContext.
Simply reading through the stdlib documentation should provide you with plenty of information: https://golang.org/pkg/net/
You need to add some kind of heartbeat message. Then, looking at the Go documentation, you can use DialTimeout instead of Dial, and each time you receive the heartbeat message (or any other message) you can reset the timeout.
Another alternative is to use TCP keepalive, which you can do in Python by using setsockopt. I can't really help you with Go, but this link seems like a good description of how to enable keepalive there:
http://felixge.de/2014/08/26/tcp-keepalive-with-golang.html
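For the Python side mentioned above, a minimal sketch of enabling TCP keepalive with setsockopt; the remote address and the timing values are arbitrary, and the TCP_KEEP* socket options shown are Linux-specific:
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("203.0.113.10", 5000))  # hypothetical remote address

# Enable keepalive for this connection
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Linux-specific tuning: start probing after 30s of idle time, probe every
# 10s, and drop the connection after 3 unanswered probes.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)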

Paramiko get stdout from connection object (not exec_command)

I'm writing a script that uses paramiko to ssh onto several remote hosts and run a few checks. Some hosts are set up as fail-overs for others, and I can't determine which is in use until I try to connect. Upon connecting to one of these 'inactive' hosts, the host informs me that I need to connect to another 'active' IP and then closes the connection after n seconds. This appears to be written to the stdout of the SSH connection/session (i.e. it is not an SSH banner).
I've used paramiko quite a bit, but I'm at a loss as to how to get this output from the connection. exec_command will obviously give me stdout and stderr, but the host outputs this immediately upon connection and doesn't accept any other incoming requests/messages; it just closes after n seconds.
I don't want to have to wait until the timeout to move onto the next host and I'd also like to verify that that's the reason for not being able to connect and run the checks, otherwise my script works as intended.
Any suggestion as to how I can capture this output, with or without paramiko, is greatly appreciated.
I figured out a way to get the data; it was pretty straightforward to be honest, albeit a little hackish. This might not work in other cases, especially if there is latency, but I could also be misunderstanding what's happening:
When the connection opens, the server spits out two messages: one saying it can't chdir to a particular directory, then a few milliseconds later another stating that you need to connect to the other IP. If I send a command immediately after connecting (it doesn't matter what command), exec_command will interpret this second message as the response. So for now I have a solution to my problem, as I can check this string for a known message and change the flow of execution.
However, if what I describe is accurate, then this may not work in situations where there is too much latency and the 'test' command isn't sent before the server response has been received.
As far as I can tell (and I may be very wrong), there is currently no proper way to get the stdout stream immediately after opening the connection with paramiko. If someone knows a way, please let me know.
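A rough sketch of the workaround described above; the command sent and the marker string checked for are hypothetical placeholders for whatever the failover host actually prints:
import paramiko

def is_active_host(host, user, password):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username=user, password=password, timeout=10)
    try:
        # Send a harmless command immediately; on an 'inactive' host the
        # redirect message already waiting on stdout comes back as the reply.
        stdin, stdout, stderr = ssh.exec_command("true")
        reply = stdout.read().decode(errors="replace")
        return "connect to" not in reply  # hypothetical marker text
    finally:
        ssh.close()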

Python Requests Not Cleaning up Connections and Causing Port Overflow?

I'm doing something fairly outside of my comfort zone here, so hopefully I'm just doing something stupid.
I have an Amazon EC2 instance which I'm using to run a specialized database, which is controlled through a webapp inside of Tomcat that provides a REST API. On the same server, I'm running a Python script that uses the Requests library to make hundreds of thousands of simple queries to the database (I don't think it's possible to consolidate the queries, though I am going to try that next.)
The problem: after running the script for a while, I suddenly get a broken-pipe error on my SSH terminal. When I try to log back in with SSH, I keep getting "operation timed out" errors, so I can't even log back in to terminate the Python process and instead have to reboot the EC2 instance (which is a huge pain, especially since I'm using ephemeral storage).
My theory is that each time Requests makes a REST call, it opens a pair of ports between Python and Tomcat but never closes them when it's done. So Python keeps grabbing more and more ports and eventually either somehow grabs and locks the SSH port (booting me off), or simply exhausts all the ports, which causes the networking stack to fall over somehow (as I said, I'm out of my depth).
I also tried using httplib2, and was getting a similar problem.
Any ideas? If my port theory is correct, is there a way to force requests to surrender the port when it's done? Or otherwise is there at least a way to tell Ubuntu to keep the SSH port off-limits so that I can at least log back in and terminate the process?
Or is there some sort of best practice to using Python to make lots and lots of very simple REST calls?
Edit:
Solved...do:
s = requests.session()
s.config['keep_alive'] = False
Before making the request to force Requests to release connections when it's done.
My speculation:
https://github.com/kennethreitz/requests/blob/develop/requests/models.py#L539 sets conn to connectionpool.connection_from_url(url)
That leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L562, which leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L167.
This eventually leads to https://github.com/kennethreitz/requests/blob/develop/requests/packages/urllib3/connectionpool.py#L185:
def _new_conn(self):
    """
    Return a fresh :class:`httplib.HTTPConnection`.
    """
    self.num_connections += 1
    log.info("Starting new HTTP connection (%d): %s" %
             (self.num_connections, self.host))
    return HTTPConnection(host=self.host, port=self.port)
I would suggest hooking a handler up to that logger, and listening for lines that match that one. That would let you see how many connections are being created.
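A sketch of that idea: attach a counting handler to the connection pool logger (the logger name below is an assumption for the urllib3 vendored inside Requests; a standalone urllib3 would use "urllib3.connectionpool"):
import logging

class ConnectionCounter(logging.Handler):
    def __init__(self):
        logging.Handler.__init__(self)
        self.count = 0
    def emit(self, record):
        if "Starting new HTTP connection" in record.getMessage():
            self.count += 1

counter = ConnectionCounter()
pool_logger = logging.getLogger("requests.packages.urllib3.connectionpool")
pool_logger.setLevel(logging.INFO)
pool_logger.addHandler(counter)

# ... run the queries, then inspect counter.count to see how many
# new connections were created.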
Figured it out...Requests has a default 'Keep Alive' policy on connections which you have to explicitly override by doing
s = requests.session()
s.config['keep_alive'] = False
before you make a request.
From the doc:
"""
Keep-Alive
Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!
Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set prefetch to True or read the content property of the Response object.
If you’d like to disable keep-alive, you can simply set the keep_alive configuration to False:
s = requests.session()
s.config['keep_alive'] = False
"""
There may be a subtle bug in Requests here, because I WAS reading the .text and .content properties and it was still not releasing the connections. But explicitly setting keep_alive to False fixed the problem.
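As a rough usage sketch of that fix in the context of the original loop, using the old requests 0.x session config API quoted above; query_urls is a placeholder for the actual REST calls:
import requests

s = requests.session()
s.config['keep_alive'] = False  # old requests 0.x config API, per the answer above

for url in query_urls:  # placeholder for the hundreds of thousands of queries
    r = s.get(url)
    data = r.content  # read the body so the connection can be torn down cleanly
    # ... process data ...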

Disconnecting from host with Python Fabric when using the API

The website says:
Closing connections: Fabric's connection cache never closes connections itself – it leaves this up to whatever is using it. The fab tool does this bookkeeping for you: it iterates over all open connections and closes them just before it exits (regardless of whether the tasks failed or not).
Library users will need to ensure they explicitly close all open connections before their program exits, though we plan to make this easier in the future.
I have searched everywhere, but I can't find out how to disconnect or close the connections. I am looping through my hosts and setting env.host_string. It is working, but hangs when exiting. Any help on how to close? Just to reiterate, I am using the library, not a fabfile.
If you don't want to have to iterate through all open connections, fabric.network.disconnect_all() is what you're looking for. The docstring reads
"""
Disconnect from all currently connected servers.
Used at the end of fab's main loop, and also intended for use by library users.
"""
The main.py for fabric has this:
from fabric.state import commands, connections
for key in connections.keys():
    if state.output.status:
        print "Disconnecting from %s..." % denormalize(key),
    connections[key].close()
fabric.state.connections is a dict whose values are paramiko.SSHClient objects.
So off I go to close those.
You can disconnect a specific connection by host name using the following code snippet (with Fabric 1.10.1):
def disconnect(host):
    host = host or fabric.api.env.host_string
    if host and host in fabric.state.connections:
        fabric.state.connections[host].get_transport().close()
from fabric.network import disconnect_all
disconnect_all()
