Retaining an SSH session across multiple API calls - python

So, interesting situation here. I currently have a simple Flask API that connects to a network device on the backend and retrieves command output:
from netmiko import ConnectHandler
from netmiko.ssh_exception import NetMikoTimeoutException, NetMikoAuthenticationException
from paramiko.ssh_exception import SSHException


def _execute_cli(self, opt, command):
    """
    Internal method to create a netmiko connection and
    execute a command.
    """
    try:
        net_connect = ConnectHandler(**opt)
        cli_output = net_connect.send_command(command)
    except (NetMikoTimeoutException, NetMikoAuthenticationException) as e:
        reason = str(e)
        raise ValueError('Failed to execute cli on %s due to %s' % (opt['ip'], reason))
    except SSHException as e:
        reason = str(e)
        raise ValueError('Failed to execute cli on %s due to %s' % (opt['ip'], reason))
    except Exception as e:
        reason = str(e)
        raise ValueError('Failed to execute cli on %s due to %s' % (opt['ip'], reason))
    return cli_output


def disconnect(connection):
    connection.disconnect()
Each command output is cached locally for a period of time. The problem is that multiple calls can hit the same device simultaneously, and a device has a connection limit (let's say 7). If too many calls are made at once, an SSH connection error occurs because the maximum number of connections has been reached.
What I'm looking to do is retain a single session per device across these API calls for a specified period of time (let's say 5 minutes), so that I'm not filling up the connections on the device.
Please advise.

OK, I am not sure what you mean by "retain a single session across these API calls", but maybe you can do something like this.
Create a dictionary of all devices and their current user counts, refreshed every 5 minutes. For example:
self.users = {device: 0 for device in devices}
Then, whenever _execute_cli() is called, you can check the number of users on the device before sending commands (on Cisco the command is 'show users') and update the variable: self.users[device] = some_number.
So you can simply check like:
if self.users[device] < 7:
    try:
        net_connect = ConnectHandler(**opt)
        cli_output = net_connect.send_command(command)
        ...
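A different approach, closer to what the question asks for, is to cache one live Netmiko connection per device and reuse it until it has been idle for some period. The sketch below is only illustrative and rests on assumptions not in the original post: a single-process Flask app, a hypothetical CONNECTION_TTL of 5 minutes, and the opt dict keyed by its 'ip' field. A production version would also have to cope with per-worker state and with sessions the device drops on its own.
import threading
import time

from netmiko import ConnectHandler

CONNECTION_TTL = 300        # hypothetical: recycle a session after 5 minutes of idleness
_lock = threading.Lock()
_sessions = {}              # ip -> (connection, last_used timestamp)


def get_connection(opt):
    """Return a cached live connection for opt['ip'], creating one if needed."""
    ip = opt['ip']
    now = time.time()
    with _lock:
        conn, last_used = _sessions.get(ip, (None, 0))
        if conn is not None and now - last_used > CONNECTION_TTL:
            # Session has been idle too long; drop it and reconnect.
            try:
                conn.disconnect()
            except Exception:
                pass
            conn = None
        if conn is None:
            conn = ConnectHandler(**opt)
        _sessions[ip] = (conn, now)
        return conn


def execute_cli(opt, command):
    conn = get_connection(opt)
    return conn.send_command(command)
With a cache like this, concurrent API calls for the same device share one SSH session instead of each opening its own, so the device's connection limit stops being the bottleneck; note that send_command calls on a shared Netmiko session should still be serialized.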

Related

Check if a socket is already opened - python

I have an Azure timer function which runs every minute and uses a socket to get data from a website. I don't want to establish a connection every time the timer runs the function. So, is there a way in Python to check whether a socket is already open for a particular website on a particular port?
Or, is there a way to re-use a socket in time-triggered applications?
# Open socket
try:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(20)  # 20 sec timeout
    if is_socket_open(sock):
        logging.info("Socket is already open")
    else:
        logging.info("No socket was open. Opening a new one...")
        sock.connect(server_address)
    sock.settimeout(None)
    logging.info(f"Connected to {sock}")
    return sock
except socket.gaierror as e:
    logging.exception(f"Error connecting to remote server {e}")
    time.sleep(20)
except socket.error as e:
    logging.exception(f"Connection error {e}")
    time.sleep(20)
except Exception as e:
    logging.exception(f"An exception occurred: {e}")
    time.sleep(20)


def is_socket_open(sock: socket.socket) -> bool:
    try:
        # this will try to read bytes without blocking and also without removing them from buffer (peek only)
        data = sock.recv(16, socket.MSG_PEEK)
        if len(data) == 0:
            return True
    except socket.timeout:
        return False  # socket is not connected yet, therefore receiving timed out
    except BlockingIOError:
        return False  # socket is open and reading from it would block
    except ConnectionResetError:
        return True   # socket was closed for some other reason
    except Exception as e:
        logging.exception(f"unexpected exception when checking if a socket is closed: {e}")
        return False
    return False
So this entire process runs every minute.
You can always use global variables to reuse objects in future invocations. The following example was copied from Google Cloud Platform documentation, but you can apply the same concept to your Azure Function:
# Global (instance-wide) scope
# This computation runs at instance cold-start
instance_var = heavy_computation()

def scope_demo(request):
    # Per-function scope
    # This computation runs every time this function is called
    function_var = light_computation()
    return 'Instance: {}; function: {}'.format(instance_var, function_var)
In your case, you can declare sock as a global variable and reuse it in future warm-start invocations. You should also increase the timeout to above 60 seconds, given that you're triggering your Azure function every minute.
However, keep in mind that there is no guarantee that the state of the function will be preserved between invocations. For instance, in auto-scaling situations, a new socket would be opened.
Microsoft Azure also says the following in regards to client connections:
To avoid holding more connections than necessary, reuse client instances rather than creating new ones with each function invocation. We recommend reusing client connections for any language that you might write your function in.
See also:
Manage connections in Azure Functions
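To make the global-variable approach concrete for a timer function, here is a minimal sketch of a module-level socket reused across warm invocations. The endpoint, the 90-second timeout, and the liveness check are illustrative assumptions rather than part of the answer above; on a cold start or scale-out the global is simply empty and a new socket gets opened, which matches the caveat about preserved state.
import logging
import socket

import azure.functions as func

HOST, PORT = "example.com", 443   # hypothetical endpoint

_sock = None                      # global scope: survives warm starts only


def _is_alive(sock: socket.socket) -> bool:
    """Peek without consuming data; a clean EOF or a socket error means the peer is gone."""
    try:
        sock.setblocking(False)
        return sock.recv(16, socket.MSG_PEEK) != b""
    except BlockingIOError:
        return True               # nothing to read, but the connection is still up
    except OSError:
        return False              # reset/closed; caller should reconnect
    finally:
        try:
            sock.setblocking(True)
        except OSError:
            pass


def _get_socket() -> socket.socket:
    """Return the cached socket, reconnecting if it looks dead or was never opened."""
    global _sock
    if _sock is not None and not _is_alive(_sock):
        try:
            _sock.close()
        except OSError:
            pass
        _sock = None
    if _sock is None:
        logging.info("Opening a new socket...")
        _sock = socket.create_connection((HOST, PORT), timeout=90)
    return _sock


def main(mytimer: func.TimerRequest) -> None:
    sock = _get_socket()
    # ... read from sock here ...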

python mysql.connector write failure on connection disconnection stalls for 30 seconds

I use python module mysql.connector for connecting to an AWS RDS instance.
Now, as we know, if we do not send a request to the SQL server for a while, the connection is dropped.
To handle this, I reconnect to SQL in case a read/write request fails.
My problem is with the "request fails" part: it takes a significant amount of time to fail, and only then can I reconnect and retry my request (I have pointed this out with a comment in the code snippet).
For a real-time application such as mine, this is a problem. How could I solve this? Is it possible to find out if the disconnection has already happened so that I can try a new connection without having to wait on a read/write request?
Here is how I handle it in my code right now:
def fetchFromDB(self, vid_id):
    fetch_query = "SELECT * FROM <db>"
    success = False
    attempts = 0
    output = []
    while not success and attempts < self.MAX_CONN_ATTEMPTS:
        try:
            if self.cnx is None:
                self._connectDB_()
            if self.cnx:
                cursor = self.cnx.cursor()  # MY PROBLEM: This step takes too long to fail in case the connection has expired.
                cursor.execute(fetch_query)
                output = []
                for entry in cursor:
                    output.append(entry)
                cursor.close()
                success = True
            attempts = attempts + 1
        except Exception as ex:
            logging.warning("Error")
            if self.cnx is not None:
                try:
                    self.cnx.close()
                except Exception as ex:
                    pass
                finally:
                    self.cnx = None
    return output
In my application I cannot tolerate a delay of more than 1 second while reading from mysql.
When configuring MySQL, I'm only applying the following settings:
SQL.user = '<username>'
SQL.password = '<password>'
SQL.host = '<AWS RDS HOST>'
SQL.port = 3306
SQL.raise_on_warnings = True
SQL.use_pure = True
SQL.database = <database-name>
There are some contrivances, like generating an ALARM signal if a function call takes too long, but those can be tricky with database connections or may not work at all; other SO questions cover that approach.
One approach would be to set the connection_timeout to a known value when you create the connection, making sure it's shorter than the server-side timeout. Then, if you track the age of the connection yourself, you can preemptively reconnect before it gets too old, and clean up the previous connection.
Alternatively you could occasionally execute a no-op query like select now(); to keep the connection open. You would still want to recycle the connection every so often.
But if there are long enough periods between queries (where they might expire) why not open a new connection for each query?
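A minimal sketch of that age-tracking idea, assuming mysql.connector and a hypothetical MAX_AGE chosen to stay below the server-side timeout; the placeholder credentials simply mirror the settings from the question.
import time
import mysql.connector

MAX_AGE = 240          # hypothetical: recycle well before the server drops the connection
_cnx = None
_cnx_created = 0.0


def get_connection():
    """Return a connection, preemptively recycling it before it can expire."""
    global _cnx, _cnx_created
    if _cnx is not None and time.time() - _cnx_created > MAX_AGE:
        try:
            _cnx.close()
        except Exception:
            pass
        _cnx = None
    if _cnx is None:
        _cnx = mysql.connector.connect(
            host="<AWS RDS HOST>", port=3306,
            user="<username>", password="<password>",
            database="<database-name>",
            connection_timeout=5,   # known, short timeout as suggested above
            use_pure=True,
        )
        _cnx_created = time.time()
    return _cnx


def fetch(query):
    cursor = get_connection().cursor()
    cursor.execute(query)
    rows = cursor.fetchall()
    cursor.close()
    return rows
Recycling on a timer this way means the expensive failure path (a dead connection discovered only when cursor() or execute() hangs) should rarely be hit.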

How to use boto.manage.cmdshell with ssh-agent?

I'm using boto.manage.cmdshell to create an SSH connection to EC2 instances. Currently, every time, the user has to enter their password to decrypt the private key (e.g. ~/.ssh/id_rsa).
Now I want to make the work-flow more convenient for the users and support ssh-agent.
So far I tried without any success:
set ssh_key_file to None when creating FakeServer:
The result was: SSHException('Key object may not be empty')
set ssh_pwd to None when creating SSHClient:
The result was: paramiko.ssh_exception.PasswordRequiredException: Private key file is encrypted
Is there a way to use ssh-agent with boto.manage.cmdshell? I know that paramiko supports it, which boto is using.
(There's another stackoverflow page with some related answers)
Can't get amazon cmd shell to work through boto
However, you're definitely better off using per-person SSH keys. But if you have those, are they in the target host's authorized_keys file? If so, users just add their key normally with ssh-add (in an ssh-agent session, which is usually the default in Linux). You need to test with ssh itself first, so that any ssh-agent/ssh-add issues are clearly resolved beforehand.
Once you're certain the keys work with ssh normally, the question is whether boto thought of ssh-agent at all. Paramiko's SSHClient() can use it, if I remember correctly; the paramiko code I remember looks roughly like:
paramiko.SSHClient().connect(host, timeout=10, username=user,
                             key_filename=seckey, compress=True)
The seckey was optional, so the key_filename would be empty, and that invoked checking the ssh-agent. Boto's version seems to want to force using a private key file with an explicit call like this, I think with the idea that each instance will have an assigned key and password to decrypt it:
self._pkey = paramiko.RSAKey.from_private_key_file(server.ssh_key_file,
                                                   password=ssh_pwd)
If so, it means that using boto directly conflicts with using ssh-agent and the standard model of per-user logins and logging of connections by user.
The paramiko.SSHClient() is much more capable, and documents ssh-agent support explicitly (from pydoc paramiko.SSHClient):
Authentication is attempted in the following order of priority:
- The C{pkey} or C{key_filename} passed in (if any)
- Any key we can find through an SSH agent
- Any "id_rsa" or "id_dsa" key discoverable in C{~/.ssh/}
- Plain username/password auth, if a password was given
Basically, you'd have to use paramiko instead of boto.
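For illustration, a minimal paramiko-only sketch that leans on the agent; host and user are placeholders, and allow_agent/look_for_keys are paramiko's defaults, spelled out here only for clarity.
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())

# With no pkey, key_filename, or password given, paramiko falls back to any
# keys held by the ssh-agent, then to id_rsa/id_dsa under ~/.ssh/.
client.connect(host, username=user, timeout=10,
               allow_agent=True, look_for_keys=True)

stdin, stdout, stderr = client.exec_command('uname -a')
print(stdout.read().decode())
client.close()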
We had one issue with paramiko: the connection would not be ready immediately in many cases, requiring us to send a test command through and check the output before sending real commands. Part of this was that we'd start firing off SSH commands (with paramiko) right after creating an EC2 or VPC instance, so there was no guarantee it'd be listening for an SSH connection, and paramiko would tend to lose commands delivered too soon. We used some code like this to cope:
import os
import re
import sys
import time

import paramiko
from paramiko import SSHException

debug = False  # assumed module-level flag (defined elsewhere in the original script)


def SshCommand(**kwargs):
    '''
    Run a command on a remote host via SSH.

    Connect to the given host=<host-or-ip>, as user=<user> (defaulting to
    $USER), with optional seckey=<secret-key-file>, timeout=<seconds>
    (default 10), and execute a single command=<command> (assumed to be
    addressing a unix shell at the far end).

    Returns the exit status of the remote command (otherwise would be
    None save that an exception should be raised instead).

    Example: SshCommand(host=host, user=user, command=command, timeout=timeout,
                        seckey=seckey)
    '''
    remote_exit_status = None
    if debug:
        sys.stderr.write('SshCommand kwargs: %r\n' % (kwargs,))
    paranoid = True
    host = kwargs['host']
    user = kwargs['user'] if kwargs['user'] else os.environ['USER']
    seckey = kwargs['seckey']
    timeout = kwargs['timeout']
    command = kwargs['command']

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    time_end = time.time() + int(timeout)
    ssh_is_up = False
    while time.time() < time_end:
        try:
            ssh.connect(host, timeout=10, username=user, key_filename=seckey,
                        compress=True)
            if paranoid:
                token_generator = 'echo xyz | tr a-z A-Z'
                token_result = 'XYZ'  # possibly buried in other text
                stdin, stdout, stderr = ssh.exec_command(token_generator)
                lines = ''.join(stdout.readlines())
                if re.search(token_result, lines):
                    ssh_is_up = True
                    if debug:
                        sys.stderr.write("[%d] command stream is UP!\n"
                                         % time.time())
                    break
            else:
                ssh_is_up = True
                break
        except paramiko.PasswordRequiredException as e:
            sys.stderr.write("usage idiom clash: %r\n" % (e,))
            return False
        except Exception as e:
            sys.stderr.write("[%d] command stream not yet available\n"
                             % time.time())
            if debug:
                sys.stderr.write("exception is %r\n" % (e,))
            time.sleep(1)

    if ssh_is_up:
        # ideally this is where Bcfg2 or Chef or such ilk get called.
        # stdin, stdout, stderr = ssh.exec_command(command)
        chan = ssh._transport.open_session()
        chan.exec_command(command)
        # note that out/err doesn't have inter-stream ordering locked down.
        stdout = chan.makefile('rb', -1)
        stderr = chan.makefile_stderr('rb', -1)
        sys.stdout.write(''.join(stdout.readlines()))
        sys.stderr.write(''.join(stderr.readlines()))
        remote_exit_status = chan.recv_exit_status()
        if debug:
            sys.stderr.write('exit status was: %d\n' % remote_exit_status)

    ssh.close()
    if None == remote_exit_status:
        raise SSHException('remote command result undefined')
    return remote_exit_status
We were also trying to enforce not logging into prod directly, so this particular wrapper (an ssh-send-command script) encouraged scripting despite the vagaries of whether Amazon had bothered to start the instance in a timely fashion.
I found a solution to my problem by creating a class SSHClientAgent which inherits from boto.manage.cmdshell.SSHClient and overrides __init__(). In the new __init__() I replaced the call to paramiko.RSAKey.from_private_key_file() with None.
Here is my new class:
import os

import boto.manage.cmdshell
import paramiko


class SSHClientAgent(boto.manage.cmdshell.SSHClient):

    def __init__(self, server,
                 host_key_file='~/.ssh/known_hosts',
                 uname='root', timeout=None, ssh_pwd=None):
        self.server = server
        self.host_key_file = host_key_file
        self.uname = uname
        self._timeout = timeout
        # replace the call to get the private key
        self._pkey = None
        self._ssh_client = paramiko.SSHClient()
        self._ssh_client.load_system_host_keys()
        self._ssh_client.load_host_keys(os.path.expanduser(host_key_file))
        self._ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.connect()
In my function where I create the ssh connection I check for the environment variable SSH_AUTH_SOCK and decide which ssh client to create.
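For reference, a minimal sketch of that selection logic, assuming the SSHClientAgent class above; the helper name and the getpass prompt in the fallback branch are illustrative, not from the original answer.
import os
from getpass import getpass

import boto.manage.cmdshell


def make_ssh_client(server, uname='root'):
    """Use ssh-agent when SSH_AUTH_SOCK is set, otherwise fall back to a key file."""
    if os.environ.get('SSH_AUTH_SOCK'):
        # An agent is running; let paramiko pick up whatever keys it holds.
        return SSHClientAgent(server, uname=uname)
    # No agent: use boto's stock client, which decrypts server.ssh_key_file.
    return boto.manage.cmdshell.SSHClient(
        server, uname=uname, ssh_pwd=getpass('SSH key password: '))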

Resume FTP download after timeout

I'm downloading files from a flaky FTP server that often times out during file transfer and I was wondering if there was a way to reconnect and resume the download. I'm using Python's ftplib. Here is the code that I am using:
#! /usr/bin/python

import ftplib
import os
import socket
import sys

#--------------------------------#
# Define parameters for ftp site #
#--------------------------------#
site = 'a.really.unstable.server'
user = 'anonymous'
password = 'someperson#somewhere.edu'
root_ftp_dir = '/directory1/'
root_local_dir = '/directory2/'

#---------------------------------------------------------------
# Tuple of order numbers to download. Each web request generates
# an order number
#---------------------------------------------------------------
order_num = ('1', '2', '3', '4')

#----------------------------------------------------------------#
# Loop through each order. Connect to server on each loop. There #
# might be a time out for the connection therefore reconnect for #
# every new order number                                         #
#----------------------------------------------------------------#

# First change local directory
os.chdir(root_local_dir)

# Begin loop through
for order in order_num:
    print 'Begin Processing order number %s' % order

    # Connect to FTP site
    try:
        ftp = ftplib.FTP(host=site, timeout=1200)
    except (socket.error, socket.gaierror), e:
        print 'ERROR: Unable to reach "%s"' % site
        sys.exit()

    # Login
    try:
        ftp.login(user, password)
    except ftplib.error_perm:
        print 'ERROR: Unable to login'
        ftp.quit()
        sys.exit()

    # Change remote directory to location of order
    try:
        ftp.cwd(root_ftp_dir + order)
    except ftplib.error_perm:
        print 'Unable to CD to "%s"' % (root_ftp_dir + order)
        sys.exit()

    # Get a list of files
    try:
        filelist = ftp.nlst()
    except ftplib.error_perm:
        print 'Unable to get file list from "%s"' % order
        sys.exit()

    #---------------------------------#
    # Loop through files and download #
    #---------------------------------#
    for each_file in filelist:
        file_local = open(each_file, 'wb')
        try:
            ftp.retrbinary('RETR %s' % each_file, file_local.write)
            file_local.close()
        except ftplib.error_perm:
            print 'ERROR: cannot read file "%s"' % each_file
            os.unlink(each_file)

    ftp.quit()
    print 'Finished Processing order number %s' % order

sys.exit()
The error that I get:
socket.error: [Errno 110] Connection timed out
Any help is greatly appreciated.
Resuming a download through FTP using only standard facilities (see RFC959) requires use of the block transmission mode (section 3.4.2), which can be set using the MODE B command. Although this feature is technically required for conformance to the specification, I'm not sure all FTP server software implements it.
In the block transmission mode, as opposed to the stream transmission mode, the server sends the file in chunks, each of which has a marker. This marker may be re-submitted to the server to restart a failed transfer (section 3.5).
The specification says:
[...] a restart procedure is provided to protect users from gross system failures (including failures of a host, an FTP-process, or the underlying network).
However, AFAIK, the specification does not define a required lifetime for markers. It only says the following:
The marker information has meaning only to the sender, but must consist of printable characters in the default or negotiated language of the control connection (ASCII or EBCDIC). The marker could represent a bit-count, a record-count, or any other information by which a system may identify a data checkpoint. The receiver of data, if it implements the restart procedure, would then mark the corresponding position of this marker in the receiving system, and return this information to the user.
It should be safe to assume that servers implementing this feature will provide markers that are valid between FTP sessions, but your mileage may vary.
A simple example for implementing a resumable FTP download using Python ftplib:
import time
from ftplib import FTP

# host, user, passwd are assumed to be defined elsewhere


def connect():
    print("Connecting...")
    return FTP(host, user, passwd)


ftp = None
finished = False
with open('bigfile', 'wb') as f:
    while not finished:
        if ftp is None:
            ftp = connect()
        try:
            rest = f.tell()
            if rest == 0:
                rest = None
                print("Starting new transfer...")
            else:
                print(f"Resuming transfer from {rest}...")
            ftp.retrbinary('RETR bigfile', f.write, rest=rest)
            print("Done")
            finished = True
        except Exception as e:
            ftp = None
            sec = 5
            print(f"Transfer failed: {e}, will retry in {sec} seconds...")
            time.sleep(sec)
More fine-grained exception handling is advisable.
Similarly for uploads:
Handling disconnects in Python ftplib FTP transfers file upload
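If the script itself can be restarted between attempts (so the open file handle from the loop above is lost), the same REST logic can key off the size of the partially downloaded local file. A minimal sketch, under the same assumption that host, user, passwd and the remote file name are defined as above:
import os
from ftplib import FTP

filename = 'bigfile'

# Resume from however many bytes are already on disk (None means a fresh download).
rest = os.path.getsize(filename) if os.path.exists(filename) else 0

ftp = FTP(host, user, passwd)
with open(filename, 'ab') as f:          # append mode, so existing bytes are kept
    ftp.retrbinary('RETR ' + filename, f.write, rest=rest or None)
ftp.quit()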
To do this, you would have to keep the interrupted download, then figure out which parts of the file you are missing, download those parts and then connect them together. I'm not sure how to do this, but there is a download manager for Firefox and Chrome called DownThemAll that does this. Although the code is not written in python (I think it's JavaScript), you could look at the code and see how it does this.
DownThemAll - http://www.downthemall.net/

Python DBAPI time out for connections?

I was attempting to test for connection failure, and unfortunately it's not failing if the IP address of the host is firewalled.
This is the code:
def get_connection(self, conn_data):
    rtu, hst, prt, usr, pwd, db = conn_data
    try:
        self.conn = pgdb.connect(host=hst + ":" + prt, user=usr, password=pwd, database=db)
        self.cur = self.conn.cursor()
        return True
    except pgdb.Error as e:
        logger.exception("Error trying to connect to the server.")
        return False

if self.get_connection(conn_data):
    # Do stuff here:
If I try to connect to a known server but give an incorrect user name, it will trigger the exception and fail.
However, if I try to connect to a machine that does not respond (firewalled), it never gets past self.conn = pgdb.connect().
How do I wait for, or test for, a timeout rather than have my app appear to hang when a user mistypes an IP address?
What you are experiencing is the pain of firewalls, and the timeout is the normal TCP timeout.
You can usually pass a timeout argument to the connect function. If it doesn't exist, you could try setting a socket-level default timeout:
import socket
socket.setdefaulttimeout(10) # sets timeout to 10 seconds
This setting applies to all socket-based connections you make, and they will fail after 10 seconds of waiting.
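Applied to the code from the question, a minimal sketch; the 10-second value is illustrative, and since socket.setdefaulttimeout() is process-wide it should be set once at startup rather than per call.
import socket

import pgdb  # PyGreSQL

socket.setdefaulttimeout(10)  # process-wide default for newly created sockets


def get_connection(self, conn_data):
    rtu, hst, prt, usr, pwd, db = conn_data
    try:
        # With the default timeout set, an unreachable (firewalled) host now
        # fails after roughly 10 seconds instead of hanging on the TCP timeout.
        self.conn = pgdb.connect(host=hst + ":" + prt, user=usr,
                                 password=pwd, database=db)
        self.cur = self.conn.cursor()
        return True
    except (pgdb.Error, OSError):
        logger.exception("Error trying to connect to the server.")
        return False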
