Parallel-SSH - how to close ssh channel after a certain time? - python

Ok, so it's possible that the answer to this question is simply "stop using parallel-ssh and write your own code using netmiko/paramiko. Also, upgrade to python 3 already."
But here's my issue: I'm using parallel-ssh to try to hit as many as 80 devices at a time. These devices are notoriously unreliable, and they occasionally freeze up after giving one or two lines of output. Then, the parallel-ssh code hangs for hours, leaving the script running, well, until I kill it. I've jumped onto the VM running the scripts after a weekend and seen a job that's been stuck for 52 hours.
The relevant pieces of my first code, the one that hangs:
from pssh.pssh2_client import ParallelSSHClient

def remote_ssh(ip_list, ssh_user, ssh_pass, cmd):
    client = ParallelSSHClient(ip_list, user=ssh_user, password=ssh_pass, timeout=180,
                               retry_delay=60, pool_size=100, allow_agent=False)
    result = client.run_command(cmd, stop_on_errors=False)
    return result
The next thing I tried was the channel_timeout option, because if it takes more than 4 minutes to get the command output, then I know that the device froze, and I need to move on and cycle it later in the script:
from pssh.pssh_client import ParallelSSHClient

def remote_ssh(ip_list, ssh_user, ssh_pass, cmd):
    client = ParallelSSHClient(ip_list, user=ssh_user, password=ssh_pass, channel_timeout=180,
                               retry_delay=60, pool_size=100, allow_agent=False)
    result = client.run_command(cmd, stop_on_errors=False)
    return result
This version never actually connects to anything. Any advice? I haven't been able to find anything other than channel_timeout to attempt to kill an ssh session after a certain amount of time.

The code is creating a client object inside a function and then returning only the output of run_command, which includes remote channels to the SSH server.
Since the client object is never returned by the function, it goes out of scope and gets garbage collected by Python, which closes the connection.
Trying to use remote channels on a closed connection will never work. If you capture a stack trace of the stuck script, it is most probably hanging while trying to use a remote channel or connection.
Change your code to keep the client alive. The client should ideally also be reused.
from pssh.pssh2_client import ParallelSSHClient

def remote_ssh(ip_list, ssh_user, ssh_pass, cmd):
    client = ParallelSSHClient(ip_list, user=ssh_user, password=ssh_pass, timeout=180,
                               retry_delay=60, pool_size=100, allow_agent=False)
    result = client.run_command(cmd, stop_on_errors=False)
    return client, result
Make sure you understand where the code is going wrong before jumping to conclusions that will not solve the issue, i.e. capture a stack trace of where it is hanging. The same code doing the same thing will break the same way.
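For illustration, here is a minimal sketch of how a caller might consume the returned pair, assuming the parallel-ssh 1.x API (run_command returning a dict of host output, join() accepting a timeout, and pssh.exceptions.Timeout); the 240-second deadline is arbitrary:

from pssh.exceptions import Timeout

# sketch only: keep the client in scope while reading output, and give
# up on frozen hosts after a deadline instead of hanging forever
client, output = remote_ssh(ip_list, ssh_user, ssh_pass, cmd)
try:
    client.join(output, timeout=240)  # wait up to 4 minutes for all hosts
except Timeout:
    pass  # some hosts froze; their channels are handled per-host below
for host, host_output in output.items():
    try:
        for line in host_output.stdout:  # reading a frozen host's stdout can also block
            print("%s: %s" % (host, line))
    except Timeout:
        print("%s timed out - cycle it later" % host)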


Communication between two separate Python engines

The problem statement is as follows:
I am working with Abaqus, a program for analyzing mechanical problems. It is basically a standalone Python interpreter with its own objects etc. Within this program, I run a python script to set up my analysis (so this script can be modified). It also contains a method which has to be executed when an external signal is received. These signals come from the main script that I am running in my own Python engine.
For now, I have the following workflow:
The main script sets a boolean to True when the Abaqus script has to execute a specific function, and pickles this boolean into a file. The Abaqus script regularly checks this file to see whether the boolean has been set to true. If so, it does an analysis and pickles the output, so that the main script can read this output and act on it.
I am looking for a more efficient way to signal the other process to start the analysis, since there is a lot of unnecessary checking going on right now. Data exchange via pickle is not an issue for me, but a more efficient solution is certainly welcome.
Search results always give me solutions with subprocess or the like, which is for two processes started within the same interpreter. I have also looked at ZeroMQ, since it is supposed to achieve things like this, but I think it is overkill and would like a pure-Python solution. Both interpreters are running Python 2.7 (although different versions).
Edit:
Like @MattP, I'll add this statement of my understanding:
Background
I believe that you are running a product called abaqus. The abaqus product includes a linked-in python interpreter that you can access somehow (possibly by running abaqus python foo.py on the command line).
You also have a separate python installation, on the same machine. You are developing code, possibly including numpy/scipy, to run on that python installation.
These two installations are different: they have different binary interpreters, different libraries, different install paths, etc. But they live on the same physical host.
Your objective is to enable the "plain python" programs, written by you, to communicate with one or more scripts running in the "Abaqus python" environment, so that those scripts can perform work inside the Abaqus system, and return results.
Solution
Here is a socket based solution. There are two parts, abqlistener.py and abqclient.py. This approach has the advantage that it uses a well-defined mechanism for "waiting for work." No polling of files, etc. And it is a "hard" API. You can connect to a listener process from a process on the same machine, running the same version of python, or from a different machine, or from a different version of python, or from ruby or C or perl or even COBOL. It allows you to put a real "air gap" into your system, so you can develop the two parts with minimal coupling.
The server part is abqlistener. The intent is that you would copy some of this code into your Abaqus script. The abq process would then become a server, listening for connections on a specific port number, and doing work in response. Sending back a reply, or not. Et cetera.
I am not sure if you need to do setup work for each job. If so, that would have to be part of the connection. This would just start ABQ, listen on a port (forever), and deal with requests. Any job-specific setup would have to be part of the work process. (Maybe send in a parameter string, or the name of a config file, or whatever.)
The client part is abqclient. This could be moved into a module, or just copy/pasted into your existing non-ABQ program code. Basically, you open a connection to the right host:port combination, and you're talking to the server. Send in some data, get some data back, etc.
This stuff is mostly scraped from example code on-line. So it should look real familiar if you start digging into anything.
Here's abqlistener.py:
# The below usage example is completely bogus. I don't have abaqus, so
# I'm just running python2.7 abqlistener.py [options]
usage = """
abaqus python abqlistener.py [--host 127.0.0.1 | --host mypc.example.com ] \\
                             [ --port 2525 ]
Sets up a socket listener on the host interface specified (default: all
interfaces), on the given port number (default: 2525). When a connection
is made to the socket, begins processing data.
"""
import argparse

parser = argparse.ArgumentParser(description='Abaqus listener',
                                 add_help=True,
                                 usage=usage)
parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
                    help='interface IP address or name (default: empty string, i.e. all interfaces)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
                    help='port number of listener (default: 2525)')
args = parser.parse_args()
import SocketServer
import json

class AbqRequestHandler(SocketServer.BaseRequestHandler):
    """Request handler for our socket server.

    This class is instantiated whenever a new connection is made, and
    must override `handle(self)` in order to handle communicating with
    the client.
    """

    def do_work(self, data):
        "Do some work here. Call abaqus, whatever."
        print "DO_WORK: Doing work with data!"
        print data
        return {'desc': 'low-precision natural constants', 'pi': 3, 'e': 3}

    def handle(self):
        # Allow the client to send a 1kb message (file path?)
        self.data = self.request.recv(1024).strip()
        print "SERVER: {} wrote:".format(self.client_address[0])
        print self.data
        result = self.do_work(self.data)
        self.response = json.dumps(result)
        print "SERVER: response to {}:".format(self.client_address[0])
        print self.response
        self.request.sendall(self.response)

if __name__ == '__main__':
    print args
    server = SocketServer.TCPServer((args.host, args.port), AbqRequestHandler)
    print "Server starting. Press Ctrl+C to interrupt..."
    server.serve_forever()
And here's abqclient.py:
usage = """
python2.7 abqclient.py [--host HOST] [--port PORT]
Connect to abqlistener on HOST:PORT, send a message, wait for reply.
"""
import argparse
parser = argparse.ArgumentParser(description='Abacus listener',
add_help=True,
usage=usage)
parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
help='Interface IP address or name, or (default: empty string)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
help='port number of listener (default: 2525)')
args = parser.parse_args()
import json
import socket
message = "I get all the best code from stackoverflow!"
print "CLIENT: Creating socket..."
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print "CLIENT: Connecting to {}:{}.".format(args.host, args.port)
s.connect((args.host, args.port))
print "CLIENT: Sending message:", message
s.send(message)
print "CLIENT: Waiting for reply..."
data = s.recv(1024)
print "CLIENT: Got response:"
print json.loads(data)
print "CLIENT: Closing socket..."
s.close()
And here's what they print when I run them together:
$ python2.7 abqlistener.py --port 3434 &
[2] 44088
$ Namespace(host='', port=3434)
Server starting. Press Ctrl+C to interrupt...
$ python2.7 abqclient.py --port 3434
CLIENT: Creating socket...
CLIENT: Connecting to :3434.
CLIENT: Sending message: I get all the best code from stackoverflow!
CLIENT: Waiting for reply...
SERVER: 127.0.0.1 wrote:
I get all the best code from stackoverflow!
DO_WORK: Doing work with data!
I get all the best code from stackoverflow!
SERVER: response to 127.0.0.1:
{"pi": 3, "e": 3, "desc": "low-precision natural constants"}
CLIENT: Got response:
{u'pi': 3, u'e': 3, u'desc': u'low-precision natural constants'}
CLIENT: Closing socket...
References:
argparse, SocketServer, json, socket are all "standard" Python libraries.
To be clear, my understanding is that you are running Abaqus/CAE via a Python script as an independent process (let's call it abq.py), which checks for, opens, and reads a trigger file to determine if it should run an analysis. The trigger file is created by a second Python process (let's call it main.py). Finally, main.py waits to read the output file created by abq.py. You want a more efficient way to signal abq.py to run an analysis, and you're open to different techniques to exchange data.
As you mentioned, subprocess or multiprocessing might be an option. However, I think a simpler solution is to combine your two scripts, and optionally use a callback function to monitor the solution and process your output. I'll assume there is no need to have abq.py constantly running as a separate process, and that all analyses can be started from main.py whenever it is appropriate.
Let main.py have access to the Abaqus Mdb. If it's already built, you open it with:
mdb = openMdb(FileName)
A trigger file is not needed if main.py starts all analyses. For example:
if SomeCondition:
    j = mdb.Job(name=MyJobName, model=MyModelName)
    j.submit()
    j.waitForCompletion()
Once complete, main.py can read the output file and continue. This is straightforward if the data file was generated by the analysis itself (e.g. .dat or .odb files). On the other hand, if the output file is generated by some code in your current abq.py, then you can probably just include that code in main.py instead.
If that doesn't provide enough control, instead of the waitForCompletion method you can add a callback function to the monitorManager object (which is automatically created when you import the abaqus module: from abaqus import *). This allows you to monitor and respond to various messages from the solver, such as COMPLETED, ITERATION, etc. The callback function is defined like:
def onMessage(jobName, messageType, data, userData):
    if messageType == COMPLETED:
        pass  # do stuff
    else:
        pass  # other stuff
The callback is then added to the monitorManager, and the job is submitted:
monitorManager.addMessageCallback(jobName=MyJobName,
                                  messageType=ANY_MESSAGE_TYPE,
                                  callback=onMessage, userData=MyDataObj)
j = mdb.Job(name=MyJobName, model=MyModelName)
j.submit()
One of the benefits of this approach is that you can pass a Python object as the userData argument. This could potentially be your output file, or some other data container. You could probably process the output data within the callback function itself, as sketched below: for example, access the Odb and get the data, then do any manipulations as needed, without needing the external file at all.
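As a hedged illustration of that idea (not verified against a live Abaqus install: openOdb from odbAccess is part of the Abaqus scripting API, COMPLETED and ANY_MESSAGE_TYPE are the constants used above, and results_store is a hypothetical container):

from abaqus import *
from odbAccess import openOdb

results_store = {}  # hypothetical container handed to the callback as userData

def onMessage(jobName, messageType, data, userData):
    if messageType == COMPLETED:
        # pull results straight from the output database, no external file needed
        odb = openOdb(jobName + '.odb')
        userData[jobName] = list(odb.steps.keys())  # e.g. record the step names
        odb.close()

monitorManager.addMessageCallback(jobName=MyJobName,
                                  messageType=ANY_MESSAGE_TYPE,
                                  callback=onMessage, userData=results_store)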
I agree with the answer, except for some minor syntax problems.
Defining instance variables inside the handler is a no-no, not to mention they are not being defined in any sort of __init__() method. Subclass TCPServer and define your instance variables in TCPServer.__init__(). Everything else will work the same.
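A minimal sketch of that suggestion, assuming Python 2's SocketServer module; the jobs_handled counter is a hypothetical example of server-owned state:

import SocketServer

class AbqServer(SocketServer.TCPServer):
    def __init__(self, server_address, handler_cls):
        SocketServer.TCPServer.__init__(self, server_address, handler_cls)
        self.jobs_handled = 0  # instance variable defined in __init__, as suggested

class AbqRequestHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024).strip()  # local variable, not self.data
        self.server.jobs_handled += 1  # handlers reach server state via self.server
        self.request.sendall(data)

server = AbqServer(('', 2525), AbqRequestHandler)
server.serve_forever()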

Connecting via USB/Serial port to Newport CONEX-PP Motion Controller in Python

I'm having trouble getting my Windows 7 laptop to talk to a Newport CONEX-PP motion controller. I've tried python (Spyder/Anaconda) and a serial port streaming program called Termite and in either case the results are the same: no response from the device. The end goal is to communicate with the controller using python.
The controller connects to my computer via a USB cable they sold me that is explicitly for use with this device. The connector has a pair of lights that blink when the device receives data (red) or sends data (green). There is also a packaged GUI program that comes with the device, and it seems to work fine. I haven't tried every button, but the ones I have tried have the expected result.
The documentation for accessing this device is next to non-existent. The CD in the box has one way to connect to it, and the webpage linked above has a different way. The first way (the CD from the box) creates a hierarchy of modules that ends in a module Python does not recognize (this is a code snippet provided by Newport):
import sys
sys.path.append(r'C:\Newport\MotionControl\CONEX-PP\Bin')
import clr
clr.AddReference("Newport.CONEXPP.CommandInterface")
from CommandInterfaceConexPP import *
import System
instrument="COM5"
print 'Instrument Key=>', instrument
myPP = ConexPP()
ret = myPP.OpenInstrument(instrument)
print 'OpenInstrument => ', ret
result, response, errString = myPP.SR_Get(1)
That last line returns:
Traceback (most recent call last):
  File "<ipython-input-2-5d824f156d8f>", line 2, in <module>
    result, response, errString = myPP.SR_Get(1)
TypeError: No method matches given arguments
I'm guessing this is because the various module references are screwy in some way. But I don't know; I'm relatively new to Python, and the only other time I used it for serial communication, the example files provided by the vendor simply worked.
The second way to communicate with the controller is via the visa module (the CONEX_SMC_common module imports the visa module):
import sys
sys.path.append(r'C:\Newport\NewportPython')

class CONEX(CONEXSMC):
    def __init__(self):
        super(CONEX, self).__init__()
        device_key = 'com5'
        self.connect = self.rm.open_resource(device_key, baud_rate=57600, timeout=2000, data_bits=8,
                                             write_termination='\r\n', read_termination='\r\n')

mine = CONEX()
mine.connect.read()
That last mine.connect.read() command returns:
VisaIOError: VI_ERROR_TMO (-1073807339): Timeout expired before operation completed.
If, instead, I write to the port with mine.connect.write('VE'), the light on the connector flashes red as if it received some data, and the call returns:
(4L, <StatusCode.success: 0>)
If I ask for the dictionary of the "mine" object mine.__dict__, I get:
{'connect': <'SerialInstrument'(u'ASRL5::INSTR')>,
'device_key': u'ASRL5::INSTR',
'list_of_devices': (u'ASRL5::INSTR',),
'rm': )>}
The ASRL5::INSTR resource for VISA is at least related to the controller, because when I unplug the device from the laptop it disappears and the GUI program will stop working.
Maybe there is something simple I'm missing here. I have NI VISA installed, and I'm not just running with the DLL that comes from the website. Oh, and I found a GitHub question/answer with this exact problem, but the end result makes no sense: the thread is closed after hgrecco tells the poster to use "open_resource", which is precisely what I am using.
Results with Termite are the same, I can apparently connect to the controller and get the light to flash red, but it never responds, either through Termite or by performing the requested action.
I've tried pySerial too:
import serial
ser = serial.Serial('com5')
ser.write('VE\r\n')
ser.read()
Python just waits there forever, I assume because I haven't set a timeout limit.
So, if anyone has any experience with this particular motion controller, Newport devices or with serial port communication in general and can shed some light on this problem I'd much appreciate it. After about 7 hours on this I'm out of ideas.
After coming back at this with fresh eyes and finding this GitHub discussion, I decided to give pySerial another shot, because neither of the other methods in my question was working yet. The following code works:
import serial
ser = serial.Serial('com5',baudrate=115200,timeout=1.0,parity=serial.PARITY_NONE, stopbits=serial.STOPBITS_ONE, bytesize=serial.EIGHTBITS)
ser.write('1TS?\r\n')
ser.read(10)
and returns
'1TS000033\r'
The string is 9 characters long, so my arbitrarily chosen 10-character read ended up picking up one of the termination characters.
The problem is that the Python files that come with the device, or that are available on the website, are at best incomplete and shouldn't be trusted for anything. The GUI manual has the required baud rate. I used Termite to figure out the stop bit settings, or at least a combination that works.
3.5 years later...
Here is a gist with a class that supports Conex-CC
It took me hours to solve this!
My device is a Conex-CC, not a PP, but it seems to be the same idea.
For me, the serial solution didn't work because there was absolutely no response from the serial port, either through the code or by direct TeraTerm access.
So I tried to adapt your code to my device (because for the Conex-CC, even the code you were trying was not given!).
It is important to say that import clr relies on pip install pythonnet, not pip install clr, which installs something related to colors instead.
After getting your error, I was looking for this Pythonnet error and have found this answer, which led me to the final solution:
import clr
# We assume Newport.CONEXCC.CommandInterface.dll is copied to our folder
clr.AddReference("Newport.CONEXCC.CommandInterface")
from CommandInterfaceConexCC import *
instrument="COM4"
print('Instrument Key=>', instrument)
myCC = ConexCC()
ret = myCC.OpenInstrument(instrument)
print('OpenInstrument => ', ret)
response = 0
errString = ''
result, response, errString = myCC.SR_Get(1, response, errString)
print('Positive SW Limit: result=%d,response=%.2f,errString=\'%s\''%(result,response,errString))
myCC.CloseInstrument()
And here is the result I've got:
Instrument Key=> COM4
OpenInstrument => 0
Positive SW Limit: result=0,response=25.00,errString=''
For the Conex-CC, serial connections are possible using both pyvisa:
import pyvisa
rm = pyvisa.ResourceManager()
inst = rm.open_resource('ASRL6::INSTR',baud_rate=921600, write_termination='\r\n',read_termination='\r\n')
pos = inst.query('01PA?').strip()
and pySerial:
import serial

ser = serial.Serial(port='com6', baudrate=921600, bytesize=8, parity='N', stopbits=1, xonxoff=True)
ser.write('01PA?\r\n'.encode('ascii'))  # commands are terminated with \r\n, matching the pyvisa settings above
ser.read_until(b'\r\n')
All the commands are according to the manual

python - losing connection to postgresql in daemon

I am rewriting a Python script to store data from an Arduino in a PostgreSQL database, wanting to run it as a daemon using python-daemon. The original script works fine, but in the daemon, I cannot write to the database. The first attempt ends up with:
<class 'psycopg2.DatabaseError'>, DatabaseError('SSL SYSCALL error: EOF detected\n',)
and then:
<class 'psycopg2.InterfaceError'>, InterfaceError('cursor already closed',)
In the working script, I do:
connstring="dbname='"+dbdatabase+"' user='"+dbusername+"' host='"+dbhost+"'password='"+dbpassword+"'"
try:
conn = psycopg2.connect(connstring)
cur=conn.cursor()
except:
my_logger.critical(appname+": Unable to connect to the database")
sys.exit(2)
sql="insert into measure (sensorid,typeid,value) VALUES(%s,%s,%s)"
< more to set up serialport, logging and so on>
while 1:
< fetch a data set and split it to a list >
for (i,val) in enumerate measures:
try:
cur.execute(sql,(sensors[i],typeid[i],val))
conn.commit()
except:
self.logger.error(appname+": error 106 :"+str(sys.exc_info()))
I have a feeling this may be some of the same problem that I initially had with the serial connection (see Serial port does not work in rewritten Python code), so I have tried to fiddle with files_preserve, doing:
self.files_preserve=range(daemon.daemon.get_maximum_file_descriptors()+1)
which as far as I can understand should keep open all file handles, but to no avail.
In the daemon, I have tried first to set up the database connection as attributes in __init__, i.e.:
self.conn = psycopg2.connect(connstring)
self.cur = self.conn.cursor()
and then do the inserts in the run method. I also tried creating the connection at the top of the run method, and even setting it up as a global object, but in all cases, something seems to be killing the database connection. Any clues? (Or any clues to where to find documentation (other than the source) for the daemon module?)
Both the daemon and the database are running on Debian Linux systems with Python 2.7 and PostgreSQL 8.4.
As far as I can tell from its source, daemon.runner works by forking and then executing the run method of the daemon app you supplied.
That means that you're creating the database connection in one process, but then try to use it in a forked process, which psycopg2 doesn't like:
libpq connections shouldn't be used by forked processes, so [...] make sure to create the connections after the fork.
In this case that means: move your call to psycopg2.connect into the run method.
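A minimal sketch of that fix, with illustrative names (App and the stored connstring are hypothetical); the point is simply that psycopg2.connect is called inside run(), i.e. after the fork:

import psycopg2

class App(object):
    def __init__(self, connstring):
        self.connstring = connstring  # store configuration only; no connection yet

    def run(self):
        # executed in the forked daemon process, so the libpq
        # connection is created after the fork, as psycopg2 requires
        conn = psycopg2.connect(self.connstring)
        cur = conn.cursor()
        cur.execute("insert into measure (sensorid,typeid,value) VALUES (%s,%s,%s)",
                    (1, 1, 42.0))
        conn.commit()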

Python Twisted - Server communication

I'm having a bizarre issue. Basically, the problem I have right now involves two different LineReceiver servers that are connected to each other. Essentially, if I were to input something into server A, then I want some output to appear in server B, and vice versa. I am running two servers from two different source files (also running them as different processes via & in a shell script), ServerA.py and ServerB.py, where the ports are 12650 and 12651 respectively. I am also connecting to each server using telnet.
from twisted.internet import protocol, reactor
from twisted.protocols.basic import LineReceiver

class ServerA(LineReceiver):
    def connectionMade(self):
        self.transport.write("Is Server A\n")

    def dataReceived(self, data):
        self.sendLine(data)

    def lineReceived(self, line):
        self.transport.write(line)

def main():
    client = protocol.ClientFactory()
    client.protocol = ServerA
    reactor.connectTCP("localhost", 12650, client)

    server = protocol.ServerFactory()
    server.protocol = ServerA
    reactor.listenTCP(12651, server)

    reactor.run()

if __name__ == '__main__':
    main()
My issue is the use of sendLine. When I try to do a sendLine call from serverA with some arbitrary string, serverA ends up spitting the exact string back out instead of sending it down the connection that was made in main(). Why exactly is this happening? I've been looking around and tried each solution I came across, but I can't seem to get it to work properly. The bizarre thing is that my friend is doing essentially the same thing and gets working results, and this is the simplest program I could think of to try to isolate the cause.
In any case, the gist is, I'm expecting to get the input I put into serverA to appear in serverB.
Note: Server A and Server B have the exact same source code save for the class names and ports.
You have overridden dataReceived. That means that lineReceived will never be called, because it is LineReceiver's dataReceived implementation that eventually calls lineReceived, and you're never calling up to it.
You should only need to override lineReceived and then things should work as you expect.
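A minimal sketch of the corrected protocol, assuming the echo behavior from the question; note there is no dataReceived override, so LineReceiver's own implementation parses the stream and invokes lineReceived:

from twisted.internet import protocol, reactor
from twisted.protocols.basic import LineReceiver

class ServerA(LineReceiver):
    def connectionMade(self):
        self.sendLine("Is Server A")

    def lineReceived(self, line):
        # only called once LineReceiver's dataReceived has buffered a full line
        self.sendLine(line)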

Detect a closed connection in python's telnetlib

I'm using python's telnetlib to connect to a remote telnet server. I'm having a hard time detecting if the connection is still open, or if the remote server closed it on me.
I will notice the connection is closed the next time I try to read or write to it, but would like to have a way to detect it on demand.
Is there a way to send some sort of an "Are You There" packet without affecting the actual connection? The telnet RFC defines "Are You There" and "NOP" commands; I'm just not sure how to get telnetlib to send them!
You should be able to send a NOP this way:
from telnetlib import IAC, NOP
...
telnet_object.sock.sendall(IAC + NOP)
I've noticed that for some reason sending only once was not enough... I "discovered" this by accident; I had something like this:
def check_alive(telnet_obj):
    try:
        if telnet_obj.sock:  # this way I've taken care of the problem if .close() was called
            telnet_obj.sock.send(IAC + NOP)  # notice the use of send instead of sendall
            return True
    except:
        logger.info("telnet send failed - dead")

# later on
logger.info("is alive %s", check_alive(my_telnet_obj))
if check_alive(my_telnet_obj):
    pass  # do whatever
After a few runs, I noticed that the log message said "is alive True" but the code didn't enter the "if", and the log message "telnet send failed - dead" was printed. So in my latest implementation, as described here, I just call the .send() method 3 times (in case 2 are not enough).
That's my 2 cents, hope it helps
Following up on David's solution, after close() on the interface, the sock attribute changes from being a socket._socketobject to being the integer 0. The call to .sendall fails with an AttributeError if the socket is closed, so you may as well just check its type.
Tested with Linux and Windows 7.
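A hedged sketch of that type check (Python 2 assumed, where telnetlib's sock attribute is a socket object while open and the integer 0 after close()):

import socket

def is_closed(telnet_obj):
    # after telnet_obj.close(), .sock is the integer 0, not a socket object
    return not isinstance(telnet_obj.sock, socket.socket)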
The best way to detect whether a connection is closed is via the socket object, so it's easier to check it this way:
def is_connected(telnet_obj):
    return telnet_obj.get_socket().fileno()
If the connection is closed, fileno() returns -1.
I took this code from this question.
