Communication between two separate Python engines - python

The problem statement is as follows:
I am working with Abaqus, a program for analyzing mechanical problems. It is basically a standalone Python interpreter with its own objects etc. Within this program, I run a python script to set up my analysis (so this script can be modified). It also contains a method which has to be executed when an external signal is received. These signals come from the main script that I am running in my own Python engine.
For now, I have the following workflow:
The main script sets a boolean to True when the Abaqus script has to execute a specific function, and pickles this boolean into a file. The Abaqus script regularly checks this file to see whether the boolean has been set to true. If so, it does an analysis and pickles the output, so that the main script can read this output and act on it.
I am looking for a more efficient way to signal the other process to start the analysis, since there is a lot of unnecessary checking going on right know. Data exchange via pickle is not an issue for me, but a more efficient solution is certainly welcome.
Search results always give me solutions with subprocess or the like, which is for two processes started within the same interpreter. I have also looked at ZeroMQ since this is supposed to achieve things like this, but I think this is overkill and would like a solution in python. Both interpreters are running python 2.7 (although different versions)

Edit:
Like #MattP, I'll add this statement of my understanding:
Background
I believe that you are running a product called abaqus. The abaqus product includes a linked-in python interpreter that you can access somehow (possibly by running abaqus python foo.py on the command line).
You also have a separate python installation, on the same machine. You are developing code, possibly including numpy/scipy, to run on that python installation.
These two installations are different: they have different binary interpreters, different libraries, different install paths, etc. But they live on the same physical host.
Your objective is to enable the "plain python" programs, written by you, to communicate with one or more scripts running in the "Abaqus python" environment, so that those scripts can perform work inside the Abaqus system, and return results.
Solution
Here is a socket based solution. There are two parts, abqlistener.py and abqclient.py. This approach has the advantage that it uses a well-defined mechanism for "waiting for work." No polling of files, etc. And it is a "hard" API. You can connect to a listener process from a process on the same machine, running the same version of python, or from a different machine, or from a different version of python, or from ruby or C or perl or even COBOL. It allows you to put a real "air gap" into your system, so you can develop the two parts with minimal coupling.
The server part is abqlistener. The intent is that you would copy some of this code into your Abaqus script. The abq process would then become a server, listening for connections on a specific port number, and doing work in response. Sending back a reply, or not. Et cetera.
I am not sure if you need to do setup work for each job. If so, that would have to be part of the connection. This would just start ABQ, listen on a port (forever), and deal with requests. Any job-specific setup would have to be part of the work process. (Maybe send in a parameter string, or the name of a config file, or whatever.)
The client part is abqclient. This could be moved into a module, or just copy/pasted into your existing non-ABQ program code. Basically, you open a connection to the right host:port combination, and you're talking to the server. Send in some data, get some data back, etc.
This stuff is mostly scraped from example code on-line. So it should look real familiar if you start digging into anything.
Here's abqlistener.py:
# The below usage example is completely bogus. I don't have abaqus, so
# I'm just running python2.7 abqlistener.py [options]
usage = """
abacus python abqlistener.py [--host 127.0.0.1 | --host mypc.example.com ] \\
[ --port 2525 ]
Sets up a socket listener on the host interface specified (default: all
interfaces), on the given port number (default: 2525). When a connection
is made to the socket, begins processing data.
"""
import argparse
parser = argparse.ArgumentParser(description='Abacus listener',
add_help=True,
usage=usage)
parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
help='Interface IP address or name, or (default: empty string)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
help='port number of listener (default: 2525)')
args = parser.parse_args()
import SocketServer
import json
class AbqRequestHandler(SocketServer.BaseRequestHandler):
"""Request handler for our socket server.
This class is instantiated whenever a new connection is made, and
must override `handle(self)` in order to handle communicating with
the client.
"""
def do_work(self, data):
"Do some work here. Call abaqus, whatever."
print "DO_WORK: Doing work with data!"
print data
return { 'desc': 'low-precision natural constants','pi': 3, 'e': 3 }
def handle(self):
# Allow the client to send a 1kb message (file path?)
self.data = self.request.recv(1024).strip()
print "SERVER: {} wrote:".format(self.client_address[0])
print self.data
result = self.do_work(self.data)
self.response = json.dumps(result)
print "SERVER: response to {}:".format(self.client_address[0])
print self.response
self.request.sendall(self.response)
if __name__ == '__main__':
print args
server = SocketServer.TCPServer((args.host, args.port), AbqRequestHandler)
print "Server starting. Press Ctrl+C to interrupt..."
server.serve_forever()
And here's abqclient.py:
usage = """
python2.7 abqclient.py [--host HOST] [--port PORT]
Connect to abqlistener on HOST:PORT, send a message, wait for reply.
"""
import argparse
parser = argparse.ArgumentParser(description='Abacus listener',
add_help=True,
usage=usage)
parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
help='Interface IP address or name, or (default: empty string)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
help='port number of listener (default: 2525)')
args = parser.parse_args()
import json
import socket
message = "I get all the best code from stackoverflow!"
print "CLIENT: Creating socket..."
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print "CLIENT: Connecting to {}:{}.".format(args.host, args.port)
s.connect((args.host, args.port))
print "CLIENT: Sending message:", message
s.send(message)
print "CLIENT: Waiting for reply..."
data = s.recv(1024)
print "CLIENT: Got response:"
print json.loads(data)
print "CLIENT: Closing socket..."
s.close()
And here's what they print when I run them together:
$ python2.7 abqlistener.py --port 3434 &
[2] 44088
$ Namespace(host='', port=3434)
Server starting. Press Ctrl+C to interrupt...
$ python2.7 abqclient.py --port 3434
CLIENT: Creating socket...
CLIENT: Connecting to :3434.
CLIENT: Sending message: I get all the best code from stackoverflow!
CLIENT: Waiting for reply...
SERVER: 127.0.0.1 wrote:
I get all the best code from stackoverflow!
DO_WORK: Doing work with data!
I get all the best code from stackoverflow!
SERVER: response to 127.0.0.1:
{"pi": 3, "e": 3, "desc": "low-precision natural constants"}
CLIENT: Got response:
{u'pi': 3, u'e': 3, u'desc': u'low-precision natural constants'}
CLIENT: Closing socket...
References:
argparse, SocketServer, json, socket are all "standard" Python libraries.

To be clear, my understanding is that you are running Abaqus/CAE via a Python script as an independent process (let's call it abq.py), which checks for, opens, and reads a trigger file to determine if it should run an analysis. The trigger file is created by a second Python process (let's call it main.py). Finally, main.py waits to read the output file created by abq.py. You want a more efficient way to signal abq.py to run an analysis, and you're open to different techniques to exchange data.
As you mentioned, subprocess or multiprocessing might be an option. However, I think a simpler solution is to combine your two scripts, and optionally use a callback function to monitor the solution and process your output. I'll assume there is no need to have abq.py constantly running as a separate process, and that all analyses can be started from main.py whenever it is appropriate.
Let main.py have access to the Abaqus Mdb. If it's already built, you open it with:
mdb = openMdb(FileName)
A trigger file is not needed if main.py starts all analyses. For example:
if SomeCondition:
j = mdb.Job(name=MyJobName, model=MyModelName)
j.submit()
j.waitForCompletion()
Once complete, main.py can read the output file and continue. This is straightforward if the data file was generated by the analysis itself (e.g. .dat or .odb files). OTH, if the output file is generated by some code in your current abq.py, then you can probably just include it in main.py instead.
If that doesn't provide enough control, instead of the waitForCompletion method you can add a callback function to the monitorManager object (which is automatically created when you import the abaqus module: from abaqus import *). This allows you to monitor and respond to various messages from the solver, such as COMPLETED, ITERATION, etc. The callback function is defined like:
def onMessage(jobName, messageType, data, userData):
if messageType == COMPLETED:
# do stuff
else:
# other stuff
Which is then added to the monitorManager and the job is called :
monitorManager.addMessageCallback(jobName=MyJobName,
messageType=ANY_MESSAGE_TYPE, callback=onMessage, userData=MyDataObj)
j = mdb.Job(name=MyJobName, model=MyModelName)
j.submit()
One of the benefits to this approach is that you can pass in a Python object as the userData argument. This could potentially be your output file, or some other data container. You could probably figure out how to process the output data within the callback function - for example, access the Odb and get the data, then do any manipulations as needed without needing the external file at all.

I agree with the answer, except for some minor syntax problems.
defining instance variables inside the handler is a no no. not to mention they are not being defined in any sort of init() method. Subclass TCPServer and define your instance variables in TCPServer.init(). Everything else will work the same.

Related

Parallel-SSH - how to close ssh channel after a certain time?

Ok, so it's possible that the answer to this question is simply "stop using parallel-ssh and write your own code using netmiko/paramiko. Also, upgrade to python 3 already."
But here's my issue: I'm using parallel-ssh to try to hit as many as 80 devices at a time. These devices are notoriously unreliable, and they occasionally freeze up after giving one or two lines of output. Then, the parallel-ssh code hangs for hours, leaving the script running, well, until I kill it. I've jumped onto the VM running the scripts after a weekend and seen a job that's been stuck for 52 hours.
The relevant pieces of my first code, the one that hangs:
from pssh.pssh2_client import ParallelSSHClient
def remote_ssh(ip_list, ssh_user, ssh_pass, cmd):
client = ParallelSSHClient(ip_list, user=ssh_user, password=ssh_pass, timeout=180, retry_delay=60, pool_size=100, allow_agent=False)
result = client.run_command(cmd, stop_on_errors=False)
return result
The next thing I tried was the channel_timout option, because if it takes more than 4 minutes to get the command output, then I know that the device froze, and I need to move on and cycle it later in the script:
from pssh.pssh_client import ParallelSSHClient
def remote_ssh(ip_list, ssh_user, ssh_pass, cmd):
client = ParallelSSHClient(ip_list, user=ssh_user, password=ssh_pass, channel_timeout=180, retry_delay=60, pool_size=100, allow_agent=False)
result = client.run_command(cmd, stop_on_errors=False)
return result
This version never actually connects to anything. Any advice? I haven't been able to find anything other than channel_timeout to attempt to kill an ssh session after a certain amount of time.
The code is creating a client object inside a function and then returning only the output of run_command which includes remote channels to the SSH server.
Since the client object is never returned by the function it goes out of scope and gets garbage collected by Python which closes the connection.
Trying to use remote channels on a closed connection will never work. If you capture stack trace of the stuck script it is most probably hanging at using remote channel or connection.
Change your code to keep the client alive. Client should ideally also be reused.
from pssh.pssh2_client import ParallelSSHClient
def remote_ssh(ip_list, ssh_user, ssh_pass, cmd):
client = ParallelSSHClient(ip_list, user=ssh_user, password=ssh_pass, timeout=180, retry_delay=60, pool_size=100, allow_agent=False)
result = client.run_command(cmd, stop_on_errors=False)
return client, result
Make sure you understand where the code is going wrong before jumping to conclusions that will not solve the issue, ie capture stack trace of where it is hanging. Same code doing the same thing will break the same way..

Is it possible to have a python shell in a py2exe build?

I distribute a software package for windows via distutils and py2exe. For development purposes, I'd like to be able to have access to a python console inside the py2exe build. I see there is a python27.dll file in the py2exe build, so I hope I can leverage that to launch a python terminal.
Is it possible to take an existing, or modify distutils/py2exe to get end user access to a Python shell in the py2exe environment?
There's a really bare-bones way to accomplish this as documented by Matt Anderson from the pymntos google group. I've seen some variations on it, but this one came up first when I googled. :p
The juice is in the stdlib code module, leveraging code.InteractiveInterpeter. The only thing you'd have to do is add this in as a thread as the application starts. Then, when the app starts you can telnet 'localhost 7777' and you should drop into a Python interpreter.
The problem with doing it as a thread though - you can't very easily twiddle variables / data in the main thread without doing some sort of queue and passing things around.
You could alternatively have an async socket - that way you could twiddle stuff as a main-thread participant. Thats inherently dangerous for a host of reasons. But, we're talking bare metal.
If you use the Twisted library, you could use Twisted Conch, which allows you to create an SSH or Telnet server that can talk to the rest of your app. However, this might be a problem since you're using the event loop from the UI to process events - you can't have two event loops. If you're using Qt, there is a Twisted Qt Reactor event loop. If it's windows or something else.. I have no idea. But, this should at least give you a few things to consider.
Original link is: https://groups.google.com/forum/?fromgroups#!topic/pymntos/-Mjviu7R2bs
import socket
import code
import sys
class MyConsole(code.InteractiveConsole):
def __init__(self, rfile, wfile, locals=None):
self.rfile = rfile
self.wfile = wfile
code.InteractiveConsole.__init__(
self, locals=locals, filename='<MyConsole>')
def raw_input(self, prompt=''):
self.wfile.write(prompt)
return self.rfile.readline().rstrip()
def write(self, data):
self.wfile.write(data)
netloc = ('', 7777)
servsock = socket.socket()
servsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
servsock.bind(netloc)
servsock.listen(5)
print 'listening'
sock, _ = servsock.accept()
print 'accepted'
rfile = sock.makefile('r', 0)
sys.stdout = wfile = sock.makefile('w', 0)
console = MyConsole(rfile, wfile)
console.interact()

Python Twisted - Server communication

I'm having a bizarre issue. Basically, the problem I have right now is dealing with two different LineReceiver servers that are connected to each other. Essentially, if I were to input something into server A, then I want some output to appear in server B. And I would like to do this vice versa. I am running two servers on two different source files (also running them on different processes via & shellscript) ServerA.py and ServerB.py where the ports are (12650 and 12651) respectively. I am also connecting to each server using telnet.
from twisted.internet import protocol, reactor
from twisted.protocols.basic import LineReceiver
class ServerA(LineReceiver);
def connectionMade(self):
self.transport.write("Is Server A\n")
def dataReceived(self, data):
self.sendLine(data)
def lineReceived(self, line):
self.transport.write(line)
def main():
client = protocol.ClientFactory()
client.protocol = ServerA
reactor.connectTCP("localhost", 12650, client)
server = protocol.ServerFactory()
server.protocol = ServerA
reactor.listenTCP(12651, server)
reactor.run()
if __name__ == '__main__':
main()
My issue is the use of sendLine. When I try to do a sendLine call from serverA with some arbitrary string, serverA ends up spitting out the exact string instead of sending it down the connection which was done in main(). Exactly why is this happening? I've been looking around and tried each solution I came across and I can't seem to get it to work properly. The bizarre thing is my friend is essentially doing the same thing and he gets some working results but this is the simplest program I could think of to try to figure out the cause for this strange phenomenon.
In any case, the gist is, I'm expecting to get the input I put into serverA to appear in serverB.
Note: Server A and Server B have the exact same source code save for the class names and ports.
You have overridden dataReceived. That means that lineReceived will never be called, because it is LineReceiver's dataReceived implementation that eventually calls lineReceived, and you're never calling up to it.
You should only need to override lineReceived and then things should work as you expect.

UNIX named PIPE end of file

I'm trying to use a unix named pipe to output statistics of a running service. I intend to provide a similar interface as /proc where one can see live stats by catting a file.
I'm using a code similar to this in my python code:
while True:
f = open('/tmp/readstatshere', 'w')
f.write('some interesting stats\n')
f.close()
/tmp/readstatshere is a named pipe created by mknod.
I then cat it to see the stats:
$ cat /tmp/readstatshere
some interesting stats
It works fine most of the time. However, if I cat the entry several times in quick successions, sometimes I get multiple lines of some interesting stats instead of one. Once or twice, it has even gone into an infinite loop printing that line forever until I killed it. The only fix that I've got so far is to put a delay of let's say 500ms after f.close() to prevent this issue.
I'd like to know why exactly this happens and if there is a better way of dealing with it.
Thanks in advance
A pipe is simply the wrong solution here. If you want to present a consistent snapshot of the internal state of your process, write that to a temporary file and then rename it to the "public" name. This will prevent all issues that can arise from other processes reading the state while you're updating it. Also, do NOT do that in a busy loop, but ideally in a thread that sleeps for at least one second between updates.
What about a UNIX socket instead of a pipe?
In this case, you can react on each connect by providing fresh data just in time.
The only downside is that you cannot cat the data; you'll have to create a new socket handle and connect() to the socket file.
MYSOCKETFILE = '/tmp/mysocket'
import socket
import os
try:
os.unlink(MYSOCKETFILE)
except OSError: pass
s = socket.socket(socket.AF_UNIX)
s.bind(MYSOCKETFILE)
s.listen(10)
while True:
s2, peeraddr = s.accept()
s2.send('These are my actual data')
s2.close()
Program querying this socket:
MYSOCKETFILE = '/tmp/mysocket'
import socket
import os
s = socket.socket(socket.AF_UNIX)
s.connect(MYSOCKETFILE)
while True:
d = s.recv(100)
if not d: break
print d
s.close()
I think you should use fuse.
it has python bindings, see http://pypi.python.org/pypi/fuse-python/
this allows you to compose answers to questions formulated as posix filesystem system calls
Don't write to an actual file. That's not what /proc does. Procfs presents a virtual (non-disk-backed) filesystem which produces the information you want on demand. You can do the same thing, but it'll be easier if it's not tied to the filesystem. Instead, just run a web service inside your Python program, and keep your statistics in memory. When a request comes in for the stats, formulate them into a nice string and return them. Most of the time you won't need to waste cycles updating a file which may not even be read before the next update.
You need to unlink the pipe after you issue the close. I think this is because there is a race condition where the pipe can be opened for reading again before cat finishes and it thus sees more data and reads it out, leading to multiples of "some interesting stats."
Basically you want something like:
while True:
os.mkfifo(the_pipe)
f = open(the_pipe, 'w')
f.write('some interesting stats')
f.close()
os.unlink(the_pipe)
Update 1: call to mkfifo
Update 2: as noted in the comments, there is a race condition in this code as well with multiple consumers.

Execute python code without invoking import statement each time

Here is a sample python script. How do I run this script multiple times from command line so that the import line is not called every time? The import statement takes too long to load.
import arcpy
val = arcpy.GetCellValue_management("D:\dem-merged\lidar_wsg84", "-95.090174910630012 29.973962146120652", "")
print str(val)
This problem has no solution if you strictly want this script "to be called from another program. by issuing 'python script.py' on command line".
If you want to do the "heavy import" only once, you have to start python script only once.
Think about starting a daemon, which will start once and then process calls from other program. This way all initialization has to be done only one time and next calls will be fast.
And if you split your python code into two parts (first part for daemon, second for daemon client), you'll be able to call 'python client.py' from another program, but actual computation will be performed by daemon, which is started just one time.
As example:
daemon.py
import socket
#import arcpy
def actual_work():
#val = arcpy.GetCellValue_management("D:\dem-merged\lidar_wsg84", "-95.090174910630012 29.973962146120652", "")
#return str(val)
return 'dummy_reply'
def main():
sock = socket.socket( socket.AF_INET, socket.SOCK_DGRAM )
try:
sock.bind( ('127.0.0.1', 6666) )
while True:
data, addr = sock.recvfrom( 4096 )
reply = actual_work()
sock.sendto(reply, addr)
except KeyboardInterrupt:
pass
finally:
sock.close()
if __name__ == '__main__':
main()
client.py
import socket
import sys
def main():
sock = socket.socket( socket.AF_INET, socket.SOCK_DGRAM )
sock.settimeout(1)
try:
sock.sendto('', ('127.0.0.1', 6666))
reply, _ = sock.recvfrom(4096)
print reply
except socket.timeout:
sys.exit(1)
finally:
sock.close()
if __name__ == '__main__':
main()
It's virtually impossible. Once you leave the interpreter, the modules that were imported are no longer in the memory. It's similar to asking Firefox to save large webpages in memory because the read rate to the cache takes too long. Once Firefox (or Python) is shut off, it's pretty much bye-bye anything in the RAM.
You can make the load time faster, but at your own risk. By running
python -O
you can make it go a bit faster. You can also add another 'O' to make it go just a bit faster. However, this can make some programs buggy and doesn't always work.
You could copy the functions you need into your program by doing
from arcpy import <what you need>
and that might make things go slightly faster.
As far as I know the module gets imported once. So if you do:
import a
import a
it only gets imported once. So instead of running the script many times, maybe you can change it to make all the copies in one go.
If you have to run this specific script many times, I think you can't avoid the import and you'll have to import it every time.
One solution I can think of is to have a server process that runs persistently that does the actual work, while the script that's actually invoked from the command line merely issues requests to that script. This is a fair bit of work, but it may be worth it.
The only solution I can think of is to copy the individual function(s) you need into your code manually, if what you need to execute is small enough.
If you need help on how to do this, just ask in the comments.
Looking at your use case (calling it from a Ruby on Rails webservice), one of the easiest ways would be to use XML-RPC. Use the SimpleXMLRPCServer from the python standard lib, and then use a ruby client (ruby seems to have xmlrpc in the standard lib)?
Easy.
Write your own simple shell using the cmd module and use the runpy module to run your scripts. Import you big module in the shell program and pass it to the programs using init_globals
Look through the docs for http://pypi.python.org/pypi/cmd2/ and it should be fairly clear how you can write your own simple shell, even if it just has two commands, one to edit a file and one to run it.
runpy is part of the Python standard library http://docs.python.org/library/runpy.html and you may not need it, but it is useful to know that the import and module loading mechanism can be controlled and even modified by your command shell.
Have you ever wondered where the name "var1" goes when you execute something like var1 = 25? How does Python find what var1 refers to when you later execute print var1? The answer is that these names are in a dictionary and if you understand what Python dictionaries are and what they can do, it seems like an obvious solution to the problem of connecting names with values. But there's more. Python can have lots of namespaces and you can manipulate those namespaces the same way you manipulate dictionaries. Read this http://www.diveintopython.net/html_processing/locals_and_globals.html to understand the locals and globals namespace. Here is another discussion that will help http://lucumr.pocoo.org/2011/2/1/exec-in-python/
Play around with exec like in this question globals and locals in python exec() until you understand how it works. Then build your command shell to import the module one time at the beginning, and write your scripts to only import the module if it is not already available. When the script is run from inside your shell, the module will already be there.

Categories