Execute python code without invoking import statement each time - python

Here is a sample python script. How do I run this script multiple times from command line so that the import line is not called every time? The import statement takes too long to load.
import arcpy
val = arcpy.GetCellValue_management("D:\dem-merged\lidar_wsg84", "-95.090174910630012 29.973962146120652", "")
print str(val)

This problem has no solution if you strictly want this script "to be called from another program. by issuing 'python script.py' on command line".
If you want to do the "heavy import" only once, you have to start python script only once.
Think about starting a daemon, which will start once and then process calls from other program. This way all initialization has to be done only one time and next calls will be fast.
And if you split your python code into two parts (first part for daemon, second for daemon client), you'll be able to call 'python client.py' from another program, but actual computation will be performed by daemon, which is started just one time.
As example:
daemon.py
import socket
#import arcpy
def actual_work():
#val = arcpy.GetCellValue_management("D:\dem-merged\lidar_wsg84", "-95.090174910630012 29.973962146120652", "")
#return str(val)
return 'dummy_reply'
def main():
sock = socket.socket( socket.AF_INET, socket.SOCK_DGRAM )
try:
sock.bind( ('127.0.0.1', 6666) )
while True:
data, addr = sock.recvfrom( 4096 )
reply = actual_work()
sock.sendto(reply, addr)
except KeyboardInterrupt:
pass
finally:
sock.close()
if __name__ == '__main__':
main()
client.py
import socket
import sys
def main():
sock = socket.socket( socket.AF_INET, socket.SOCK_DGRAM )
sock.settimeout(1)
try:
sock.sendto('', ('127.0.0.1', 6666))
reply, _ = sock.recvfrom(4096)
print reply
except socket.timeout:
sys.exit(1)
finally:
sock.close()
if __name__ == '__main__':
main()

It's virtually impossible. Once you leave the interpreter, the modules that were imported are no longer in the memory. It's similar to asking Firefox to save large webpages in memory because the read rate to the cache takes too long. Once Firefox (or Python) is shut off, it's pretty much bye-bye anything in the RAM.
You can make the load time faster, but at your own risk. By running
python -O
you can make it go a bit faster. You can also add another 'O' to make it go just a bit faster. However, this can make some programs buggy and doesn't always work.
You could copy the functions you need into your program by doing
from arcpy import <what you need>
and that might make things go slightly faster.

As far as I know the module gets imported once. So if you do:
import a
import a
it only gets imported once. So instead of running the script many times, maybe you can change it to make all the copies in one go.
If you have to run this specific script many times, I think you can't avoid the import and you'll have to import it every time.

One solution I can think of is to have a server process that runs persistently that does the actual work, while the script that's actually invoked from the command line merely issues requests to that script. This is a fair bit of work, but it may be worth it.

The only solution I can think of is to copy the individual function(s) you need into your code manually, if what you need to execute is small enough.
If you need help on how to do this, just ask in the comments.

Looking at your use case (calling it from a Ruby on Rails webservice), one of the easiest ways would be to use XML-RPC. Use the SimpleXMLRPCServer from the python standard lib, and then use a ruby client (ruby seems to have xmlrpc in the standard lib)?

Easy.
Write your own simple shell using the cmd module and use the runpy module to run your scripts. Import you big module in the shell program and pass it to the programs using init_globals
Look through the docs for http://pypi.python.org/pypi/cmd2/ and it should be fairly clear how you can write your own simple shell, even if it just has two commands, one to edit a file and one to run it.
runpy is part of the Python standard library http://docs.python.org/library/runpy.html and you may not need it, but it is useful to know that the import and module loading mechanism can be controlled and even modified by your command shell.
Have you ever wondered where the name "var1" goes when you execute something like var1 = 25? How does Python find what var1 refers to when you later execute print var1? The answer is that these names are in a dictionary and if you understand what Python dictionaries are and what they can do, it seems like an obvious solution to the problem of connecting names with values. But there's more. Python can have lots of namespaces and you can manipulate those namespaces the same way you manipulate dictionaries. Read this http://www.diveintopython.net/html_processing/locals_and_globals.html to understand the locals and globals namespace. Here is another discussion that will help http://lucumr.pocoo.org/2011/2/1/exec-in-python/
Play around with exec like in this question globals and locals in python exec() until you understand how it works. Then build your command shell to import the module one time at the beginning, and write your scripts to only import the module if it is not already available. When the script is run from inside your shell, the module will already be there.

Related

is it possible to pass data from one python program to another python program? [duplicate]

Is it possible -- other than by using something like a .txt/dummy file -- to pass a value from one program to another?
I have a program that uses a .txt file to pass a starting value to another program. I update the value in the file in between starting the program each time I run it (ten times, essentially simultaneously). Doing this is fine, but I would like to have the 'child' program report back to the 'mother' program when it is finished, and also report back what files it found to download.
Is it possible to do this without using eleven files to do it (that's one for each instance of the 'child' to 'mother' reporting, and one file for the 'mother' to 'child')? I am talking about completely separate programs, not classes or functions or anything like that.
To operate efficently, and not be waiting around for hours for everything to complete, I need the 'child' program to run ten times and get things done MUCH faster. Thus I run the child program ten times and give each program a separate range to check through.
Both programs run fine, I but would like to get them to run/report back and forth with each other and hopefully not be using file 'transmission' to accomplish the task, especially on the child-mother side of the transferring of data.
'Mother' program...currently
import os
import sys
import subprocess
import time
os.chdir ('/media/')
#find highest download video
Hival = open("Highest.txt", "r")
Histr = Hival.read()
Hival.close()
HiNext = str(int(Histr)+1)
#setup download #1
NextVal = open("NextVal.txt","w")
NextVal.write(HiNext)
NextVal.close()
#call download #1
procs=[]
proc=subprocess.Popen(['python','test.py'])
procs.append(proc)
time.sleep(2)
#setup download #2-11
Histr2 = int(Histr)/10000
Histr2 = Histr2 + 1
for i in range(10):
Hiint = str(Histr2)+"0000"
NextVal = open("NextVal.txt","w")
NextVal.write(Hiint)
NextVal.close()
proc=subprocess.Popen(['python','test.py'])
procs.append(proc)
time.sleep(2)
Histr2 = Histr2 + 1
for proc in procs:
proc.wait()
'Child' program
import urllib
import os
from Tkinter import *
import time
root = Tk()
root.title("Audiodownloader")
root.geometry("200x200")
app = Frame(root)
app.grid()
os.chdir('/media/')
Fileval = open('NextVal.txt','r')
Fileupdate = Fileval.read()
Fileval.close()
Fileupdate = int(Fileupdate)
Filect = Fileupdate/10000
Filect2 = str(Filect)+"0009"
Filecount = int(Filect2)
while Fileupdate <= Filecount:
root.title(Fileupdate)
url = 'http://www.yourfavoritewebsite.com/audio/encoded/'+str(Fileupdate)+'.mp3'
urllib.urlretrieve(url,str(Fileupdate)+'.mp3')
statinfo = os.stat(str(Fileupdate)+'.mp3')
if statinfo.st_size<10000L:
os.remove(str(Fileupdate)+'.mp3')
time.sleep(.01)
Fileupdate = Fileupdate+1
root.update_idletasks()
I'm trying to convert the original VB6 program over to Linux and make it much easier to use at the same time. Hence the lack of .mainloop being missing. This was my first real attempt at anything in Python at all hence the lack of def or classes. I'm trying to come back and finish this up after 1.5 months of doing nothing with it mostly due to not knowing how to. In research a little while ago I found this is WAY over my head. I haven't ever did anything with threads/sockets/client/server interaction so I'm purely an idiot in this case. Google anything on it and I just get brought right back here to stackoverflow.
Yes, I want 10 running copies of the program at the same time, to save time. I could do without the gui interface if it was possible for the program to report back to 'mother' so the mother could print on the screen the current value that is being searched. As well as if the child could report back when its finished and if it had any file that it downloaded successfully(versus downloaded and then erased due to being to small). I would use the successful download information to update Highest.txt for the next time the program got ran.
I think this may clarify things MUCH better...that or I don't understand the nature of using server/client interaction:) Only reason time.sleep is in the program was due to try to make sure that the files could get written before the next instance of the child program got ran. I didn't know for sure what kind of timing issue I may run into so I included those lines for safety.
This can be implemented using a simple client/server topology using the multiprocessing library. Using your mother/child terminology:
server.py
from multiprocessing.connection import Listener
# client
def child(conn):
while True:
msg = conn.recv()
# this just echos the value back, replace with your custom logic
conn.send(msg)
# server
def mother(address):
serv = Listener(address)
while True:
client = serv.accept()
child(client)
mother(('', 5000))
client.py
from multiprocessing.connection import Client
c = Client(('localhost', 5000))
c.send('hello')
print('Got:', c.recv())
c.send({'a': 123})
print('Got:', c.recv())
Run with
$ python server.py
$ python client.py
When you talk about using txt to pass information between programs, we first need to know what language you're using.
Within my knowledge of Java and Python achi viable despite laborious depensendo the amount of information that wants to work.
In python, you can use the library that comes with it for reading and writing txt and schedule execution, you can use the apscheduler.

Communication between two separate Python engines

The problem statement is as follows:
I am working with Abaqus, a program for analyzing mechanical problems. It is basically a standalone Python interpreter with its own objects etc. Within this program, I run a python script to set up my analysis (so this script can be modified). It also contains a method which has to be executed when an external signal is received. These signals come from the main script that I am running in my own Python engine.
For now, I have the following workflow:
The main script sets a boolean to True when the Abaqus script has to execute a specific function, and pickles this boolean into a file. The Abaqus script regularly checks this file to see whether the boolean has been set to true. If so, it does an analysis and pickles the output, so that the main script can read this output and act on it.
I am looking for a more efficient way to signal the other process to start the analysis, since there is a lot of unnecessary checking going on right know. Data exchange via pickle is not an issue for me, but a more efficient solution is certainly welcome.
Search results always give me solutions with subprocess or the like, which is for two processes started within the same interpreter. I have also looked at ZeroMQ since this is supposed to achieve things like this, but I think this is overkill and would like a solution in python. Both interpreters are running python 2.7 (although different versions)
Edit:
Like #MattP, I'll add this statement of my understanding:
Background
I believe that you are running a product called abaqus. The abaqus product includes a linked-in python interpreter that you can access somehow (possibly by running abaqus python foo.py on the command line).
You also have a separate python installation, on the same machine. You are developing code, possibly including numpy/scipy, to run on that python installation.
These two installations are different: they have different binary interpreters, different libraries, different install paths, etc. But they live on the same physical host.
Your objective is to enable the "plain python" programs, written by you, to communicate with one or more scripts running in the "Abaqus python" environment, so that those scripts can perform work inside the Abaqus system, and return results.
Solution
Here is a socket based solution. There are two parts, abqlistener.py and abqclient.py. This approach has the advantage that it uses a well-defined mechanism for "waiting for work." No polling of files, etc. And it is a "hard" API. You can connect to a listener process from a process on the same machine, running the same version of python, or from a different machine, or from a different version of python, or from ruby or C or perl or even COBOL. It allows you to put a real "air gap" into your system, so you can develop the two parts with minimal coupling.
The server part is abqlistener. The intent is that you would copy some of this code into your Abaqus script. The abq process would then become a server, listening for connections on a specific port number, and doing work in response. Sending back a reply, or not. Et cetera.
I am not sure if you need to do setup work for each job. If so, that would have to be part of the connection. This would just start ABQ, listen on a port (forever), and deal with requests. Any job-specific setup would have to be part of the work process. (Maybe send in a parameter string, or the name of a config file, or whatever.)
The client part is abqclient. This could be moved into a module, or just copy/pasted into your existing non-ABQ program code. Basically, you open a connection to the right host:port combination, and you're talking to the server. Send in some data, get some data back, etc.
This stuff is mostly scraped from example code on-line. So it should look real familiar if you start digging into anything.
Here's abqlistener.py:
# The below usage example is completely bogus. I don't have abaqus, so
# I'm just running python2.7 abqlistener.py [options]
usage = """
abacus python abqlistener.py [--host 127.0.0.1 | --host mypc.example.com ] \\
[ --port 2525 ]
Sets up a socket listener on the host interface specified (default: all
interfaces), on the given port number (default: 2525). When a connection
is made to the socket, begins processing data.
"""
import argparse
parser = argparse.ArgumentParser(description='Abacus listener',
add_help=True,
usage=usage)
parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
help='Interface IP address or name, or (default: empty string)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
help='port number of listener (default: 2525)')
args = parser.parse_args()
import SocketServer
import json
class AbqRequestHandler(SocketServer.BaseRequestHandler):
"""Request handler for our socket server.
This class is instantiated whenever a new connection is made, and
must override `handle(self)` in order to handle communicating with
the client.
"""
def do_work(self, data):
"Do some work here. Call abaqus, whatever."
print "DO_WORK: Doing work with data!"
print data
return { 'desc': 'low-precision natural constants','pi': 3, 'e': 3 }
def handle(self):
# Allow the client to send a 1kb message (file path?)
self.data = self.request.recv(1024).strip()
print "SERVER: {} wrote:".format(self.client_address[0])
print self.data
result = self.do_work(self.data)
self.response = json.dumps(result)
print "SERVER: response to {}:".format(self.client_address[0])
print self.response
self.request.sendall(self.response)
if __name__ == '__main__':
print args
server = SocketServer.TCPServer((args.host, args.port), AbqRequestHandler)
print "Server starting. Press Ctrl+C to interrupt..."
server.serve_forever()
And here's abqclient.py:
usage = """
python2.7 abqclient.py [--host HOST] [--port PORT]
Connect to abqlistener on HOST:PORT, send a message, wait for reply.
"""
import argparse
parser = argparse.ArgumentParser(description='Abacus listener',
add_help=True,
usage=usage)
parser.add_argument('-H', '--host', metavar='INTERFACE', default='',
help='Interface IP address or name, or (default: empty string)')
parser.add_argument('-P', '--port', metavar='PORTNUM', type=int, default=2525,
help='port number of listener (default: 2525)')
args = parser.parse_args()
import json
import socket
message = "I get all the best code from stackoverflow!"
print "CLIENT: Creating socket..."
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print "CLIENT: Connecting to {}:{}.".format(args.host, args.port)
s.connect((args.host, args.port))
print "CLIENT: Sending message:", message
s.send(message)
print "CLIENT: Waiting for reply..."
data = s.recv(1024)
print "CLIENT: Got response:"
print json.loads(data)
print "CLIENT: Closing socket..."
s.close()
And here's what they print when I run them together:
$ python2.7 abqlistener.py --port 3434 &
[2] 44088
$ Namespace(host='', port=3434)
Server starting. Press Ctrl+C to interrupt...
$ python2.7 abqclient.py --port 3434
CLIENT: Creating socket...
CLIENT: Connecting to :3434.
CLIENT: Sending message: I get all the best code from stackoverflow!
CLIENT: Waiting for reply...
SERVER: 127.0.0.1 wrote:
I get all the best code from stackoverflow!
DO_WORK: Doing work with data!
I get all the best code from stackoverflow!
SERVER: response to 127.0.0.1:
{"pi": 3, "e": 3, "desc": "low-precision natural constants"}
CLIENT: Got response:
{u'pi': 3, u'e': 3, u'desc': u'low-precision natural constants'}
CLIENT: Closing socket...
References:
argparse, SocketServer, json, socket are all "standard" Python libraries.
To be clear, my understanding is that you are running Abaqus/CAE via a Python script as an independent process (let's call it abq.py), which checks for, opens, and reads a trigger file to determine if it should run an analysis. The trigger file is created by a second Python process (let's call it main.py). Finally, main.py waits to read the output file created by abq.py. You want a more efficient way to signal abq.py to run an analysis, and you're open to different techniques to exchange data.
As you mentioned, subprocess or multiprocessing might be an option. However, I think a simpler solution is to combine your two scripts, and optionally use a callback function to monitor the solution and process your output. I'll assume there is no need to have abq.py constantly running as a separate process, and that all analyses can be started from main.py whenever it is appropriate.
Let main.py have access to the Abaqus Mdb. If it's already built, you open it with:
mdb = openMdb(FileName)
A trigger file is not needed if main.py starts all analyses. For example:
if SomeCondition:
j = mdb.Job(name=MyJobName, model=MyModelName)
j.submit()
j.waitForCompletion()
Once complete, main.py can read the output file and continue. This is straightforward if the data file was generated by the analysis itself (e.g. .dat or .odb files). OTH, if the output file is generated by some code in your current abq.py, then you can probably just include it in main.py instead.
If that doesn't provide enough control, instead of the waitForCompletion method you can add a callback function to the monitorManager object (which is automatically created when you import the abaqus module: from abaqus import *). This allows you to monitor and respond to various messages from the solver, such as COMPLETED, ITERATION, etc. The callback function is defined like:
def onMessage(jobName, messageType, data, userData):
if messageType == COMPLETED:
# do stuff
else:
# other stuff
Which is then added to the monitorManager and the job is called :
monitorManager.addMessageCallback(jobName=MyJobName,
messageType=ANY_MESSAGE_TYPE, callback=onMessage, userData=MyDataObj)
j = mdb.Job(name=MyJobName, model=MyModelName)
j.submit()
One of the benefits to this approach is that you can pass in a Python object as the userData argument. This could potentially be your output file, or some other data container. You could probably figure out how to process the output data within the callback function - for example, access the Odb and get the data, then do any manipulations as needed without needing the external file at all.
I agree with the answer, except for some minor syntax problems.
defining instance variables inside the handler is a no no. not to mention they are not being defined in any sort of init() method. Subclass TCPServer and define your instance variables in TCPServer.init(). Everything else will work the same.

python simple threading won't ends without doing anything (maybe)

When i run the following code (using "sudo python servers.py") the process seem to just finish immediately with just printing "test".
why doesn't the functions "proxy_server" won't run ? or maybe they do but i do not realize that. (because the first line in proxy function doesn't print anything)
this is an impotent code, i didn't want to put unnecessary content, yet it still demonstrate my problem:
import os,sys,thread,socket,select,struct,time
HTTP_PORT = 80
FTP_PORT=21
FTP_DATA_PORT = 20
IP_IN = '10.0.1.3'
IP_OUT = '10.0.3.3'
sys_http = 'http_proxy'
sys_ftp = 'ftp_proxy'
sys_ftp_data = 'ftp_data_proxy'
def main():
try:
thread.start_new_thread(proxy_server, (HTTP_PORT, IP_IN,sys_http,http_handler))
thread.start_new_thread(proxy_server, (FTP_PORT, IP_IN,sys_ftp,http_handler))
thread.start_new_thread(proxy_server, (FTP_DATA_PORT, IP_OUT,sys_ftp_data,http_handler))
print "test"
except e:
print 'Error!'
sys.exit(1)
def proxy_server(host,port,fileName,handler):
print "Proxy Server Running on ",host,":",port
def http_handler(src,sock):
return ''
if __name__ == '__main__':
main()
What am i missing or doing wrong ?
First, you have indentation problems related to using mixed tabs and spaces for indentation. While they didn't cause your code to misbehave in this particular case, they will cause you problems later if you don't stick to consistently using one or the other. They've already broken the displayed indentation in your question; see the print "test" line in main, which looks misaligned.
Second, instead of the low-level thread module, you should be using threading. Your problem is occurring because, as documented in the thread module documentation,
When the main thread exits, it is system defined whether the other threads survive. On SGI IRIX using the native thread implementation, they survive. On most other systems, they are killed without executing try ... finally clauses or executing object destructors.
threading threads let you explicitly define whether other threads should survive the death of the main thread, and default to surviving. In general, threading is much easier to use correctly.

UNIX named PIPE end of file

I'm trying to use a unix named pipe to output statistics of a running service. I intend to provide a similar interface as /proc where one can see live stats by catting a file.
I'm using a code similar to this in my python code:
while True:
f = open('/tmp/readstatshere', 'w')
f.write('some interesting stats\n')
f.close()
/tmp/readstatshere is a named pipe created by mknod.
I then cat it to see the stats:
$ cat /tmp/readstatshere
some interesting stats
It works fine most of the time. However, if I cat the entry several times in quick successions, sometimes I get multiple lines of some interesting stats instead of one. Once or twice, it has even gone into an infinite loop printing that line forever until I killed it. The only fix that I've got so far is to put a delay of let's say 500ms after f.close() to prevent this issue.
I'd like to know why exactly this happens and if there is a better way of dealing with it.
Thanks in advance
A pipe is simply the wrong solution here. If you want to present a consistent snapshot of the internal state of your process, write that to a temporary file and then rename it to the "public" name. This will prevent all issues that can arise from other processes reading the state while you're updating it. Also, do NOT do that in a busy loop, but ideally in a thread that sleeps for at least one second between updates.
What about a UNIX socket instead of a pipe?
In this case, you can react on each connect by providing fresh data just in time.
The only downside is that you cannot cat the data; you'll have to create a new socket handle and connect() to the socket file.
MYSOCKETFILE = '/tmp/mysocket'
import socket
import os
try:
os.unlink(MYSOCKETFILE)
except OSError: pass
s = socket.socket(socket.AF_UNIX)
s.bind(MYSOCKETFILE)
s.listen(10)
while True:
s2, peeraddr = s.accept()
s2.send('These are my actual data')
s2.close()
Program querying this socket:
MYSOCKETFILE = '/tmp/mysocket'
import socket
import os
s = socket.socket(socket.AF_UNIX)
s.connect(MYSOCKETFILE)
while True:
d = s.recv(100)
if not d: break
print d
s.close()
I think you should use fuse.
it has python bindings, see http://pypi.python.org/pypi/fuse-python/
this allows you to compose answers to questions formulated as posix filesystem system calls
Don't write to an actual file. That's not what /proc does. Procfs presents a virtual (non-disk-backed) filesystem which produces the information you want on demand. You can do the same thing, but it'll be easier if it's not tied to the filesystem. Instead, just run a web service inside your Python program, and keep your statistics in memory. When a request comes in for the stats, formulate them into a nice string and return them. Most of the time you won't need to waste cycles updating a file which may not even be read before the next update.
You need to unlink the pipe after you issue the close. I think this is because there is a race condition where the pipe can be opened for reading again before cat finishes and it thus sees more data and reads it out, leading to multiples of "some interesting stats."
Basically you want something like:
while True:
os.mkfifo(the_pipe)
f = open(the_pipe, 'w')
f.write('some interesting stats')
f.close()
os.unlink(the_pipe)
Update 1: call to mkfifo
Update 2: as noted in the comments, there is a race condition in this code as well with multiple consumers.

developing for modularity & reusability: how to handle While True loops?

I've been playing around with the pybluez module recently to scan for nearby Bluetooth devices. What I want to do now is extend the program to also find nearby WiFi client devices.
The WiFi client scanner will have need to have a While True loop to continually monitor the airwaves. If I were to write this as a straight up, one file program, it would be easy.
import ...
while True:
client = scan()
print client['mac']
What I want, however, is to make this a module. I want to be able to reuse it later and, possible, have others use it too. What I can't figure out is how to handle the loop.
import mymodule
scan()
Assuming the first example code was 'mymodule', this program would simply print out the data to stdout. I would want to be able to use this data in my program instead of having the module print it out...
How should I code the module?
I think the best approach is going to be to have the scanner run on a separate thread from the main program. The module should have methods that start and stop the scanner, and another that returns the current access point list (using a lock to synchronize). See the threading module.
How about something pretty straightforward like:
mymodule.py
import ...
def scanner():
while True:
client = scan()
yield client['mac']
othermodule.py
import mymodule
for mac in mymodule.scanner():
print mac
If you want something more useful than that, I'd also suggest a background thread as #kindall did.
Two interfaces would be useful.
scan() itself, which returned a list of found devices, such that I could call it to get an instantaneous snapshot of available bluetooth. It might take a max_seconds_to_search or a max_num_to_return parameter.
A "notify on found" function that accepted a callback. For instance (maybe typos, i just wrote this off the cuff).
def find_bluetooth(callback_func, time_to_search = 5.0):
already_found = []
start_time = time.clock()
while 1:
if time.clock()-start_time > 5.0: break
found = scan()
for entry in found:
if entry not in already_found:
callback_func(entry)
already_found.append(entry)
which would be used by doing this:
def my_callback(new_entry):
print new_entry # or something more interesting...
find_bluetooth(my_callback)
If I get your question, you want scan() in a separate file, so that it can be reused later.
Create utils.py
def scan():
# write code for scan here.
Create WiFi.py
import utils
def scan_wifi():
while True:
cli = utils.scan()
...
return

Categories