How to reuse a subprocess in Python

There's a Windows Python application.
The application must gather several data about network configuration to build internal data structures.
This currently takes about 10 seconds, making the application barely usable.
Those 10 seconds are spent creating separate subprocesses, each of which calls PowerShell to get the data.
I suspect process creation is expensive on Windows, so I wanted to reuse a single process to see whether it makes a difference.
I'm using code similar to:
import subprocess as sp  # 'conf.prog.powershell' below is the asker's own config object

if not ps:
    ps = sp.Popen([conf.prog.powershell],
                  stdout=sp.PIPE,
                  stdin=sp.PIPE,
                  universal_newlines=True)
And later:
ps.stdin.write(' '.join(cmd + ['|', 'select %s' % ', '.join(fields), '|', 'fl', '\n']))
# ... ps.stdout.read()/readline()
It hangs, so I've searched for alternatives.
They were:
Use communicate in subprocess to avoid any deadlocks - I can't do that, because communicate waits for the process to finish and I don't want the process to finish.
Use pexpect, but it's not fully functional on Windows, and when I used it the PowerShell console took over the Python console.
Use separate threads for read/write (inspired by http://eyalarubas.com/python-subproc-nonblock.html) - but subsequent commands sent to the subprocess instance didn't cause any action.
I couldn't find any Python library that exposes this PowerShell data without spawning processes (COM?), other than reading the registry.
The libraries I found (netifaces, psutil) didn't offer the requested functionality.
So, does anybody have a working code example for the mentioned case (or can provide an alternative way to get the information)?
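For reference, the pattern most often suggested for this situation is to flush stdin after every command and to read only up to a sentinel line instead of blocking until EOF. A minimal sketch follows; the '-Command -' flag and the sentinel handling are assumptions, not verified against this setup:
import subprocess as sp

# Keep one PowerShell alive; '-Command -' makes it read commands from stdin.
ps = sp.Popen(['powershell', '-NoLogo', '-Command', '-'],
              stdin=sp.PIPE, stdout=sp.PIPE,
              universal_newlines=True)

def run(command, sentinel='__END_OF_OUTPUT__'):
    ps.stdin.write(command + '\n')
    ps.stdin.write('echo %s\n' % sentinel)   # mark the end of this command's output
    ps.stdin.flush()                         # without this the child may never see the input
    lines = []
    for line in iter(ps.stdout.readline, ''):
        if line.strip() == sentinel:
            break                            # stop before blocking on an empty pipe
        lines.append(line)
    return ''.join(lines)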
Python 2.7, but I don't think it matters
OS: Win7/Win10
Regards,
Robert

Related

python 3 Popen (semi-)interactive back-and-forth communication with a process, Popen.communicate() vs Popen.stdin/Popen.stdout

All of the examples I see for interacting with a process using Python 3's subprocess.Popen call Popen.communicate("input_text") exactly once, grab the standard output, and end the program. I have a few programs that I want to script that require human intervention via stdin, and I want to automate them since the prompts are predictable.
For example, an in-house licensing application requires us to pass the application information via prompts (not from the command line) relating to the customer's unique ID (4 digit integer), the number of users, etc. And then it has to be done 30 times (random number), each one for a different product, identified by another integer.
Scripting that is easy if only I can learn how to do a sustained back-and-forth with Popen. Should I be using Popen.communicate(), or should I be using Popen.stdin and Popen.stdout, and what's the difference between them?
Popen.communicate will block until the subprocess has completed or failed, and only then return information from stdout and stderr. So this is not what you need.
stdin, stdout & stderr are essentially special files belonging to a process that you can read from or write to, same as any other file, but they can provide an interface between processes if you pipe information into them.
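As a sketch of that stdin/stdout approach (the program name and prompts here are hypothetical, and it assumes the child terminates each prompt with a newline):
import subprocess

proc = subprocess.Popen(
    ['./licensing_app'],               # hypothetical interactive program
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True)

def answer(text):
    proc.stdin.write(text + '\n')
    proc.stdin.flush()                 # flush, or the child may never see the input

for product_id in (101, 102, 103):     # stand-ins for the per-product integers
    prompt = proc.stdout.readline()    # e.g. "Customer ID?" (assumed newline-terminated)
    answer('1234')                     # the 4-digit customer ID
    prompt = proc.stdout.readline()    # e.g. "Number of users?"
    answer('25')

proc.stdin.close()                     # no more input; let the child finish
proc.wait()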
I recently had to implement something similar to what you describe; the only way I was able to retrieve information through stdout of the "client" process was by using the pty module. I will link to two answers that helped me; however, please note that these solutions are POSIX-only and that using shell=True is a security risk: https://stackoverflow.com/a/5413588/533362 & https://stackoverflow.com/a/13605804/3565382.

Are IPython engines independent processes?

From the IPython Architecture Overview documentation we know that ...
The IPython engine is a Python instance that takes Python commands over a network connection.
Given that it is a Python instance, does that imply that these engines are standalone processes? I can manually load a set of engines via a command like ipcluster start -n 4. In doing so, is the creation of engines the creation of child processes of some parent process, or just a means to kick off a set of independent processes that rely on IPC to get their work done? I can also invoke an engine via the ipengine command, which is surely standalone, as it's entered directly at the OS command line with no relation to anything else.
As background I'm trying to drill into how the many IPython engines manipulated through a Client from a python script will interact with another process kicked off in that script.
Here's a simple way to find out which processes are involved: print the list of current processes before I fire off the controller and engines, and then print the list again after they're fired off. There's a wmic command to get the job done...
C:\>wmic process get description,executablepath
Interestingly enough, the controller gets 5 Python processes going, and each engine creates one additional Python process. So from this investigation I also learned that an engine is its own process, as is the controller...
C:\>wmic process get description,executablepath | findstr ipengine
ipengine.exe C:\Python34\Scripts\ipengine.exe
ipengine.exe C:\Python34\Scripts\ipengine.exe
C:\>wmic process get description,executablepath | findstr ipcontroller
ipcontroller.exe C:\Python34\Scripts\ipcontroller.exe
From the looks of it they all seem standalone, though I don't think the OS's running-process list carries any information about how the processes are related as far as the parent/child relationship is concerned. That may be a developer-only formalism with no representation tracked in the OS, but I don't know enough about these internals to say either way.
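For what it's worth, Windows does record a parent PID for each process, and that relationship can be queried from Python; a minimal sketch, assuming the third-party psutil package is installed:
import psutil

# List each ipengine/ipcontroller process together with its recorded parent.
for proc in psutil.process_iter():
    try:
        name = proc.name()
        if 'ipengine' in name or 'ipcontroller' in name:
            parent = proc.parent()     # None if the parent has already exited
            print('%s (pid %s) <- parent: %s' %
                  (name, proc.pid, parent.name() if parent else 'gone'))
    except psutil.NoSuchProcess:
        pass                           # the process ended while we were iterating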
Here's a definitive quote from MinRK that addresses this question directly:
"Every engine is its own isolated process...Each kernel is a separate
process and can be on any machine... It's like you started a terminal IPython session, and every engine is a separate IPython session. If you do a=5 in this one, a=10 in that one, this guy has 10 this guy has 5."
Here's further definitive validation, inspired by a great SE Hot Network Question on ServerFault that mentioned use of Process Explorer, which actually tracks parent/child processes...
"Process Explorer is a Sysinternals tool maintained by Microsoft. It can display the command line of the process in the process's properties dialog, as well as the parent that launched it, though the name of that process may no longer be available." --Corrodias
If I fire off more engines in another command window, that section of Process Explorer simply duplicates, exactly as in the screenshot.
And just for the sake of completeness, here's what the command ipcluster start --n=5 looks like...

Feasibility of using pipe for ruby-python communication

Currently I have two programs, one running in Ruby and the other in Python. I need to read a file in Ruby, but I first need a library written in Python to parse the file. Porting the Python library to Ruby is out of the question. Currently I use XMLRPC to let the two programs communicate, but I have read that XMLRPC carries some performance overhead. Recently I read that another solution to the Ruby-Python conundrum is the use of pipes, so I tried to experiment with that. For example, I wrote this master script in Ruby:
(0..2).each do
  slave = IO.popen(['python', 'slave.py'], 'r+')
  slave.write "master"
  slave.close_write
  line = slave.readline
  while line do
    sleep 1
    p eval line
    break if slave.eof
    line = slave.readline
  end
end
The following is the Python slave:
import sys

cmd = sys.stdin.read()          # blocks until the master closes its write end
while cmd:
    x = cmd
    for i in range(0, 5):
        print "{'%i'=>'%s'}" % (i, x)
        sys.stdout.flush()      # hand each line to the master immediately
    cmd = sys.stdin.read()      # returns '' at EOF, which ends the loop
Everything seems to work fine:
~$ ruby master.rb
{"0"=>"master"}
{"1"=>"master"}
{"2"=>"master"}
{"3"=>"master"}
{"4"=>"master"}
{"0"=>"master"}
{"1"=>"master"}
{"2"=>"master"}
{"3"=>"master"}
{"4"=>"master"}
{"0"=>"master"}
{"1"=>"master"}
{"2"=>"master"}
{"3"=>"master"}
{"4"=>"master"}
My question is: is it really feasible to use pipes for working with objects between Ruby and Python? One consideration is that there may be multiple instances of master.rb running. Will concurrency be an issue? Can pipes handle extensive operations and objects passed between processes? If so, would pipes be a better alternative to RPC?
Yes. No. If you implement it, yes. Depends on what your application needs.
Basically, if all you need is simple data passing, pipes are fine; if you need to be constantly calling functions on objects in your remote process, then you'll probably be better off using some form of existing RPC instead of reinventing the wheel. Whether that should be XMLRPC or something else is another matter.
Note that RPC has to use some underlying IPC mechanism, which could well be pipes, but might also be sockets, message queues, shared memory, whatever.
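One design note on the "simple data passing" case: the master above evals whatever the slave prints, which is unsafe if the slave's output is ever untrusted. A line-oriented serialization such as JSON keeps the protocol language-neutral and safe to parse; a minimal sketch of the Python side (the Ruby master would call JSON.parse on each line instead of eval):
import sys, json

cmd = sys.stdin.read()
while cmd:
    for i in range(5):
        # One JSON document per line: easy to frame, safe to parse.
        print json.dumps({str(i): cmd})
        sys.stdout.flush()
    cmd = sys.stdin.read()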

How to start daemon process from python on windows?

Can my python script spawn a process that will run indefinitely?
I'm not too familiar with Python, nor with spawning daemons, so I came up with this:
si = subprocess.STARTUPINFO()
si.dwFlags = subprocess.CREATE_NEW_PROCESS_GROUP | subprocess.CREATE_NEW_CONSOLE
subprocess.Popen(executable, close_fds = True, startupinfo = si)
The process continues to run past python.exe, but is closed as soon as I close the cmd window.
Using the answer Janne Karila pointed out, this is how you can run a process that doesn't die when its parent dies; there is no need to use the win32process module.
DETACHED_PROCESS = 8
subprocess.Popen(executable, creationflags=DETACHED_PROCESS, close_fds=True)
DETACHED_PROCESS is a Process Creation Flag that is passed to the underlying CreateProcess function.
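On Python 3.7 and newer the flag is also exposed as the constant subprocess.DETACHED_PROCESS, so the magic number isn't needed; a minimal sketch (the executable path here is hypothetical):
import subprocess

# Python 3.7+ exposes the Windows creation flags as named constants.
flags = subprocess.DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
subprocess.Popen([r'C:\path\to\app.exe'],   # hypothetical executable
                 creationflags=flags,
                 close_fds=True)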
This question was asked 3 years ago, and though the fundamental details of the answer haven't changed, given its prevalence in "Windows Python daemon" searches, I thought it might be helpful to add some discussion for the benefit of future Google arrivees.
There are really two parts to the question:
Can a Python script spawn an independent process that will run indefinitely?
Can a Python script act like a Unix daemon on a Windows system?
The answer to the first is an unambiguous yes; as already pointed out, using subprocess.Popen with the creationflags=subprocess.CREATE_NEW_PROCESS_GROUP keyword will suffice:
import subprocess

independent_process = subprocess.Popen(
    'python /path/to/file.py',
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)
Note that, at least in my experience, CREATE_NEW_CONSOLE is not necessary here.
That being said, the behavior of this strategy isn't quite the same as what you'd expect from a Unix daemon. What constitutes a well-behaved Unix daemon is better explained elsewhere, but to summarize (a minimal sketch follows the list):
Close open file descriptors (typically all of them, but some applications may need to protect some descriptors from closure)
Change the working directory for the process to a suitable location to prevent "Directory Busy" errors
Change the file access creation mask (os.umask in the Python world)
Move the application into the background and make it dissociate itself from the initiating process
Completely divorce from the terminal, including redirecting STDIN, STDOUT, and STDERR to different streams (often DEVNULL), and prevent reacquisition of a controlling terminal
Handle signals, in particular, SIGTERM.
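A minimal sketch of those steps, using the classic double-fork recipe (Unix-only, stripped of error handling, and omitting the signal handlers from the last point):
import os, sys

def daemonize():
    if os.fork() > 0:
        sys.exit(0)            # first fork: the original parent returns to the shell
    os.setsid()                # become session leader; drop the controlling terminal
    if os.fork() > 0:
        sys.exit(0)            # second fork: ensure we never reacquire a terminal
    os.chdir('/')              # don't hold any mounted directory busy
    os.umask(0)                # reset the file-creation mask
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):       # redirect stdin, stdout, stderr to /dev/null
        os.dup2(devnull, fd)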
The reality of the situation is that Windows, as an operating system, really doesn't support the notion of a daemon: applications that start from a terminal (or in any other interactive context, including launching from Explorer, etc) will continue to run with a visible window, unless the controlling application (in this example, Python) has included a windowless GUI. Furthermore, Windows signal handling is woefully inadequate, and attempts to send signals to an independent Python process (as opposed to a subprocess, which would not survive terminal closure) will almost always result in the immediate exit of that Python process without any cleanup (no finally:, no atexit, no __del__, etc).
Rolling your application into a Windows service, though a viable alternative in many cases, also doesn't quite fit. The same is true of using pythonw.exe (a windowless version of Python that ships with all recent Windows Python binaries). In particular, they fail to improve the situation for signal handling, and they cannot easily launch an application from a terminal and interact with it during startup (for example, to deliver dynamic startup arguments to your script, say, perhaps, a password, file path, etc), before "daemonizing". Additionally, Windows services require installation, which -- though perfectly possible to do quickly at runtime when you first call up your "daemon" -- modifies the user's system (registry, etc), which would be highly unexpected if you're coming from a Unix world.
In light of that, I would argue that launching a pythonw.exe subprocess using subprocess.CREATE_NEW_PROCESS_GROUP is probably the closest Windows equivalent for a Python process to emulate a traditional Unix daemon. However, that still leaves you with the added challenge of signal handling and startup communications (not to mention making your code platform-dependent, which is always frustrating).
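A sketch of that launch strategy (the script path is hypothetical, and the pythonw.exe substitution assumes a standard CPython install layout):
import subprocess, sys

# Swap python.exe for pythonw.exe so the child runs without a console window.
pythonw = sys.executable.replace('python.exe', 'pythonw.exe')
subprocess.Popen(
    [pythonw, r'C:\path\to\daemon_script.py'],   # hypothetical script
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP,
    close_fds=True)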
That all being said, for anyone encountering this problem in the future, I've rolled a library called daemoniker that wraps both proper Unix daemonization and the above strategy. It also implements signal handling (for both Unix and Windows systems), and allows you to pass objects to the "daemon" process using pickle. Best of all, it has a cross-platform API:
from daemoniker import Daemonizer

with Daemonizer() as (is_setup, daemonizer):
    if is_setup:
        # This code is run before daemonization.
        do_things_here()

    # We need to explicitly pass resources to the daemon; other variables
    # may not be correct
    is_parent, my_arg1, my_arg2 = daemonizer(
        path_to_pid_file,
        my_arg1,
        my_arg2
    )

    if is_parent:
        # Run code in the parent after daemonization
        parent_only_code()

# We are now daemonized, and the parent just exited.
code_continues_here()
For that purpose you could daemonize your Python process or, since you are using a Windows environment, run it as a Windows service.
I don't like to post only web links, but for more information relevant to your requirement:
"A simple way to implement a Windows Service" - read all the comments; they will resolve any doubts.
If you really want to learn more, first read "What is a daemon process?" or "Creating a daemon the Python way".
Update: subprocess is not the right way to achieve this kind of thing.

Python - simple reading lines from a pipe

I'm trying to read lines from a pipe and process them, but I'm doing something silly and I can't figure out what. The producer is going to keep producing lines indefinitely, like this:
producer.py
import time

while True:
    print 'Data'
    time.sleep(1)
The consumer just needs to check for lines periodically:
consumer.py
import sys, time

while True:
    line = sys.stdin.readline()
    if line:
        print 'Got data:', line
    else:
        time.sleep(1)
When I run this in the Windows shell as python producer.py | python consumer.py, it just sleeps forever (it never seems to get data). It seems the problem is that the producer never terminates, since if I send a finite amount of data it works fine.
How can I get the data to be received and show up for the consumer? In the real application, the producer is a C++ program I have no control over.
Some old versions of Windows simulated pipes through files (so they were prone to such problems), but that hasn't been a problem in 10+ years. Try adding a
sys.stdout.flush()
to the producer after the print, and also try to make the producer's stdout unbuffered (by using python -u).
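Applied to the producer above, that suggestion looks like this (Python 2 syntax, matching the question):
import sys, time

while True:
    print 'Data'
    sys.stdout.flush()   # push each line through the pipe immediately
    time.sleep(1)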
Of course this doesn't help if you have no control over the producer -- if it buffers too much of its output you're still going to wait a long time.
Unfortunately - while there are many approaches to solve that problem on Unix-like operating systems, such as pyexpect, pexpect, exscript, and paramiko, I doubt any of them works on Windows; if that's indeed the case, I'd try Cygwin, which puts enough of a Linux-like veneer on Windows as to often enable the use of Linux-like approaches on a Windows box.
This is about I/O being buffered by default in Python. Pass the -u option to the interpreter to disable this behavior:
python -u producer.py | python consumer.py
It fixes the problem for me.