I have a program that uses this library; it basically does something very simple, like this:
receiver = multicast.MulticastUDPReceiver("192.168.0.2", symbolMCIPAddrStr, symbolMCPort)
while True:
    print 'Spinning'
    try:
        b = MD()
        data = receiver.read(1024)
The receiver socket blocks until data comes in, so the print 'Spinning' only executes once until data is received on the socket. Yet when I ask the OS how much CPU this process is taking, even though it is just waiting on the receive, it comes back with:
[idf#node1 ~]$ ps -p 4294 -o %cpu,%mem,cmd
%CPU %MEM CMD
6.3 0.4 python ./mc.py -s EUR/USD
[idf#node1 ~]$
In fact, if I run several of these processes, all the cores on my computer (two CPUs with 8 cores each) go to 100% usage and the machine becomes unusable.
I must misunderstand Python's notion of "blocking", because even a do-nothing process that should basically be sleeping is taking up lots of CPU.
Is there a more correct way to write this so that programs that are basically waiting for I/O (interrupt-driven) give up the CPU?
You haven't posted a complete example, so it's difficult to say for sure what's happening.
However, I see that there's a try block inside your loop, with your networking code inside it. I don't know what your exception handling does, but I'm guessing it unintentionally swallows an important error. Your loop then runs again and probably hits the same error immediately. In this way, the program is actually busy-looping even though you thought it was asleep, blocked on I/O.
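As a rough, self-contained sketch of what I'm describing (read_from_socket() here is just a hypothetical stand-in for your receiver.read(1024)), this is how a swallowed exception turns a blocking loop into a busy loop, and what the minimal fix looks like:

import time
import traceback

def read_from_socket():
    # Hypothetical stand-in for receiver.read(1024); imagine the socket is
    # in a bad state, so every call fails immediately instead of blocking.
    raise IOError("socket closed")

while True:
    try:
        data = read_from_socket()
    except Exception:
        # An empty handler here would silently turn a "blocking" loop into
        # a busy loop: the next iteration fails again immediately, forever.
        # At minimum, log the error and back off (or re-raise).
        traceback.print_exc()
        time.sleep(1)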
Related
I am working with a groundwater modeling executable (HYDRUS1D) which I call with a Python script. I want to do some Monte Carlo runs but sometimes the program gets hung up and does not converge for extended periods of time.
Is there a way to give the executable a certain amount of time to run, cancel it if it goes over this time, and then start a new simulation all without interrupting the Python script? The simulation should take no more than 3-5 seconds, so I am hoping to give it a maximum of 10 seconds to finish.
I first run a function that changes some input parameters to the model, then execute Hydrus via the 'run_single_sim' function:
for value in n_variations_21:
    for value2 in n_variations_23:
        write_hydraulic_params('foo', layers, value, value2)
        run_single_sim()
Where run_single_sim() executes Hydrus via os.system:
def run_single_sim():
    os.system('./hydrus LEVEL_01.DIR')
I have tried a few solutions involving threading, such as this and this, but it seems like my script gets stuck on the os.system call and therefore cannot check how long the thread has been running, or kill the thread after sleeping for some specified amount of time.
You asked "how to stop an executable called via Python ...", but I feel
this question is simply about "how to stop an executable".
What's interesting is that we have a child that might misbehave.
The parent is uninteresting, could be rust, ruby, random other language.
The timeout issue you pose is a sensible question,
and there's a stock answer for it, in the GNU coreutils package.
Instead of
os.system('./hydrus LEVEL_01.DIR')
you want
os.system('timeout 10 ./hydrus LEVEL_01.DIR')
Here is a quick demo, using a simpler command than hydrus.
$ timeout 2 sleep 1; echo $?
0
$
$ timeout 2 sleep 3; echo $?
124
As an entirely separate matter, prefer subprocess.check_output() over the old os.system().
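For example, here is a sketch (assuming Python 3.3+ and the hydrus command line from your post) that lets subprocess enforce the timeout itself:

import subprocess

def run_single_sim():
    try:
        # Kills hydrus and raises TimeoutExpired if it runs longer than
        # 10 seconds; raises CalledProcessError on a nonzero exit status.
        subprocess.check_output(['./hydrus', 'LEVEL_01.DIR'], timeout=10)
    except subprocess.TimeoutExpired:
        print('simulation timed out, moving on to the next one')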
You quoted a pair of answers that deal with threading. But you're spawning a separate child process, with no shared memory, so threading isn't relevant here.
We wish to eventually send a SIGTERM signal to an ill-behaved process, and we hope it obeys the signal by quickly exiting. Timing out a child that explicitly ignores such signals would be a slightly stickier problem. An uncatchable SIGKILL can be sent by using the --kill-after=duration flag.
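For instance (keeping the same hypothetical 10-second budget as above), to follow the SIGTERM with a SIGKILL five seconds later:

import os

# Send SIGTERM after 10 s; if hydrus is still alive 5 s after that,
# send an uncatchable SIGKILL.
os.system('timeout --kill-after=5 10 ./hydrus LEVEL_01.DIR')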
I have a Python script that is called from a Java program. The Java program feeds data to the Python script on sys.stdin, and the Java program receives data from the Python process's output stream.
What is known is this:
Running python script.py from the Java program on 10 MB of data takes about 35 seconds.
However, running python script.py > temp.data and then cat temp.data is significantly faster.
The performance gap gets even more drastic as the data gets larger.
To address this, I am thinking maybe there is a way to change sys.stdout to mimic what I am doing.
Or maybe I can pipe the Python script's output to a virtual file.
Any recommendations?
This is probably a buffering problem that arises when your Java program writes to one filehandle and reads from another: the ordering of those operations and the size of the writes is suboptimal, and it's slowing itself down.
I would try python -u script.py to see what happens when you ask Python to unbuffer its output; that should in principle be slower, but it might trick your calling program into racing a different way, perhaps a faster one.
The larger fix, I think, is either to batch your output, as you are already testing with, and read the resulting file, or to use POSIX select() or filehandle events to control how your Java program times its writes and reads.
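As a sketch of a middle ground (the process_line() helper here is hypothetical, since you haven't shown what script.py does), you can keep normal block buffering for throughput but flush explicitly at record boundaries, so the Java side sees data as soon as each chunk is complete:

import sys

def process_line(line):
    # Hypothetical stand-in for whatever transformation script.py applies.
    return line.upper()

for line in sys.stdin:
    sys.stdout.write(process_line(line))
    # Flushing at record boundaries means the Java reader never sits
    # waiting on output stuck in a partially filled buffer.
    sys.stdout.flush()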
I have a Python program that starts threads in a loop; each thread runs a method that sends an HTTP request.
The program is supposed to emulate heavy traffic on a certain server we're working on.
The thread creation code looks something like this:
while True:
    try:
        num_of_connections += 1
        thread_obj = HTTP_Tester.Threaded_Test(ip)
        thread_obj.start()
    except RuntimeError as e:
        print("Runtime Error!")
So again, the thread_obj is running a method which sends HTTP requests to the ip it is given, nothing fancy.
When running this code under python 3.5, I am able to open around 880 threads until the RuntimeError "can't open a new thread" is thrown.
When running this code under python 3.4 however, the number of threads keeps growing and growing - I got up to 2000+ threads until the machine it was running on became unresponsive.
I check the number of open threads by looking at the num_of_connections counter, and I also use TCPView to verify that the number of sockets is in fact growing. Under Python 3.4, TCPView shows 2000+ sockets open for the program, so I deduce that there are indeed 2000+ threads open.
I googled around and saw people suggesting that threading.stack_size is the culprit. Not here: I changed the size, but the number of possible threads doesn't change either way.
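For reference, this is roughly what I tried (the 64 KiB value is just one of several sizes I experimented with):

import threading

# Must be called before the threads are created; the argument is the
# per-thread stack size in bytes (minimum 32 KiB on most platforms).
threading.stack_size(64 * 1024)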
The question is: how come the limit is so low under 3.5, whereas under 3.4 it is (presumably) much higher? Also, can I change the limit? I would prefer to use 3.5, but I want to open as many threads as I can.
Thank you!
I am trying to constantly monitor a process which is basically a Python program. If the program stops, then I have to start the program again. I am using another Python program to do so.
For example, say I have to constantly run a process called run_constantly.py. I initially run this program manually, which writes its process ID to the file "PID" (in the location out/PROCESSID/PID).
Now I run another program which has the following code to monitor the program run_constantly.py from a Linux environment:
def Monitor_Periodic_Process():
    # sleep intervals, in seconds
    TIMER_NOT_RUNIN = 1800
    TIMER_RUNIN = 1800
    foo = imp.load_source("Run_Module", "run_constantly.py")
    PROGRAM_TO_MONITOR = ['run_constantly.py', 'out/PROCESSID/PID']
    while(1):
        # call the function checkPID to see if the program is running or not
        res = checkPID(PROGRAM_TO_MONITOR)
        # if res is 0 then the program is not running, so schedule it
        if (res == 0):
            date_time = datetime.now()
            scheduler.add_cron_job(foo.Run_Module, year=date_time.year,
                                   day=date_time.day, month=date_time.month,
                                   hour=date_time.hour, minute=date_time.minute + 2)
            scheduler.start()
            scheduler.get_jobs()
            time.sleep(TIMER_NOT_RUNIN)
            continue
        else:
            # the process is running; sleep and then monitor again
            time.sleep(TIMER_RUNIN)
            continue
I have not included the checkPID() function here. checkPID() basically checks whether the process ID still exists (i.e. whether the program is still running); if it does not exist, it returns 0. In the program above, I check if res == 0, and if so, I use Python's scheduler to schedule the program. However, the major problem I am currently facing is that the process ID of this monitoring program and of run_constantly.py turn out to be the same once I schedule run_constantly.py with scheduler.add_cron_job(). So if run_constantly.py crashes, the monitoring program still thinks run_constantly.py is running (since both process IDs are the same) and therefore keeps going into the else branch to sleep and monitor again.
Can someone tell me how to solve this issue? Is there a simple way to constantly monitor a program and reschedule it when it has crashed?
There are many programs that can do this.
On Ubuntu there is upstart (installed by default)
Lots of people like http://supervisord.org/
monit, as mentioned by @nathan
If you are looking for a python alternative there is a library that has just been released called circus which looks interesting.
And pretty much every linux distro probably has one of these built in.
The choice is really just down to which one you like better, but you would be far better off using one of these than writing it yourself.
Hope that helps
If you are willing to control the monitored program directly from Python instead of using cron, have a look at the subprocess module:
The subprocess module allows you to spawn new processes,
connect to their input/output/error pipes, and obtain their return codes.
Check questions like "track process status with python" on SO for examples and references.
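A minimal sketch of that approach (the script name is taken from your post; the 5-second poll interval is arbitrary): start run_constantly.py as a real child process so it gets its own PID, and restart it whenever Popen reports that it has exited:

import subprocess
import time

def keep_alive(cmd, poll_interval=5):
    child = subprocess.Popen(cmd)
    while True:
        time.sleep(poll_interval)
        if child.poll() is not None:   # poll() is None while the child is alive
            print('child exited with code %s, restarting' % child.returncode)
            child = subprocess.Popen(cmd)

keep_alive(['python', 'run_constantly.py'])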
You could just use monit
http://mmonit.com/monit/
It monitors processes and restarts them (among other things).
I thought I'd add a more versatile solution, which is one that I personally use all the time as well.
Its name is Immortal (the source is at https://github.com/immortal/immortal).
To have it monitor and instantly restart a program if it stops, simply run the following command:
immortal <command>
So in your case I would run run_constantly.py like so:
immortal python run_constantly.py
The command ps aux | grep run_constantly.py should return 2 process IDs: one for the Immortal supervisor, and one for the separate command Immortal started (i.e. the regular command). As long as the Immortal process is running, run_constantly.py will stay running.
I'm trying to read lines from a pipe and process them, but I'm doing something silly and I can't figure out what. The producer is going to keep producing lines indefinitely, like this:
producer.py
import time

while True:
    print 'Data'
    time.sleep(1)
The consumer just needs to check for lines periodically:
consumer.py
import sys, time

while True:
    line = sys.stdin.readline()
    if line:
        print 'Got data:', line
    else:
        time.sleep(1)
When I run this in the Windows shell as python producer.py | python consumer.py, it just sleeps forever (it never seems to get data). It seems the problem may be that the producer never terminates, since if I send a finite amount of data then it works fine.
How can I get the data to be received and show up for the consumer? In the real application, the producer is a C++ program I have no control over.
Some old versions of Windows simulated pipes through files (so they were prone to such problems), but that hasn't been a problem in 10+ years. Try adding a
sys.stdout.flush()
to the producer after the print, and also try to make the producer's stdout unbuffered (by using python -u).
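For example, a minimally modified producer.py with the flush added after the print:

import sys, time

while True:
    print 'Data'
    sys.stdout.flush()   # push the line through the pipe immediately
    time.sleep(1)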
Of course this doesn't help if you have no control over the producer -- if it buffers too much of its output you're still going to wait a long time.
Unfortunately, while there are many approaches to solving that problem on Unix-like operating systems, such as pyexpect, pexpect, exscript, and paramiko, I doubt any of them works on Windows; if that's indeed the case, I'd try Cygwin, which puts enough of a Linux-like veneer on Windows to often enable the use of Linux-like approaches on a Windows box.
This is about I/O that is buffered by default in Python. Pass the -u option to the interpreter to disable this behavior:
python -u producer.py | python consumer.py
It fixes the problem for me.