I'm running python on linux, and when I run my script with:
os.system('v4l2-ctl -c exposure_auto=1')
I get this:
VIDIOC_S_EXT_CTRLS: failed: Input/output error
exposure_auto: Input/output error
When I run this command from terminal with my default user, no output/error appears.
Why is this failing when running the script, but not in terminal?
Edit: Corrected the code and error output.
When a program like this dies with a mysterious error, it means that something about its environment when run beneath Python is subtly different, in a way that matters to the special IO calls that it is making. The question is: what could possibly be different? I just, as a test, ran the basic cat command from a shell — letting it sit there so that I could inspect its state before pressing Control-D to exit it again — and then ran it from the os.system() function in Python. In both cases, lsof shows that it has exactly the same files open and terminal connections made:
$ lsof -p 7573
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
cat 7573 brandon cwd DIR 0,24 45056 131082 /home/brandon
cat 7573 brandon rtd DIR 8,2 4096 2 /
cat 7573 brandon txt REG 8,2 46884 661873 /bin/cat
cat 7573 brandon mem REG 8,2 2919792 393288 /usr/lib/locale/locale-archive
cat 7573 brandon mem REG 8,2 1779492 270509 /lib/i386-linux-gnu/libc-2.17.so
cat 7573 brandon mem REG 8,2 134376 270502 /lib/i386-linux-gnu/ld-2.17.so
cat 7573 brandon 0u CHR 136,19 0t0 22 /dev/pts/19
cat 7573 brandon 1u CHR 136,19 0t0 22 /dev/pts/19
cat 7573 brandon 2u CHR 136,19 0t0 22 /dev/pts/19
In your case the command might run and exit so quickly that it is hard for you to catch it in mid-run with lsof to see what it looks like. Really, what you need to do is run it both ways under strace and figure out which system call is failing, and why.
strace -o trace-good.log v4l2-ctl -c exposure_auto=1
python my-script.py   # with "strace -o trace-bad.log" prepended to the command inside its os.system() call
The logs will be long, but using grep on them, or opening them in your less pager and using / and ? to search back and forth (and n and N to keep searching once you have entered a search phrase with / or ?), can help you jump around really quickly.
Look near the bottom of trace-bad.log for the system call that is actually giving the error. Then look in trace-good.log for the same call when it succeeds and post the difference here for us.
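As a minimal sketch of the bad-case half of that comparison (assuming the script calls os.system directly, as in the question, and that strace is installed), you would prepend strace to the command inside the script:

```python
import os

# Run the failing command under strace from inside the Python script, so
# trace-bad.log records the system calls in exactly the environment the
# child process gets when Python is its parent.
status = os.system('strace -o trace-bad.log v4l2-ctl -c exposure_auto=1')
print('exit status:', status)
```

os.system returns the raw wait status, so a non-zero value here tells you the command failed even before you open the log.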
I'm running the below command using python subprocess to extract files from rpm.
But the command fails when the rpm size is more than 25-30 MB. I tried the command using Popen, call, with stdout as PIPE, and with os.system as well. The command works fine when I run it in a shell directly; the problem occurs only when I invoke it from Python.
Command:
rpm2cpio <rpm_name>.rpm | cpio -idmv
I did an strace on the process IDs and found that each process is always hung on some write system call:
ps -ef | grep cpio
root 4699 4698 4 11:05 pts/0 00:00:00 rpm2cpio kernel-2.6.32-573.26.1.el6.x86_64.rpm
root 4700 4698 0 11:05 pts/0 00:00:00 cpio -idmv
strace -p 4699
Process 4699 attached
write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0rc_pixelview_new"..., 8192
strace -p 4700
Process 4700 attached
write(2, "./lib/modules/2.6.32-573.26.1.el"..., 94
I have 2 questions:
Can someone figure out what the problem is here? Why does it fail when the rpm size is more than 25 MB?
Is there any other way I can extract the rpm contents from Python?
Your output pipe is full. The Python docs note in many places not to do what you are doing:
Do not use stdout=PIPE or stderr=PIPE with this function as that can deadlock based on the child process output volume. Use Popen with the communicate() method when you need pipes.
If all you want is the payload of a *.rpm package, then do the computations to find the beginning of the compressed cpio payload and do the operations directly in python.
See How do I extract the contents of an rpm? for a rpm2cpio.sh shell script that documents the necessary computations. The only subtlety is ensuring that the padding (needed for alignment) between the signature and metadata headers is correct.
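If you do want to keep shelling out, the pipe-is-full problem goes away when the two processes are wired directly to each other instead of through a Python-held PIPE. Here is a hedged sketch of that shape; the printf/wc pair is a stand-in so it runs anywhere, and in practice the two argument lists would be ['rpm2cpio', 'package.rpm'] and ['cpio', '-idmv']:

```python
import subprocess

# Chain two processes kernel-to-kernel: the producer's stdout is wired
# straight into the consumer's stdin, so no Python-side pipe can fill up.
producer = subprocess.Popen(['printf', 'line1\nline2\nline3\n'],
                            stdout=subprocess.PIPE)
consumer = subprocess.Popen(['wc', '-l'],
                            stdin=producer.stdout,
                            stdout=subprocess.PIPE)
producer.stdout.close()          # let the producer get SIGPIPE if wc exits early
out, _ = consumer.communicate()  # drains the pipe, so no deadlock
count = out.decode().strip()
print(count)                     # -> 3
```

The communicate() call is the part the docs insist on: it keeps reading until the child is done, so the consumer can never block on a full output buffer.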
I use the following snippet in a larger Python program to spawn a process in background:
import subprocess
command = "/media/sf_SharedDir/FOOBAR"
subprocess.Popen(command, shell=True)
After that I wanted to check whether the process was running when my Python program returned.
Output of ps -ef | grep -v grep | grep FOOBAR:
ap 3396 937 0 16:08 pts/16 00:00:00 /bin/sh -c /media/sf_SharedDir/FOOBAR
ap 3397 3396 0 16:08 pts/16 00:00:00 /bin/sh /media/sf_SharedDir/FOOBAR
I was surprised to see two lines of output, and they have different PIDs. Are two processes running? Is there something wrong with my Popen call?
FOOBAR Script:
#!/bin/bash
while :
do
echo "still alive"
sleep 1
done
EDIT: When I start the script in a terminal, ps displays only one process.
Started via ./FOOBAR
ap@VBU:/media/sf_SharedDir$ ps -ef | grep -v grep | grep FOOBAR
ap 4115 3463 0 16:34 pts/5 00:00:00 /bin/bash ./FOOBAR
EDIT: shell=True is causing this issue (if it is one). But how would I fix that if I required shell to be True to run bash commands?
There is nothing wrong, what you see is perfectly normal. There is no "fix".
Each of your processes has a distinct function. The top-level process is running the python interpreter.
The second process, /bin/sh -c /media/sf_SharedDir/FOOBAR, is the shell that interprets the command line (because you want |, *, or $HOME to be interpreted, you specified shell=True).
The third process, /bin/sh /media/sf_SharedDir/FOOBAR, is the FOOBAR command itself. The /bin/sh comes from the #! line inside your FOOBAR program. If it were a C program, you'd just see /media/sf_SharedDir/FOOBAR here. If it were a python program, you'd see /usr/bin/python /media/sf_SharedDir/FOOBAR.
If you are really bothered by the second process, you could modify your python program like so:
command = "exec /media/sf_SharedDir/FOOBAR"
subprocess.Popen(command, shell=True)
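If you don't actually need shell features for a given command, another option is to skip shell=True and pass an argument list, so no intermediate /bin/sh appears in ps at all. A sketch, with sleep standing in for the real script path:

```python
import subprocess

# Passing a list instead of a string (and omitting shell=True) makes Python
# exec the program directly; in the FOOBAR case that would be
# subprocess.Popen(['/media/sf_SharedDir/FOOBAR']).
proc = subprocess.Popen(['sleep', '0.2'])
proc.wait()
print(proc.returncode)  # 0 on success
```

This works for FOOBAR because it is executable and has a #! line; the kernel, not a shell spawned by Python, picks the interpreter.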
I have a python script named sudoserver.py that I start in a CygWin shell by doing:
python sudoserver.py
I am planning to create a shell script (I don't know yet if I will use Windows shell script or a CygWin script) that needs to know if this sudoserver.py python script is running.
But if I do in CygWin (while sudoserver.py is running):
$ ps -e | grep "python" -i
11020 10112 11020 7160 cons0 1000 00:09:53 /usr/bin/python2.7
and in Windows shell:
C:\>tasklist | find "python" /i
python2.7.exe 4344 Console 1 13.172 KB
So it seems I have no info about the .py file being executed. All I know is that python is running something.
The -l (long) option for ps on Cygwin does not find my .py file. Nor does the /v (verbose) switch of tasklist.
What would be the appropriate way (a Windows or Cygwin shell would be enough; both if possible would be fine) to programmatically find out whether a specific python script is executing right now?
NOTE: The python process could be started by another user, even from a user not logged in to a GUI shell, and even from the "SYSTEM" (privileged) Windows user.
It is a limitation of the platform.
You probably need to use some low level API to retrieve the process info. You can take a look at this one: Getting the command line arguments of another process in Windows
You can probably use win32api module to access these APIs.
(Sorry, away from a Windows PC so I can't try it out)
Since sudoserver.py is your script, you could modify it to create a file in an accessible location when it starts and to delete the file when it finishes. Your shell script can then check for the existence of that file to find out if sudoserver.py is running.
(EDIT)
Thanks to the commenters who suggested that while the presence or absence of the file is an unreliable indicator, a file's lock status is not.
I wrote the following Python script testlock.py:
# Hold lockfile.lck open while doing a long-running loop.
f = open("lockfile.lck", "w")
for i in range(10000000):
    print(i)
f.close()
... and ran it in a Cygwin console window on my Windows PC. At the same time, I had another Cygwin console window open in the same directory.
First, after I started testlock.py:
Simon@Simon-PC ~/test/python
$ ls
lockfile.lck  testlock.py
Simon@Simon-PC ~/test/python
$ rm lockfile.lck
rm: cannot remove `lockfile.lck': Device or resource busy
... then after I had shut down testlock.py by using Ctrl-C:
Simon@Simon-PC ~/test/python
$ rm lockfile.lck
Simon@Simon-PC ~/test/python
$ ls
testlock.py
Simon@Simon-PC ~/test/python
$
Thus, it appears that Windows is locking the file while the testlock.py script is running but it is unlocked when it is stopped with Ctrl-C. The equivalent test can be carried out in Python with the following script:
import os
try:
    os.remove("lockfile.lck")
except OSError:
    print("lockfile.lck in use")
... which correctly reports:
$ python testaccess.py
lockfile.lck in use
... when testlock.py is running but successfully removes the locked file when testlock.py has been stopped with a Ctrl-C.
Note that this approach works in Windows but it won't work in Unix because, according to the Python documentation:
On Windows, attempting to remove a file that is in use causes
an exception to be raised; on Unix, the directory entry is removed
but the storage allocated to the file is not made available until
the original file is no longer in use.
A platform-independent solution using an additional Python module FileLock is described in Locking a file in Python.
(FURTHER EDIT)
It appears that the OP didn't necessarily want a solution in Python. An alternative would be to do this in bash. Here is testlock.sh:
#!/bin/bash
flock lockfile.lck ./sequence.sh
The script sequence.sh just runs a time-consuming operation:
#!/bin/bash
for i in `seq 1 1000000`;
do
echo $i
done
Now, while testlock.sh is running, we can test the lock status using another variant on flock:
$ flock -n lockfile.lck echo "Lock acquired" || echo "Could not acquire lock"
Could not acquire lock
$ flock -n lockfile.lck echo "Lock acquired" || echo "Could not acquire lock"
Could not acquire lock
$ flock -n lockfile.lck echo "Lock acquired" || echo "Could not acquire lock"
Lock acquired
$
The first two attempts to lock the file failed because testlock.sh was still running and so the file was locked. The last attempt succeeded because testlock.sh had finished running.
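The same non-blocking probe can be done from Python via the fcntl module (Unix/Cygwin only). A sketch mirroring flock -n:

```python
import fcntl

# Try to take an exclusive lock without blocking: with LOCK_NB, flock()
# raises immediately if another process already holds the lock, instead
# of waiting for it to be released.
f = open('lockfile.lck', 'w')
try:
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    result = 'Lock acquired'
    fcntl.flock(f, fcntl.LOCK_UN)
except BlockingIOError:
    result = 'Could not acquire lock'
f.close()
print(result)
```

Because both sides use flock(), this interoperates with the shell flock command above: the Python probe fails while testlock.sh holds the lock and succeeds afterwards.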
Context
I'm adding a few pieces to an existing, working system.
There is a control machine (a local Linux PC) running some test scripts which involve sending lots of commands to several different machines remotely via SSH. The test framework is written in Python and uses Fabric to access the different machines.
All commands are handled with a generic calling function, simplified below:
def cmd(host, cmd, args):
...
with fabric.api.settings(host_string=..., user='root', use_ssh_config=True, disable_known_hosts=True):
return fabric.api.run('%s %s' % (cmd, args))
The actual commands sent to each machine usually involve running an existing python script on the remote side. Those python scripts do some jobs, which include invoking external commands (using system and subprocess). The run() command called on the test PC returns when the remote python script is done.
At one point I needed one of those remote python scripts to launch a background task: starting an openvpn server and client using openvpn --config /path/to/config.openvpn. In a normal python script I would just use &:
system('openvpn --config /path/to/config.openvpn > /var/log/openvpn.log 2>&1 &')
When this script is called remotely via Fabric, one must explicitly use nohup, dtach, screen and the like to run the job in the background. I got it working with:
system("nohup openvpn --config /path/to/config.openvpn > /var/log/openvpn.log 2>&1 < /dev/null &")
The Fabric FAQ goes into some details about this.
It works fine for certain background commands.
Problem: doesn't work for all types of background commands
This technique doesn't work for all the commands I need. In some scripts, I need to launch a background atop command (a top on steroids) and redirect its stdout to a file.
My code (note: using atop -P for parseable output):
system('nohup atop -P%s 1 < /dev/null | grep %s > %s 2>&1 &' % (dataset, grep_options, filename))
When the script containing that command is called remotely via Fabric, the atop process is immediately killed. The output file is generated but it's empty. Calling the same script while logged in the remote machine by SSH works fine, the atop command dumps data periodically in my output file.
Some googling and digging around brought me to interesting information about background jobs using Fabric, but my problem seems to be only specific to certains types of background jobs. I've tried:
appending sleep
running with pty=False
replacing nohup with dtach -n: same symptoms
I read about commands like top failing in Fabric when stdin is redirected to /dev/null, but I'm not quite sure what to make of it. I played around with different combinations of (non-)redirects of STDIN, STDOUT and STDERR.
Looks like I'm running out of ideas.
Fabric seems overkill for what we are doing. We don't even use the "fabfile" method because it's integrated in a nose framework and I run them invoking nosetests. Maybe I should resort to dropping Fabric in favor of manual SSH commands, although I don't like the idea of changing a working system because of it not supporting one of my newer modules.
In my environment, it looks like this works:
from fabric.api import sudo
def atop():
sudo('nohup atop -Pcpu 1 </dev/null '
'| grep cpu > /tmp/log --line-buffered 2>&1 &',
pty=False)
result:
fabric:~$ fab atop -H web01
>>>[web01] Executing task 'atop'
>>>[web01] sudo: nohup atop -Pcpu 1 </dev/null | grep cpu > /tmp/log --line-buffered 2>&1 &
>>>
>>>Done.
web01:~$ cat /tmp/log
>>>cpu web01 1374246222 2013/07/20 00:03:42 361905 100 0 5486 6968 0 9344927 3146 0 302 555 0 2494 100
>>>cpu web01 1374246223 2013/07/20 00:03:43 1 100 0 1 0 0 99 0 0 0 0 0 2494 100
>>>cpu web01 1374246224 2013/07/20 00:03:44 1 100 0 1 0 0 99 0 0 0 0 0 2494 100
...
The atop command may need superuser privileges. This doesn't work:
from fabric.api import run
def atop():
run('nohup atop -Pcpu 1 </dev/null '
'| grep cpu > /tmp/log --line-buffered 2>&1 &',
pty=False)
On the other hand, this works:
from fabric.api import run
def atop():
run('sudo nohup atop -Pcpu 1 </dev/null '
'| grep cpu > /tmp/log --line-buffered 2>&1 &',
pty=False)
I wonder if anyone has any insights into this. I have a bash script that should put my ssh key onto a remote machine. Adapted from here, the script reads:
#!/usr/bin/sh
REMOTEHOST=user@remote
KEY="$HOME/.ssh/id_rsa.pub"
KEYCODE=`cat $KEY`
ssh -q $REMOTEHOST "mkdir ~/.ssh 2>/dev/null; chmod 700 ~/.ssh; echo "$KEYCODE" >> ~/.ssh/authorized_keys; chmod 644 ~/.ssh/authorized_keys"
This works. The equivalent python script should be
#!/usr/bin/python
import os
os.system('ssh -q %(REMOTEHOST)s "mkdir ~/.ssh 2>/dev/null; chmod 700 ~/.ssh; echo "%(KEYCODE)s" >> ~/.ssh/authorized_keys; chmod 644 ~/.ssh/authorized_keys"' %
{'REMOTEHOST':'user@remote',
'KEYCODE':open(os.path.join(os.environ['HOME'],
'.ssh/id_rsa.pub'),'r').read()})
But in this case, I get that
sh: line 1: >> ~/.ssh/authorized_keys; chmod 644 ~/.ssh/authorized_keys: No
such file or directory
What am I doing wrong? I tried escaping the inner-most quotes, but I get the same error message. Thank you in advance for your responses.
You have a serious question -- in that os.system isn't behaving the way you expect it to -- but also, you should seriously rethink the approach as a whole.
You're launching a Python interpreter -- but then, via os.system, telling that Python interpreter to launch a shell! os.system shouldn't be used at all in modern Python (subprocess is a complete replacement)... but using any Python call which starts a shell instance is exceptionally silly in this kind of use case.
Now, in terms of the actual, immediate problem -- look at how your quotation marks are nesting. You'll see that the quote you're starting before mkdir is being closed in the echo, allowing your command to be split in a spot you don't intend.
The following fixes this immediate issue, but is still awful and evil (starts a subshell unnecessarily, doesn't properly check output status, and should be converted to use subprocess.Popen()):
os.system('''ssh -q %(REMOTEHOST)s "mkdir ~/.ssh 2>/dev/null; chmod 700 ~/.ssh; echo '%(KEYCODE)s' >> ~/.ssh/authorized_keys; chmod 644 ~/.ssh/authorized_keys"''' % {
'REMOTEHOST':'user@remote',
'KEYCODE':open(os.path.join(os.environ['HOME'], '.ssh/id_rsa.pub'),'r').read()
})
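For completeness, here is one hedged sketch of that subprocess conversion (the push_key name and the use of cat on the remote side are my own choices, not from the original answer): feeding the key to ssh's stdin sidesteps the nested-quoting problem entirely, because no local data is interpolated into the remote command string.

```python
import os
import subprocess

def push_key(host, key_path='~/.ssh/id_rsa.pub'):
    """Append the local public key to host's authorized_keys.

    The key travels over ssh's stdin, so the remote command contains no
    interpolated data and therefore needs no tricky quoting. 'host' is
    e.g. 'user@remote'. Returns ssh's exit status.
    """
    key = open(os.path.expanduser(key_path), 'rb').read()
    proc = subprocess.Popen(
        ['ssh', '-q', host,
         'mkdir -p ~/.ssh && chmod 700 ~/.ssh && '
         'cat >> ~/.ssh/authorized_keys && chmod 644 ~/.ssh/authorized_keys'],
        stdin=subprocess.PIPE)
    proc.communicate(key)
    return proc.returncode
```

Since ssh receives the remote command as a single argv element, the quotes in the Python source never need to nest at all.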