Logging last Bash command to file from script - python

I write lots of small scripts to manipulate files on a Bash-based server. I would like to have a mechanism by which to log which commands created which files in a given directory. However, I don't just want to capture every input command, all the time.
Approach 1: a wrapper script that uses a Bash builtin (a la history or fc -ln -1) to grab the last command and write it to a log file. I have not been able to figure out any way to do this, as the shell builtin commands do not appear to be recognized outside of the interactive shell.
Approach 2: a wrapper script that pulls from ~/.bash_history to get the last command. This, however, requires setting up the Bash shell to flush every command to history immediately (as per this comment) and seems also to require that the history be allowed to grow inexorably. If this is the only way, so be it, but it would be great to avoid having to edit the ~/.bashrc file on every system where this might be implemented.
Approach 3: use script. My problem with this is that it requires multiple commands to start and stop the logging, and because it launches its own shell it is not callable from within another script (or at least, doing so complicates things significantly).
I am trying to figure out an implementation that's of the form log_this.script other_script other_arg1 other_arg2 > file, where everything after the first argument is logged. The emphasis here is on efficiency and minimizing syntax overhead.
EDIT: iLoveTux and I both came up with similar solutions. For those interested, my own implementation follows. It is somewhat more constrained in its functionality than the accepted answer, but it also auto-updates any existing logfile entries with changes (though not deletions).
Sample usage:
$ cmdlog.py "python3 test_script.py > test_file.txt"
creates a log file in the parent directory of the output file with the following:
2015-10-12#10:47:09 test_file.txt "python3 test_script.py > test_file.txt"
Additional file changes are added to the log:
$ cmdlog.py "python3 test_script.py > test_file_2.txt"
the log now contains
2015-10-12#10:47:09 test_file.txt "python3 test_script.py > test_file.txt"
2015-10-12#10:47:44 test_file_2.txt "python3 test_script.py > test_file_2.txt"
Running on the original file name again changes the file order in the log, based on modification time of the files:
$ cmdlog.py "python3 test_script.py > test_file.txt"
produces
2015-10-12#10:47:44 test_file_2.txt "python3 test_script.py > test_file_2.txt"
2015-10-12#10:48:01 test_file.txt "python3 test_script.py > test_file.txt"
Full script:
#!/usr/bin/env python3
'''
A wrapper script that will write the command-line
args associated with any files generated to a log
file in the directory where the files were made.
'''
import sys
import os
from os import listdir
from os.path import isfile, join
import subprocess
import time
from datetime import datetime
def listFiles(mypath):
"""
Return relative paths of all files in mypath
"""
return [join(mypath, f) for f in listdir(mypath) if
isfile(join(mypath, f))]
def read_log(log_file):
"""
Reads a file history log and returns a dictionary
of {filename: command} entries.
Expects tab-separated lines of [time, filename, command]
"""
entries = {}
with open(log_file) as log:
for l in log:
l = l.strip()
mod, name, cmd = l.split("\t")
# cmd = cmd.lstrip("\"").rstrip("\"")
entries[name] = [cmd, mod]
return entries
def time_sort(t, fmt):
"""
    Parse a strftime-formatted string into a datetime object for sorting
"""
parsed = datetime.strptime(t, fmt)
return parsed
ARGS = sys.argv[1]
ARG_LIST = ARGS.split()
# Guess where logfile should be put
if (">" or ">>") in ARG_LIST:
# Get position after redirect in arg list
redirect_index = max(ARG_LIST.index(e) for e in ARG_LIST if e in ">>")
output = ARG_LIST[redirect_index + 1]
output = os.path.abspath(output)
out_dir = os.path.dirname(output)
elif ("cp" or "mv") in ARG_LIST:
output = ARG_LIST[-1]
out_dir = os.path.dirname(output)
else:
out_dir = os.getcwd()
# Set logfile location within the inferred output directory
LOGFILE = out_dir + "/cmdlog_history.log"
# Get file list state prior to running
all_files = listFiles(out_dir)
pre_stats = [os.path.getmtime(f) for f in all_files]
# Run the desired external commands
subprocess.call(ARGS, shell=True)
# Get done time of external commands
TIME_FMT = "%Y-%m-%d#%H:%M:%S"
log_time = time.strftime(TIME_FMT)
# Get existing entries from logfile, if present
if LOGFILE in all_files:
logged = read_log(LOGFILE)
else:
logged = {}
# Get file list state after run is complete
post_stats = [os.path.getmtime(f) for f in all_files]
post_files = listFiles(out_dir)
# Find files whose states have changed since the external command
changed = [e[0] for e in zip(all_files, pre_stats, post_stats) if e[1] != e[2]]
new = [e for e in post_files if e not in all_files]
all_modded = list(set(changed + new))
if not all_modded: # exit early, no need to log
sys.exit(0)
# Replace files that have changed, add those that are new
for f in all_modded:
name = os.path.basename(f)
logged[name] = [ARGS, log_time]
# Write changed files to logfile
with open(LOGFILE, 'w') as log:
for name, info in sorted(logged.items(), key=lambda x: time_sort(x[1][1], TIME_FMT)):
cmd, mod_time = info
if not cmd.startswith("\""):
cmd = "\"{}\"".format(cmd)
log.write("\t".join([mod_time, name, cmd]) + "\n")
sys.exit(0)

You can use the tee command, which copies its standard input to a file and also writes it to standard output. Pipe the command line into tee, and pipe tee's output into a new invocation of your shell:
echo '<command line to be logged and executed>' | \
tee --append /path/to/your/logfile | \
$SHELL
i.e., for your example of other_script other_arg1 other_arg2 > file,
echo 'other_script other_arg1 other_arg2 > file' | \
tee --append /tmp/mylog.log | \
$SHELL
If your command line needs single quotes, they need to be escaped properly.
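If you would rather drive the same pattern from inside Python, a minimal sketch of the idea (the log path and helper name here are placeholders of my own, not part of the answer above) is to append the command line to the log and then hand it to the shell:
import subprocess
import sys

LOGFILE = "/tmp/mylog.log"  # placeholder path

def log_and_run(cmdline):
    # Mirror `tee --append`: record the command line, then execute it
    with open(LOGFILE, "a") as log:
        log.write(cmdline + "\n")
    # shell=True so redirections like `> file` keep working
    return subprocess.call(cmdline, shell=True)

if __name__ == "__main__":
    sys.exit(log_and_run(" ".join(sys.argv[1:])))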

OK, so you don't mention Python in your question, but it is tagged Python, so I figured I would see what I could do. I came up with this script:
#!/usr/bin/env python
import sys
from os.path import expanduser, join
from subprocess import Popen, PIPE
def issue_command(command):
process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
return process.communicate()
home = expanduser("~")
log_file = join(home, "command_log")
command = sys.argv[1:]
with open(log_file, "a") as fout:
fout.write("{}\n".format(" ".join(command)))
# With shell=True, pass a single string; a list would only run its first element
out, err = issue_command(" ".join(command))
which you can call like this (if you name it log_this and make it executable):
$ log_this echo hello world
and it will put "echo hello world" in a file ~/command_log, note though that if you want to use pipes or redirection you have to quote your command (this may be a real downfall for your use case or it may not be, but I haven't figured out how to do this just yet without the quotes) like this:
$ log_this "echo hello world | grep h >> /tmp/hello_world"
but since it's not perfect, I thought I would add a little something extra.
The following script allows you to specify a different file to log your commands to as well as record the execution time of the command:
#!/usr/bin/env python
from subprocess import Popen, PIPE
import argparse
from os.path import expanduser, join
from time import time
def issue_command(command):
process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
return process.communicate()
home = expanduser("~")
default_file = join(home, "command_log")
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", type=argparse.FileType("a"), default=default_file)
parser.add_argument("-p", "--profile", action="store_true")
parser.add_argument("command", nargs=argparse.REMAINDER)
args = parser.parse_args()
if args.profile:
start = time()
    out, err = issue_command(" ".join(args.command))
runtime = time() - start
entry = "{}\t{}\n".format(" ".join(args.command), runtime)
args.file.write(entry)
else:
    out, err = issue_command(" ".join(args.command))
entry = "{}\n".format(" ".join(args.command))
args.file.write(entry)
args.file.close()
You would use this the same way as the other script. If you want to specify a different file to log to, just pass -f <FILENAME> before your actual command and your log will go there; if you want to record the execution time, just provide -p (for profile) before your actual command, like so:
$ log_this -p -f ~/new_log "echo hello world | grep h >> /tmp/hello_world"
I will try to make this better, but if you can think of anything else this could do for you, I am setting up a GitHub project for it where you can submit bug reports and feature requests.

Related

Python safely capture live output from multiple subprocesses

It is explained in https://stackoverflow.com/a/18422264/7238575 how one can run a subprocess and read out the results live. However, it looks like it creates a file with a name test.log to do so. This makes me worry that if multiple scripts are using this trick in the same directory the test.log file might well be corrupted. Is there a way that does not require a file to be created outside Python? Or can we make sure that each process uses a unique log file? Or am I completely misunderstanding the situation and is there no risk of simultaneous writes by different programs to the same test.log file?
You don't need to write the live output to a file. You can simply write it to STDOUT with sys.stdout.write("your message").
On the other hand you can generate unique log files for each process:
import os
import psutil
pid = psutil.Process(os.getpid())
process_name = pid.name()
path, extension = os.path.splitext(os.path.join(os.getcwd(), "my_basic_log_file.log"))
created_log_file_name = "{0}_{1}{2}".format(path, process_name, extension)
print(created_log_file_name)
Output:
>>> python3 test_1.py
/home/my_user/test_folder/my_basic_log_file_python3.log
As the above example shows, my process name was python3, so this process name was inserted into the "basic" log file name. With this solution you can create unique log files for your processes.
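If several of your processes happen to share the same name, the name-based scheme can still collide; a small variation (my own sketch, not part of the answer above) folds the PID into the file name instead, using only the standard library:
import os

pid = os.getpid()
path, extension = os.path.splitext(os.path.join(os.getcwd(), "my_basic_log_file.log"))
created_log_file_name = "{0}_{1}{2}".format(path, pid, extension)
print(created_log_file_name)  # e.g. /home/my_user/test_folder/my_basic_log_file_12345.log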
You can set your process name with setproctitle.setproctitle("my_process_name").
Here is an example.
import os
import psutil
import setproctitle
setproctitle.setproctitle("milan_balazs")
pid = psutil.Process(os.getpid())
process_name = pid.name()
path, extension = os.path.splitext(os.path.join(os.getcwd(), "my_basic_log_file.log"))
created_log_file_name = "{0}_{1}{2}".format(path, process_name, extension)
print(created_log_file_name)
Output:
>>> python3 test_1.py
/home/my_user/test_folder/my_basic_log_file_milan_balazs.log
Previously I wrote a fairly complex and safe command caller which can produce live output (not to a file). You can check it out:
import sys
import os
import subprocess
import select
import errno
def poll_command(process, realtime):
"""
Watch for error or output from the process
:param process: the process, running the command
:param realtime: flag if realtime logging is needed
:return: Return STDOUT and return code of the command processed
"""
coutput = ""
poller = select.poll()
poller.register(process.stdout, select.POLLIN)
fdhup = {process.stdout.fileno(): 0}
while sum(fdhup.values()) < len(fdhup):
try:
r = poller.poll(1)
except select.error as err:
if not err.args[0] == errno.EINTR:
raise
r = []
for fd, flags in r:
if flags & (select.POLLIN | select.POLLPRI):
c = version_conversion(fd, realtime)
coutput += c
else:
fdhup[fd] = 1
return coutput.strip(), process.poll()
def version_conversion(fd, realtime):
"""
There are some differences between Python2/3 so this conversion is needed.
"""
c = os.read(fd, 4096)
if sys.version_info >= (3, 0):
c = c.decode("ISO-8859-1")
if realtime:
sys.stdout.write(c)
sys.stdout.flush()
return c
def exec_shell(command, real_time_out=False):
"""
Call commands.
:param command: Command line.
:param real_time_out: If this variable is True, the output of command is logging in real-time
:return: Return STDOUT and return code of the command processed.
"""
if not command:
print("Command is not available.")
return None, None
print("Executing '{}'".format(command))
rtoutput = real_time_out
p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
out, return_code = poll_command(p, rtoutput)
    if p.poll():
        error_msg = "Return code: {ret_code} Error message: {err_msg}".format(
            ret_code=return_code, err_msg=out
        )
        print(error_msg)
    else:
        print("[OK] - The command calling was successful. CMD: '{}'".format(command))
    return out, return_code
exec_shell("echo test running", real_time_out=True)
Output:
>>> python3 test.py
Executing 'echo test running'
test running
[OK] - The command calling was successful. CMD: 'echo test running'
I hope my answer answers your question! :)

Run multiple bash lines in Python and separately check their status and output

I am trying to execute several lines of bash in Python 3 and check the status of each line separately.
I first tried to use getstatusoutput from subprocess, but each line is run in a separate process that does not communicate with the others (for the sake of simplicity, the given MWE consists of setting a variable, but what I intend to do in my actual code is more complex than that, and I know about os.environ for this very specific example):
from subprocess import getstatusoutput as cmd
stat, out = cmd("export TEST=1")
stat, out = cmd("echo $TEST")
will therefore return:
>>> print((stat, out))
(0, '')
I then tried the following:
cmdline = """export TEST=1
echo $TEST"""
stat, out = cmd(cmdline)
That works, but it forces me to parse the output, especially if I want to check the status of the first command (if echo works, the status returned by cmd is 0 regardless of what happened before), which is not very robust.
I saw some things using Popen (still from subprocess) but was unable to use it efficiently.
Any help would be appreciated!
To me, it looks like you are trying to share an environment variable between two processes, which is not possible.
It looks like this:
Process 1 python main.py #TEST = ""
|Process 2-->"export TEST=1" #Change Process2 env variable TEST to '1'
|Process 3-->"echo $TEST" #Print Process3 env variable TEST (inherited from Process 1)
You can use os.environ[] to change the current environment first (the Process 1 variable), and then use the variable after the fork.
Something like this
import os
import subprocess
import sys
os.environ['TEST'] = '1'
out = subprocess.check_call('echo $TEST', shell=True)
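If you would rather not modify the parent process's environment at all, a variant of the same idea (just a sketch, relying on the env= parameter of subprocess) is to pass a copy of the environment with the extra variable added:
import os
import subprocess

# Copy the current environment and add TEST only for the child process
child_env = dict(os.environ, TEST='1')
status = subprocess.check_call('echo $TEST', shell=True, env=child_env)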
I ended up doing the following:
create a launch command wrapping subprocess.Popen to run my bash commands, which in addition allows me either to retrieve the resulting environment or to pass a custom environment
create a get_env function to parse the output of the previous command and return a dict of the environment
launch wrapper
import os
import subprocess as sp
def launch(cmd_, env=os.environ, get_env=False):
if get_env: cmd_ += " && printenv"
load = sp.Popen(cmd_, shell=True, stdout=sp.PIPE, stderr=sp.PIPE, env=env)
out = load.communicate()
err = load.returncode
return(err, out)
Retrieve the environment
def get_env(out, encoding='utf-8'):
lout = str(out[0], encoding).split('\n')
new_env = {}
for line in lout:
if len(line.split('=')) <= 1:
pass
else:
k = line.split("=")[0]
v = "=".join(line.split("=")[1:])
new_env[k] = v
return new_env
(This is a simple version, it may be more complicated if you have things like functions in your environment — it happens.)
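Regarding that caveat about functions in the environment: a slightly more robust variant (just a sketch, assuming GNU printenv with its -0 option is available and that launch is changed to append " && printenv -0" instead) splits the output on the null byte so that multi-line values survive:
def get_env_null(out, encoding='utf-8'):
    # Same idea as get_env above, but expects null-separated `printenv -0` output,
    # so multi-line values (e.g. exported functions) are not split apart.
    new_env = {}
    for entry in str(out[0], encoding).split('\x00'):
        if '=' in entry:
            k, _, v = entry.partition('=')
            new_env[k] = v
    return new_env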
Results:
I can use it as follows:
err, out = launch("export TEST=1", get_env=True)
if not err: new_env = get_env(out)
err, out = launch("echo $TEST", env=new_env)
and therefore:
>>> print(str(out[0], encoding='utf-8'))
1

Stream stdout from subprocess to python function, and back to subprocess

I am trying to use python in a unix style pipe.
For example, in unix I can use a pipe such as:
$ samtools view -h somefile.bam | python modifyStdout.py | samtools view -bh - > processed.bam
I can do this by using a for line in sys.stdin: loop in the python script and that appears to work without problems.
However I would like to internalise this unix command into a python script. The files involved will be large so I would like to avoid blocking behaviour, and basically stream between processes.
At the moment I am trying to use Popen to manage each command, and pass the stdout of the first process to the stdin of the next process, and so on.
In a separate python script I have (sep_process.py):
import sys
f = open("sentlines.txt", 'wr')
f.write("hi")
for line in sys.stdin:
print line
f.write(line)
f.close()
And in my main python script I have this:
import sys
from subprocess import Popen, PIPE
# Generate an example file to use
f = open('sees.txt', 'w')
f.write('somewhere over the\nrainbow')
f.close()
if __name__ == "__main__":
# Use grep as an example command
p1 = Popen("grep over sees.txt".split(), stdout=PIPE)
# Send to sep_process.py
p2 = Popen("python ~/Documents/Pythonstuff/Bam_count_tags/sep_process.py".split(), stdin=p1.stdout, stdout=PIPE)
# Send to final command
p3 = Popen("wc", stdin=p2.stdout, stdout=PIPE)
# Read output from wc
result = p3.stdout.read()
print result
The p2 process however fails with [Errno 2] No such file or directory, even though the file exists.
Do I need to implement a Queue of some kind and/or open the python function using the multiprocessing module?
The tilde ~ is a shell expansion. You are not using a shell, so it is looking for a directory called ~.
You could read the environment variable HOME and insert that. Use
os.environ['HOME']
Alternatively you could use shell=True if you can't be bothered to do your own expansion.
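A related option (just a sketch, reusing the paths from the question) is to let Python expand the tilde itself with os.path.expanduser before building the argument list:
import os
from subprocess import Popen, PIPE

p1 = Popen("grep over sees.txt".split(), stdout=PIPE)
script = os.path.expanduser("~/Documents/Pythonstuff/Bam_count_tags/sep_process.py")
p2 = Popen(["python", script], stdin=p1.stdout, stdout=PIPE)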
Thanks @cdarke, that solved the problem for using simple commands like grep, wc, etc. However, I was too stupid to get subprocess.Popen to work when using an executable such as samtools to provide the data stream.
To fix the issue, I created a string containing the pipe exactly as I would write it in the command line, for example:
sam = '/Users/me/Documents/Tools/samtools-1.2/samtools'
home = os.environ['HOME']
inpath = "{}/Documents/Pythonstuff/Bam_count_tags".format(home)
stream_in = "{s} view -h {ip}/test.bam".format(s=sam, ip=inpath)
pyscript = "python {ip}/bam_tags.py".format(ip=inpath)
stream_out = "{s} view -bh - > {ip}/small.bam".format(s=sam, ip=inpath)
# Absolute paths, written as a pipe
fullPipe = "{inS} | {py} | {outS}".format(inS=stream_in,
py=pyscript,
outS=stream_out)
print fullPipe
# Translates to >>>
# samtools view -h test.bam | python ./bam_tags.py | samtools view -bh - > small.bam
I then used popen from the os module instead and this worked as expected:
os.popen(fullPipe)
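One caveat (my note): os.popen only returns a pipe object, its exit status is only available from the close() method, and failures can pass silently. If the status matters, a roughly equivalent call through subprocess (a sketch reusing fullPipe from above, in Python 2 syntax to match the rest of this answer) would be:
import subprocess

status = subprocess.call(fullPipe, shell=True)  # blocks until the whole pipeline finishes
print status  # 0 on success; a shell pipeline reports the status of its last command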

pass python var to bash

I'm making a script to take pictures and write them to a folder created/named with the "date & time".
I made this part to create the directory and take the pictures:
import os
import time

# camera is assumed to be an already-initialised picamera.PiCamera instance
pathtoscript = "/home/pi/python-scripts"
current_time = time.localtime()[0:6]
dirfmt = "%4d-%02d-%02d-%02d-%02d-%02d"
dirpath = os.path.join(pathtoscript , dirfmt)
dirname = dirpath % current_time[0:6] #dirname created with date and time
os.mkdir(dirname) #mkdir
pictureName = dirname + "/image%02d.jpg" #path+name of pictures
camera.capture_sequence([pictureName % i for i in range(9)])
Then I would like to pass the dirname to a bash script (picturesToServer) which uploads the pictures to a server.
How can I do it?
from subprocess import call

cmd = '/home/pi/python-scripts/picturesToServer >/dev/null 2>&1 &'
call(cmd, shell=True)
Maybe I could stay in the python script and scp the pictures to the server? I have an ssh-agent with the passphrase set (ssh-add mykey).
Place the variable in the environment (it'll be available as a regular bash variable in the bash script, e.g. as VAR_NAME in the example below) by replacing your call with:
import subprocess
p = subprocess.Popen(cmd, shell=True, env={"VAR_NAME": dirname})
Or pass it as a positional argument (it'll be available in $1 in the script) by replacing your cmd with:
cmd = '/home/pi/python-scripts/picturesToServer >/dev/null 2>&1 "{0}" &'.format(dirname)
As a side note, consider not using shell=True when you call a subprocess. Using shell=True is a bad idea for a lot of reasons that are documented in the Python docs.
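For this particular command, a shell-free variant (sketch only; the >/dev/null 2>&1 & part is shell syntax, so it is replaced by explicit redirection plus the fact that Popen does not wait) could pass dirname as a plain argument:
import os
import subprocess

devnull = open(os.devnull, "w")
# dirname comes from the question's script; picturesToServer sees it as $1
subprocess.Popen(["/home/pi/python-scripts/picturesToServer", dirname],
                 stdout=devnull, stderr=subprocess.STDOUT)
# Popen returns immediately, so the upload keeps running in the background, like the trailing &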

Malformed environment variables detection in python

I am trying to source a bash script containing some environment variables in Python. I followed another thread to do it, but it seems that one of the variables is malformed, as can be seen in the snippet below.
COLORTERM=gnome-terminal
mc=() { . /usr/share/mc/mc-wrapper.sh
}
_=/usr/bin/env
I am using the following code to set up the current environment.
import os
import pprint
import subprocess
command = ['bash', '-c', 'source init_env && env']
proc = subprocess.Popen(command, stdout = subprocess.PIPE)
for line in proc.stdout:
(key, _, value) = line.partition("=")
os.environ[key] = value
proc.communicate()
If I change the above code a little, adding a condition:
for line in proc.stdout:
(key, _, value) = line.partition("=")
if not value:
continue
os.environ[key] = value
then things work, but the environment is corrupted because of the missing bracket: as the snippet above shows, the closing bracket of the mc function definition ends up on its own line. Because of this corruption, if I run some other command like
os.system("ls -l")
it gives me the following error
sh: mc: line 1: syntax error: unexpected end of file
sh: error importing function definition for `mc'
What could be the possible solutions to this problem?
Thanks a lot.
Probably the best way to do this is to create a separate program that writes out the environment variables in a way that is easily and unambiguously processed by your own program; then call that program instead of env. Using the standard pickle module, that separate program can be as simple as this:
import os
import sys
import pickle
pickle.dump(os.environ, sys.stdout)
which you can either save into its own .py file, or else put directly in a Bash command:
python -c 'import os, sys, pickle; pickle.dump(os.environ, sys.stdout)'
In either case, you can process its output like this:
import os
import pprint
import subprocess
import pickle
command = [
'bash',
'-c',
'source init_env && ' +
'python -c "import os, sys, pickle; ' +
'pickle.dump(os.environ, sys.stdout)"'
]
proc = subprocess.Popen(command, stdout = subprocess.PIPE)
for k, v in pickle.load(proc.stdout).iteritems():
os.environ[k] = v
proc.communicate()
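A small porting note (mine, not part of the original answer): under Python 3, pickle writes bytes, so the child has to dump to sys.stdout.buffer, and the parent should use items() rather than iteritems(); dumping dict(os.environ) avoids trying to pickle the os._Environ object itself:
import os
import subprocess
import pickle

command = [
    'bash',
    '-c',
    'source init_env && ' +
    'python3 -c "import os, sys, pickle; ' +
    'pickle.dump(dict(os.environ), sys.stdout.buffer)"'
]
proc = subprocess.Popen(command, stdout=subprocess.PIPE)
for k, v in pickle.load(proc.stdout).items():
    os.environ[k] = v
proc.communicate()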
