I have Python code in which I am calling a shell command. The relevant part is:
try:
    def parse(text_list):
        text = '\n'.join(text_list)
        cwd = os.getcwd()
        os.chdir("/var/www/html/alenza/hdfs/user/alenza/sree_account/sree_project/src/core/data_analysis/syntaxnet/models/syntaxnet")
        synnet_output = subprocess.check_output(["echo '%s' | syntaxnet/demo.sh 2>/dev/null" % text], shell=True)
        os.chdir(cwd)
        return synnet_output
except Exception as e:
    sys.stdout.write(str(e))
Now, when I run this code on a local file with some sample input (I did cat /home/sree/example.json | python parse.py), it works fine and I get the required output. But when I run the code on an input file from my HDFS (the same cat command, but with an HDFS file path), which contains exactly the same kind of JSON entries, it fails with this error:
/bin/sh: line 62: to: command not found
list index out of range
I read similar questions on Stack Overflow where the suggested solution was to include a shebang line in the shell script being called. I do have the shebang line #!/usr/bin/bash in the demo.sh script.
Also, which bash gives /usr/bin/bash.
Could someone please elaborate?
You rarely, if ever, want to combine passing a list argument with shell=True. Just pass the string:
synnet_output = subprocess.check_output("echo '%s' | syntaxnet/demo.sh 2>/dev/null"%(text,), shell=True)
However, you don't really need a shell pipeline here.
from subprocess import check_output, DEVNULL  # DEVNULL requires Python 3.3+

# Note: a StringIO cannot be passed as stdin (it has no file descriptor);
# pass the text via input= instead (requires Python 3.4+).
synnet_output = check_output(["syntaxnet/demo.sh"],
                             input=text.encode(),
                             stderr=DEVNULL)
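On Python 2, where check_output() has no input= parameter, a rough equivalent (my sketch, not from the original answer) uses Popen.communicate():

import os
from subprocess import Popen, PIPE

# communicate() feeds the text on stdin and collects stdout without
# deadlocking on full pipe buffers.
with open(os.devnull, 'wb') as devnull:
    proc = Popen(["syntaxnet/demo.sh"], stdin=PIPE, stdout=PIPE, stderr=devnull)
    synnet_output, _ = proc.communicate(text)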
The problem was special characters appearing in the text string that I was feeding to demo.sh. I solved it by writing the text to a temporary file and piping the contents of that file to demo.sh.
That is:
try:
    def parse(text_list):
        text = '\n'.join(text_list)
        cwd = os.getcwd()
        with open('/tmp/data', 'w') as f:
            f.write(text)
        os.chdir("/var/www/html/alenza/hdfs/user/alenza/sree_account/sree_project/src/core/data_analysis/syntaxnet/models/syntaxnet")
        synnet_output = subprocess.check_output(["cat /tmp/data | syntaxnet/demo.sh 2>/dev/null"], shell=True)
        os.chdir(cwd)
        return synnet_output
except Exception as e:
    sys.stdout.write(str(e))
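A variant of the same idea that avoids both the fixed /tmp/data path and the shell pipeline is to hand the open temporary file to subprocess as stdin directly. The tempfile usage below is my own sketch, not part of the original fix (subprocess.DEVNULL needs Python 3.3+):

import subprocess
import tempfile

with tempfile.NamedTemporaryFile(mode='w+') as f:
    f.write(text)
    f.flush()
    f.seek(0)  # rewind so demo.sh reads from the beginning
    synnet_output = subprocess.check_output(["syntaxnet/demo.sh"],
                                            stdin=f,
                                            stderr=subprocess.DEVNULL)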
Related
I am trying to execute an exiftool command using subprocess. The command is:
['exiftool', '-ID3:Picture', '-b', '-ThumbnailImage', '/home/mediaworker/Downloads/Raabta.mp3', '>', '/mnt/share_PROXY/exifData/Raabta.jpg']
Now, the issue is that it returns status code 1. But if I execute the same command in the terminal, it runs successfully and the file is written to the location. Is my command going wrong in subprocess? The error I get when I run my Python script is:
Error: File not found - >
Error: File not found - /mnt/share_PROXY/exifData/Raabta.jpg
The code implementation is as follows:
file_name = os.path.basename(file_loc)
file_name = file_name.replace(os.path.splitext(file_name)[1], ".jpg")
dst_loc = os.path.join(dst_loc, file_name)
cmd_ = ["exiftool", "-ID3:Picture", "-b", "-ThumbnailImage", file_loc, ">", dst_loc]
logger.info("Command is {}".format(cmd_))
try:
    p = subprocess.Popen(cmd_, stdout=subprocess.PIPE)
    p.communicate()
    if p.returncode != 0:
        logger.error("Failed to write thumbnail artwork")
    else:
        id3_metadata.append({"file_thumbnail_info_path": dst_loc})
except Exception as ex:
    logger.error("[extract_iptc_metadata] Exception : '{}'".format(ex))
The error output refers to the shell redirection >.
The proper way to redirect with subprocess is to use the stdout parameter:
cmd_ = ["exiftool", "-ID3:Picture", "-b", "-ThumbnailImage", file_loc]  # no '>' or dst_loc here
with open(dst_loc, 'wb') as f:
    p = subprocess.Popen(cmd_, stdout=f)
    p.communicate()
The '>', '/mnt/share_PROXY/exifData/Raabta.jpg' part of your command is shell redirection and is a function of the command line/shell. It is not available when you execute a command from python in this way.
Alternatively, the option you want to look at is exiftool's -W (-tagOut) option, which makes exiftool write the extracted binary tags to files itself. Take exiftool's documented -preview:all example and replace -preview:all with the tag you want to extract, which would be -ThumbnailImage in this case.
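A minimal sketch of that approach (the -W filename pattern is illustrative; the paths mirror the question's):

import subprocess

# -b extracts the tag's binary data; -W makes exiftool write it to a file
# itself, so no shell redirection is involved.
cmd_ = ["exiftool", "-b",
        "-W", "/mnt/share_PROXY/exifData/%f.jpg",  # %f = source file name without extension
        "-ThumbnailImage",
        "/home/mediaworker/Downloads/Raabta.mp3"]
subprocess.check_call(cmd_)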
I have a problem where I can print the PowerShell output with the print() function, but when I try to write that same output to a file, the only thing written to the file is "0". Why is the printed output different from what gets written to the file?
I want the text file to contain exactly what the print function prints to the terminal. Why isn't it working, and how can I get it to work?
Here is the code:
import os
import time

def monitorprocess(process):
    run = True
    time_q = float(input("How many minutes before each check? "))
    while run:
        timespan = os.system(f'powershell New-TimeSpan -Start(Get-process {process}).StartTime')
        try:
            open(f'powershellpython\{process}.txt','x')
        except:
            pass
        with open(f'powershellpython\{process}.txt',"w") as file:
            file.write(str(timespan))
        print(timespan)
        time.sleep(time_q*60)

def processes():
    process = input("What is the name of your process, if you are unsure, type 'get-process', and if you want to use ID (this works with multiple processes with the same name) type ID: \n")
    if process == "get-process":
        print(os.system("powershell get-process"))
        process = input("What is the name of your process, if you are unsure, type 'get-process', and find your process: \n")
    else:
        monitorprocess(process)

processes()
The print output also includes some more fields, namely "hours" and "days", but that does not really matter in this context.
I can't test it with PowerShell because I don't use Windows, but to capture the output you should use other functions from the subprocess module,
e.g. subprocess.check_output():
import subprocess

output = subprocess.check_output(cmd, shell=True)
with open('output.txt', 'w') as file:
    file.write(output.decode())
or subprocess.run():
import subprocess
from subprocess import PIPE

output = subprocess.run(cmd, shell=True, stdout=PIPE).stdout
with open('output.txt', 'w') as file:
    file.write(output.decode())
You can even redirect run() directly to a file using stdout=:
with open('output.txt', 'w') as file:
    subprocess.run(cmd, shell=True, stdout=file)
With os.system() you can only capture the return code (exit status); to get the text into a file you would have to redirect the whole script, as in python script.py > output.txt.
What you see on screen can be produced by PowerShell.
Try
timespan = os.system(f'powershell New-TimeSpan -Start(Get-process {process}).StartTime | Format-List | Out-String')
This will no longer return a TimeSpan object; PowerShell will instead print a multiline string showing the object's properties on screen.
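Combining this with the subprocess answer above, a sketch (mine, not from either answer) that captures the formatted text and writes it to the question's file; it assumes process holds the process name, as in the question's loop:

import subprocess

# Capture PowerShell's formatted text instead of os.system's exit code.
cmd = (f'powershell "New-TimeSpan -Start (Get-Process {process}).StartTime '
       f'| Format-List | Out-String"')
result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE)
with open(f'powershellpython\\{process}.txt', 'w') as file:
    file.write(result.stdout.decode())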
I have a problem with my Python cmd script.
I don't know why it does not work. Maybe something is wrong with my code.
I'm trying to run the program on the command line through my Python script.
And I'm getting an error in bash: "sh: 1: Syntax error: redirection unexpected"
Please help, I'm just a biologist :)
I'm using Spyder (Anaconda) on Ubuntu.
#!/usr/bin/python
import sys
import os

input_ = sys.argv[1]
output_file = open(sys.argv[2],'a+')
names = input_.rsplit('.')
for name in names:
    os.system("esearch -db pubmed -query %s | efetch -format xml | xtract -pattern PubmedArticle -element AbstractText >> %s" % (name, output_file))
    print("------------------------------------------")
output_file is a file object. When you do "%s" % output_file, the resulting string is something like "<open file 'filename', mode 'a+' at 0x7f1234567890>". This means that the os.system call is running a command like
command... >> <open file 'filename', mode 'a+' at 0x7f1234567890>
The < after the >> causes the "Syntax error: redirection unexpected" error message.
To fix that, don't open the output file in your Python script, just use the filename:
output_file = sys.argv[2]
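Putting that together, a minimal corrected sketch of the script (the quotes around the %s values are my own precaution against shell metacharacters, not part of the original answer):

#!/usr/bin/python
import sys
import os

input_ = sys.argv[1]
output_file = sys.argv[2]  # keep the file name; the shell handles the appending

for name in input_.rsplit('.'):
    os.system('esearch -db pubmed -query "%s" | efetch -format xml | '
              'xtract -pattern PubmedArticle -element AbstractText >> "%s"'
              % (name, output_file))
    print("------------------------------------------")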
I got a similar error on the following line:
os.system('logger Status changed on %s' % repr(datetime.now()))
Indeed, as nomadictype stated, the problem is in running a plain OS command. The command may include special characters; in my case this was <.
So instead of changing the OS command significantly, I just added quotes and it works:
os.system('logger "Status changed on %s"' % repr(datetime.now()))
The quotes keep the shell from interpreting special characters inside the passed parameter.
I am trying to write a script where I pass a file name as an argument from a shell script to a Python script, and the Python script processes that file. It gives me a KeyError, but if I run the same script with the file name hardcoded it works fine.
#!/bin/sh

LOCKFILE=./test.txt
if [ -e ${LOCKFILE} ] && kill -0 `cat ${LOCKFILE}`; then
    echo "already running"
    exit
fi
trap "rm -f ${LOCKFILE}; exit" INT TERM EXIT
echo $$ > ${LOCKFILE}

# do stuff
FILES=/home/sugoi/script/csv/*
for file in $FILES
do
    python ./csvTest.py $file
    #mv $file ./archive
done

rm -f ${LOCKFILE}
exit
Python:
from pymongo import MongoClient
import csv
import json
import sys

client = MongoClient()
db = client.test

for arg in sys.argv:
    try:
        csvfile = open(arg, 'r')  # if I hardcode the file name here it works fine
    except IOError as e:
        # write to error log
        sys.exit(100)
    reader = csv.DictReader(csvfile)
    header = reader.next()
    for each in reader:
        row = {}
        for field in header:
            row[field] = each[field]
        db.test.update({"_id": row["CustomerId"]}, {"$push": {"activities": {"action": row["Action"], "date": row["Timestamp"], "productId": row["productId"]}}}, True)
What am I doing wrong?
Two issues.
Your shell script isn't expanding the file list correctly.
FILES=/home/sugoi/script/csv/* needs to be something like:
FILES=`ls -1 /home/sugoi/script/csv/*;`
Your argument to the Python script will only be one file at a time, so why loop through sys.argv?
Just use the argument itself, sys.argv[1]. As #Brian Besmanoff pointed out, that needs to be indexed 1 because the script name itself is stored in sys.argv[0].
try:
    csvfile = open(sys.argv[1], 'r')
except IOError as e:
    (...)
Finally: you can list directories with Python itself instead of looping in a shell script. Look at the os module, particularly os.listdir(). With a little more work you can have the whole thing running inside one Python script instead of juggling between the shell and a called script, as in the sketch below.
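A minimal sketch of that all-in-one approach (the directory path matches the shell script above; process_csv is a hypothetical helper wrapping the DictReader/Mongo logic from the question):

import os

csv_dir = "/home/sugoi/script/csv"

for name in os.listdir(csv_dir):
    path = os.path.join(csv_dir, name)
    if os.path.isfile(path):
        process_csv(path)  # hypothetical: the DictReader/update logic from above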
The first value in sys.argv is going to be the name of the script itself.
I write lots of small scripts to manipulate files on a Bash-based server. I would like to have a mechanism by which to log which commands created which files in a given directory. However, I don't just want to capture every input command, all the time.
Approach 1: a wrapper script that uses a Bash builtin (a la history or fc -ln -1) to grab the last command and write it to a log file. I have not been able to figure out any way to do this, as the shell builtin commands do not appear to be recognized outside of the interactive shell.
Approach 2: a wrapper script that pulls from ~/.bash_history to get the last command. This, however, requires setting up the Bash shell to flush every command to history immediately (as per this comment) and seems also to require that the history be allowed to grow inexorably. If this is the only way, so be it, but it would be great to avoid having to edit the ~/.bashrc file on every system where this might be implemented.
Approach 3: use script. My problem with this is that it requires multiple commands to start and stop the logging, and because it launches its own shell it is not callable from within another script (or at least, doing so complicates things significantly).
I am trying to figure out an implementation that's of the form log_this.script other_script other_arg1 other_arg2 > file, where everything after the first argument is logged. The emphasis here is on efficiency and minimizing syntax overhead.
EDIT: iLoveTux and I both came up with similar solutions. For those interested, my own implementation follows. It is somewhat more constrained in its functionality than the accepted answer, but it also auto-updates any existing logfile entries with changes (though not deletions).
Sample usage:
$ cmdlog.py "python3 test_script.py > test_file.txt"
creates a log file in the parent directory of the output file with the following:
2015-10-12#10:47:09 test_file.txt "python3 test_script.py > test_file.txt"
Additional file changes are added to the log:
$ cmdlog.py "python3 test_script.py > test_file_2.txt"
the log now contains
2015-10-12#10:47:09 test_file.txt "python3 test_script.py > test_file.txt"
2015-10-12#10:47:44 test_file_2.txt "python3 test_script.py > test_file_2.txt"
Running on the original file name again changes the file order in the log, based on modification time of the files:
$ cmdlog.py "python3 test_script.py > test_file.txt"
produces
2015-10-12#10:47:44 test_file_2.txt "python3 test_script.py > test_file_2.txt"
2015-10-12#10:48:01 test_file.txt "python3 test_script.py > test_file.txt"
Full script:
#!/usr/bin/env python3
'''
A wrapper script that will write the command-line
args associated with any files generated to a log
file in the directory where the files were made.
'''
import sys
import os
from os import listdir
from os.path import isfile, join
import subprocess
import time
from datetime import datetime


def listFiles(mypath):
    """
    Return relative paths of all files in mypath
    """
    return [join(mypath, f) for f in listdir(mypath) if
            isfile(join(mypath, f))]


def read_log(log_file):
    """
    Reads a file history log and returns a dictionary
    of {filename: command} entries.
    Expects tab-separated lines of [time, filename, command]
    """
    entries = {}
    with open(log_file) as log:
        for l in log:
            l = l.strip()
            mod, name, cmd = l.split("\t")
            # cmd = cmd.lstrip("\"").rstrip("\"")
            entries[name] = [cmd, mod]
    return entries


def time_sort(t, fmt):
    """
    Turn a strftime-formatted string into a tuple
    of time info
    """
    parsed = datetime.strptime(t, fmt)
    return parsed


ARGS = sys.argv[1]
ARG_LIST = ARGS.split()

# Guess where logfile should be put
if ">" in ARG_LIST or ">>" in ARG_LIST:
    # Get position after the last redirect in arg list
    redirect_index = max(ARG_LIST.index(e) for e in ARG_LIST if e in (">", ">>"))
    output = ARG_LIST[redirect_index + 1]
    output = os.path.abspath(output)
    out_dir = os.path.dirname(output)
elif "cp" in ARG_LIST or "mv" in ARG_LIST:
    output = ARG_LIST[-1]
    out_dir = os.path.dirname(output)
else:
    out_dir = os.getcwd()

# Set logfile location within the inferred output directory
LOGFILE = out_dir + "/cmdlog_history.log"

# Get file list state prior to running
all_files = listFiles(out_dir)
pre_stats = [os.path.getmtime(f) for f in all_files]

# Run the desired external commands
subprocess.call(ARGS, shell=True)

# Get done time of external commands
TIME_FMT = "%Y-%m-%d#%H:%M:%S"
log_time = time.strftime(TIME_FMT)

# Get existing entries from logfile, if present
if LOGFILE in all_files:
    logged = read_log(LOGFILE)
else:
    logged = {}

# Get file list state after run is complete
post_stats = [os.path.getmtime(f) for f in all_files]
post_files = listFiles(out_dir)

# Find files whose states have changed since the external command
changed = [e[0] for e in zip(all_files, pre_stats, post_stats) if e[1] != e[2]]
new = [e for e in post_files if e not in all_files]
all_modded = list(set(changed + new))

if not all_modded:  # exit early, no need to log
    sys.exit(0)

# Replace files that have changed, add those that are new
for f in all_modded:
    name = os.path.basename(f)
    logged[name] = [ARGS, log_time]

# Write changed files to logfile
with open(LOGFILE, 'w') as log:
    for name, info in sorted(logged.items(), key=lambda x: time_sort(x[1][1], TIME_FMT)):
        cmd, mod_time = info
        if not cmd.startswith("\""):
            cmd = "\"{}\"".format(cmd)
        log.write("\t".join([mod_time, name, cmd]) + "\n")

sys.exit(0)
You can use the tee command, which stores its standard input to a file and outputs it on standard output. Pipe the command line into tee, and pipe tee's output into a new invocation of your shell:
echo '<command line to be logged and executed>' | \
tee --append /path/to/your/logfile | \
$SHELL
i.e., for your example of other_script other_arg1 other_arg2 > file,
echo 'other_script other_arg1 other_arg2 > file' | \
tee --append /tmp/mylog.log | \
$SHELL
If your command line needs single quotes, they need to be escaped properly.
OK, so you don't mention Python in your question, but it is tagged Python, so I figured I would see what I could do. I came up with this script:
import sys
from os.path import expanduser, join
from subprocess import Popen, PIPE


def issue_command(command):
    # shell=True expects a single command string, so the caller joins the args
    process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
    return process.communicate()


home = expanduser("~")
log_file = join(home, "command_log")
command = sys.argv[1:]

with open(log_file, "a") as fout:
    fout.write("{}\n".format(" ".join(command)))

out, err = issue_command(" ".join(command))
which you can call like this (if you name it log_this and make it executable):
$ log_this echo hello world
and it will put "echo hello world" in the file ~/command_log. Note, though, that if you want to use pipes or redirection you have to quote your command (this may or may not be a deal-breaker for your use case; I haven't figured out how to avoid the quotes yet), like this:
$ log_this "echo hello world | grep h >> /tmp/hello_world"
but since it's not perfect, I thought I would add a little something extra.
The following script allows you to specify a different file to log your commands to as well as record the execution time of the command:
#!/usr/bin/env python
from subprocess import Popen, PIPE
import argparse
from os.path import expanduser, join
from time import time


def issue_command(command):
    # shell=True expects a single command string, so the caller joins the args
    process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
    return process.communicate()


home = expanduser("~")
default_file = join(home, "command_log")

parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", type=argparse.FileType("a"), default=default_file)
parser.add_argument("-p", "--profile", action="store_true")
parser.add_argument("command", nargs=argparse.REMAINDER)
args = parser.parse_args()

if args.profile:
    start = time()
    out, err = issue_command(" ".join(args.command))
    runtime = time() - start
    entry = "{}\t{}\n".format(" ".join(args.command), runtime)
    args.file.write(entry)
else:
    out, err = issue_command(" ".join(args.command))
    entry = "{}\n".format(" ".join(args.command))
    args.file.write(entry)

args.file.close()
You would use this the same way as the other script, but if you want to log to a different file, just pass -f <FILENAME> before your actual command and your log will go there; if you want to record the execution time, provide -p (for profile) before your actual command, like so:
$ log_this -p -f ~/new_log "echo hello world | grep h >> /tmp/hello_world"
I will try to make this better, but if you can think of anything else this could do for you, I am making a GitHub project for this where you can submit bug reports and feature requests.