Calling a perl file with arguments from for loop - python

My title might be misleading, as it's only part of my question.
My current code is shown below; it is not complete, as I am not good with for loops.
import os
import subprocess

listfile = []
path = "C:\\Users\\A\\Desktop\\Neuer Ordner (3)"
for f in os.listdir(path):
    if f.endswith(".par"):
        listfile.append(f)
print listfile
Its output is shown below:
C:\app\Tools\exam\Python25>python nwe.py
['abc.par', 'abc2.par', 'abc3.par', 'abc4.par', 'abc5.par', 'abc6.par', 'abc7.par', 'abc8.par', 'abc9.par']
Now I have to call a Perl file, say "Newperl.pl", with arguments:
arg1 as abc.par and arg2 as abc.svl
arg1 as abc2.par and arg2 as abc2.svl
arg1 as abc3.par and arg2 as abc3.svl
It has to take all the files that end with the ".par" extension and pass each filename as arguments, adding ".par" for the 1st argument and ".svl" for the 2nd argument.
I can use something like the loop below:
for i in range(0, len(listfile), 1):
    newcode = subprocess.call()
The 2nd line calls the Perl file, but I am not sure how to split off the ".par", add ".par" and ".svl" back, and pass each pair one at a time.
Any logic or help would be appreciated.

You could write your subprocess call as below:
import os
import subprocess

for l in listfile:
    newcode = subprocess.call("perl newperl.pl " + l + " " + os.path.splitext(l)[0] + ".svl")
os.path.splitext removes the ".par" extension, and ".svl" is appended in its place.
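If you prefer to avoid building a command string by hand, a list-based call works too (a sketch using the same newperl.pl name; listfile is assumed to hold bare file names, as in the question):
import os
import subprocess

for l in listfile:
    base = os.path.splitext(l)[0]  # 'abc.par' -> 'abc'
    # passing a list avoids quoting problems if a file name contains spaces
    subprocess.call(["perl", "newperl.pl", base + ".par", base + ".svl"])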

Related

How to print the first N lines of a file in python with N as argument

How would I go about getting the first N lines of a text file in Python, with N given as an argument?
usage:
python file.py datafile -N 10
My code
import sys
from itertools import islice

args = sys.argv
print(args)
if args[1] == '-h':
    print("-N for printing the number of lines: python file.py datafile -N 10")
if args[-2] == '-N':
    datafile = args[1]
    number = int(args[-1])
    with open(datafile) as myfile:
        head = list(islice(myfile, number))
    head = [item.strip() for item in head]
    print(head)
    print('\n'.join(head))
I wrote this program; can you let me know if there is a better way than this code?
Assuming that the print_head logic you've implemented need not be altered, here's the script I think you're looking for:
import sys
from itertools import islice

def print_head(file, n):
    if not file or not n:
        return
    with open(file) as myfile:
        head = [item.strip() for item in islice(myfile, n)]
    print(head)

def parse_args():
    result = {'script': sys.argv[0]}
    args = iter(sys.argv)
    for arg in args:
        if arg == '-F':
            result['filename'] = next(args)
        if arg == '-N':
            result['num_lines'] = int(next(args))
    return result

if __name__ == '__main__':
    script_args = parse_args()
    print_head(script_args.get('filename', ''), script_args.get('num_lines', 0))
Running the script
python file.py -F datafile -N 10
Note: the best way to implement this would be to use the argparse library.
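For reference, a minimal argparse version might look like this (a sketch; the -F/-N flag names and the default of 10 lines are carried over from the example above, not required by argparse):
import argparse
from itertools import islice

parser = argparse.ArgumentParser(description='Print the first N lines of a file.')
parser.add_argument('-F', dest='filename', required=True, help='file to read')
parser.add_argument('-N', dest='num_lines', type=int, default=10, help='number of lines to print')
args = parser.parse_args()

with open(args.filename) as myfile:
    # read only the first N lines
    head = [line.strip() for line in islice(myfile, args.num_lines)]
print(head)
Called as python file.py -F datafile -N 10 it behaves like the hand-rolled parser above, and you also get -h for free.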
You can access arguments passed to the script through sys:
sys.argv
The list of command line arguments passed to a Python script. argv[0] is the script name (it is operating system dependent whether this is a full pathname or not). If the command was executed using the -c command line option to the interpreter, argv[0] is set to the string '-c'. If no script name was passed to the Python interpreter, argv[0] is the empty string.
So in code it would look like this:
import sys
print("All of argv")
print(sys.argv)
print("Last element every time")
print(sys.argv[-1])
Reading the documentation you'll see that the first values stored in sys.argv vary according to how the user calls the script. If you run the code I pasted with different types of calls you can see for yourself the kind of values stored.
For a basic first approach: access n through sys.argv[-1], which returns the last element every time, assuming it is passed last. You still have to do a try and beg for forgiveness to make sure the argument passed is a number. For that you would have:
import sys

try:
    n = int(sys.argv[-1])
except ValueError as v_e:
    print(f"Please pass a valid number as argument, not {sys.argv[-1]}")
That's pretty much it. Obviously, it's quite basic, you can improve this even more by having the users pass values with flags, like --skip-lines 10 and that would be your n, and it could be in any place when executing the script. I'd create a function in charge of translating sys.argv into a key,value dictionary for easy access within the script.
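A rough sketch of such a helper (purely illustrative; the --skip-lines flag name and the dictionary keys are my assumptions):
import sys

def parse_flags(argv):
    # turn ['prog', '--skip-lines', '10'] into {'skip-lines': '10'}
    flags = {}
    args = iter(argv[1:])  # skip the script name
    for arg in args:
        if arg.startswith('--'):
            flags[arg[2:]] = next(args, None)
    return flags

flags = parse_flags(sys.argv)
n = int(flags.get('skip-lines') or 10)  # fall back to a default when the flag is absent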
Arguments are available via the sys module.
Example 1: ./file.py datafile 10
#!/usr/bin/env python3
import sys

filename = sys.argv[1]
N = int(sys.argv[2])

with open(filename) as myfile:
    head = myfile.readlines()[0:N]
print(head)
Example 2: ./file.py datafile --N 10
If you want to pass multiple optional arguments you should have a look at the argparse package.
#!/usr/bin/env python3
import argparse

parser = argparse.ArgumentParser(description='Read head of file.')
parser.add_argument('file', help='Textfile to read')
parser.add_argument('--N', type=int, default=10, help='Number of lines to read')
args = parser.parse_args()

with open(args.file) as myfile:
    head = myfile.readlines()[0:args.N]
print(head)

Running a python scripts and catching argument into another python script

I am unable to get this working.
I have something like this:
python executor.py arg1 arg2 arg3
And then I have another Python script which is not directly called by executor.py but by some other script file which is itself called by executor.py. Let's call it
script.py
There is a variable named argument in which I want to catch arg1.
How do I do it?
I am assuming you are doing import script within the executor.py script. If this is the case you have to add to executor.py:
import script
import sys
argument_1 = sys.argv[1] # since [0] is the script name
...
script.yourfunction(argument_1)
assuming script.py and executor.py are in the same folder:
script.py:
def function1(somearg, otherarg):
    pass

def function2(moreargs):
    pass
and executor.py
import sys
import script
# assign input args; check for valid arguments etc..
arg1 = sys.argv[1]
arg2 = sys.argv[2]
etc...
# call system functions
script.function1(arg1, arg2)
script.function2(arg1)
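Since the question says script.py is not called by executor.py directly but through an intermediate script, here is a minimal sketch of one way to pass arg1 down that chain (the file name middle.py and the function names are hypothetical):
# executor.py
import sys
import middle

middle.run(sys.argv[1])          # forward arg1 to the intermediate script

# middle.py
import script

def run(value):
    script.do_work(value)        # hand the value on to script.py

# script.py
def do_work(argument):
    print(argument)              # 'argument' now holds arg1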

python script using subprocess, redirect ALL output to file

I am writing something for static analysis of source code in different languages. As everything has to be open source and callable from the command line, I have downloaded one tool per language. So I decided to write a Python script that lists all source files in a project folder and calls the respective tool.
So part of my code looks like this:
import os
import sys
import subprocess
from subprocess import call
from pylint.lint import Run as pylint

class Analyser:
    def __init__(self, source=os.getcwd(), logfilename=None):
        # doing initialization stuff
        self.logfilename = logfilename or 'CodeAnalysisReport.log'
        self.listFiles()
        self.analyseFiles()

    def listFiles(self):
        # lists all source files in the specified directory

    def analyseFiles(self):
        self.analysePythons()
        self.analyseCpps()
        self.analyseJss()
        self.analyseJavas()
        self.analyseCs()

if __name__ == '__main__':
    Analyser()
Let's have a look at the C++ files part (I use Cppcheck to analyse those):
def analyseCpps(self):
    for sourcefile in self.files['.cc'] + self.files['.cpp']:
        print '\n'*2, '*'*70, '\n', sourcefile
        call(['C:\\CodeAnalysis\\cppcheck\\cppcheck', '--enable=all', sourcefile])
The console output for one of the files (it's just a random downloaded file) is:
**********************************************************************
C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc
Checking C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc...
[C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc:18]: (style) The scope of the variable 'oldi' can be reduced.
[C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc:43]: (style) The scope of the variable 'lastbit' can be reduced.
[C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc:44]: (style) The scope of the variable 'two_to_power_i' can be reduced.
(information) Cppcheck cannot find all the include files (use --check-config for details)
Lines 1 and 2 come from my script, lines 3 to 7 come from Cppcheck.
And this is what I want to save to my log file, for all the other files too. Everything in one single file.
Of course I have searched SO and found some methods, but none of them works completely.
First try:
Adding sys.stdout = open(self.logfilename, 'w') to my constructor. This makes lines 1 and 2 of the output shown above be written to my log file. The rest is still shown on the console.
Second try:
Additionally, in analyseCpps I use:
call(['C:\CodeAnalysis\cppcheck\cppcheck', '--enable=all', sourcefile], stdout=sys.stdout)
This makes my log file look like this:
Checking C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc...
**********************************************************************
C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc
and the console output is:
[C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc:18]: (style) The scope of the variable 'oldi' can be reduced.
[C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc:43]: (style) The scope of the variable 'lastbit' can be reduced.
[C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc:44]: (style) The scope of the variable 'two_to_power_i' can be reduced.
Not what I want.
Third try:
Using Popen with pipe. sys.stdout is back to default.
As preliminary work analyseCpps now is:
for sourcefile in self.files['.cc'] + self.files['.cpp']:
    print '\n'*2, '*'*70, '\n', sourcefile
    p = subprocess.Popen(['C:\\CodeAnalysis\\cppcheck\\cppcheck', '--enable=all', sourcefile], stdout=subprocess.PIPE)
    p.stdout.read()
p.stdout.read() shows only the last line of my desired output (line 7 in code box 3)
Fourth try:
Using subprocess.Popen(['C:\CodeAnalysis\cppcheck\cppcheck', '--enable=all', sourcefile], stdout=open(self.logfilename, 'a+')) just writes the one line Checking C:\CodeAnalysis\testproject\cpp\BiggestUnInt.cc... to my logfile, the rest is shown on the console.
Fifth try:
Instead of subprocess.Popen I use os.system, so my calling command is:
os.system('C:\CodeAnalysis\cppcheck\cppcheck --enable=all %s >> %s' % (sourcefile, self.logfilename))
This results in the same log file as my fourth try. If I type the same command directly in the Windows console the result is the same. So I guess it is not exactly a Python problem, but still:
If it is on the console there must be a way to put it in a file. Any ideas?
E D I T
Foolish me. I'm still a noob, so I forgot about stderr. That's where the decisive messages are going.
So now I have:
def analyseCpps(self):
    for sourcefile in self.files['.cc'] + self.files['.cpp']:
        p = subprocess.Popen(['C:\\CodeAnalysis\\cppcheck\\cppcheck', '--enable=all', sourcefile], stderr=subprocess.PIPE)
        with open(self.logfilename, 'a+') as logfile:
            logfile.write('%s\n%s\n' % ('*'*70, sourcefile))
            for line in p.stderr.readlines():
                logfile.write('%s\n' % line.strip())
and it's working fine.
ANOTHER EDIT
According to Didier's answer:
with sys.stdout = open(self.logfilename, 'w', 0) in my constructor:
def analyseCpps(self):
    for sourcefile in self.files['.cc'] + self.files['.cpp']:
        print '\n'*2, '*'*70, '\n', sourcefile
        p = subprocess.Popen(['C:\\CodeAnalysis\\cppcheck\\cppcheck', '--enable=all', sourcefile], stdout=sys.stdout, stderr=sys.stdout)
There are several problems:
- you should redirect both stdout and stderr
- you should use unbuffered files if you want to mix normal print and the output of launched commands.
Something like this:
import sys, subprocess
# Note the 0 here (unbuffered file)
sys.stdout = open("mylog","w",0)
print "Hello"
print "-----"
subprocess.call(["./prog"],stdout=sys.stdout, stderr=sys.stdout)
print "-----"
subprocess.call(["./prog"],stdout=sys.stdout, stderr=sys.stdout)
print "-----"
print "End"
You need to redirect stderr too; you can use STDOUT or pass the file object to stderr=:
from subprocess import check_call, STDOUT

with open("log.txt", "w") as f:
    for sourcefile in self.files['.cc'] + self.files['.cpp']:
        check_call(['C:\\CodeAnalysis\\cppcheck\\cppcheck', '--enable=all', sourcefile],
                   stdout=f, stderr=STDOUT)
Try to redirect stdout and stderr to a logfile:
import subprocess
from subprocess import call

def analyseCpps(self):
    with open("logfile.txt", "w") as logfile:
        for sourcefile in self.files['.cc'] + self.files['.cpp']:
            print '\n'*2, '*'*70, '\n', sourcefile
            call(['C:\\CodeAnalysis\\cppcheck\\cppcheck',
                  '--enable=all', sourcefile],
                 stdout=logfile, stderr=subprocess.STDOUT)
In this example the filename is hardcoded, but you should be able to change that easily (to your self.logfilename or similar).
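For instance, a small variation using the instance's configured log file might look like this (a sketch reusing self.logfilename from the question's constructor; the file is opened in append mode so every source file ends up in one report):
import subprocess
from subprocess import call

def analyseCpps(self):
    with open(self.logfilename, 'a') as logfile:
        for sourcefile in self.files['.cc'] + self.files['.cpp']:
            # write the header lines to the log as well, not to the console
            logfile.write('%s\n%s\n' % ('*' * 70, sourcefile))
            logfile.flush()  # make sure the header appears before cppcheck's output
            call(['C:\\CodeAnalysis\\cppcheck\\cppcheck', '--enable=all', sourcefile],
                 stdout=logfile, stderr=subprocess.STDOUT)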

Parsing cmd args like typical filter programs

I spent a few hours reading tutorials about argparse and managed to learn to use normal parameters. The official documentation is not very readable to me. I'm new to Python. I'm trying to write a program that could be invoked in the following ways:
cat inFile | program [options] > outFile -- If no inFile or outFile is specified, read from stdin and write to stdout.
program [options] inFile outFile
program [options] inFile > outFile -- If only one file is specified, it is the input, and output should go to stdout.
cat inFile | program [options] - outFile -- If '-' is given in place of inFile, read from stdin.
program [options] /path/to/folder outFile -- Process all files from /path/to/folder and its subdirectories.
I want it to behave like regular cli program under GNU/Linux.
It would be also nice if the program would be able to be invoked:
program [options] inFile0 inFile1 ... inFileN outFile -- first path/file always interpreted as input, last one always interpreted as output. Any additional ones interpreted as inputs.
I could probably write dirty code that would accomplish this but this is going to be used, so someone will end up maintaining it (and he will know where I live...).
Any help/suggestions are much appreciated.
Combining answers and some more knowledge from the Internet, I've managed to write this (it does not accept multiple inputs, but this is enough):
import sys, argparse, os.path, glob

def inputFile(path):
    if path == "-":
        return [sys.stdin]
    elif os.path.exists(path):
        if os.path.isfile(path):
            return [path]
        else:
            return [y for x in os.walk(path) for y in glob.glob(os.path.join(x[0], '*.dat'))]
    else:
        exit(2)

def main(argv):
    cmdArgsParser = argparse.ArgumentParser()
    cmdArgsParser.add_argument('inFile', nargs='?', default='-', type=inputFile)
    cmdArgsParser.add_argument('outFile', nargs='?', default='-', type=argparse.FileType('w'))
    cmdArgs = cmdArgsParser.parse_args()
    print cmdArgs.inFile
    print cmdArgs.outFile

if __name__ == "__main__":
    main(sys.argv[1:])
Thank you!
You need a positional argument (name not starting with a dash), optional arguments (nargs='?'), a default argument (default='-'). Additionally, argparse.FileType is a convenience factory to return sys.stdin or sys.stdout if - is passed (depending on the mode).
All together:
#!/usr/bin/env python
import argparse

# default argument is sys.argv[0]
parser = argparse.ArgumentParser('foo')
parser.add_argument('in_file', nargs='?', default='-', type=argparse.FileType('r'))
parser.add_argument('out_file', nargs='?', default='-', type=argparse.FileType('w'))

def main():
    # default argument is sys.argv[1:]
    args = parser.parse_args(['bar', 'baz'])
    print(args)
    args = parser.parse_args(['bar', '-'])
    print(args)
    args = parser.parse_args(['bar'])
    print(args)
    args = parser.parse_args(['-', 'baz'])
    print(args)
    args = parser.parse_args(['-', '-'])
    print(args)
    args = parser.parse_args(['-'])
    print(args)
    args = parser.parse_args([])
    print(args)

if __name__ == '__main__':
    main()
I'll give you a starter script to play with. It uses optionals rather than positionals, and only one input file. But it should give a taste of what you can do.
import argparse

parser = argparse.ArgumentParser()
inarg = parser.add_argument('-i', '--infile', type=argparse.FileType('r'), default='-')
outarg = parser.add_argument('-o', '--outfile', type=argparse.FileType('w'), default='-')
args = parser.parse_args()
print(args)

cnt = 0
for line in args.infile:
    print(cnt, line)
    args.outfile.write(line)
    cnt += 1
When called without arguments, it just echos your input (after ^D). I'm a little bothered that it doesn't exit until I issue another ^D.
FileType is convenient, but it has a major fault: it opens the files, but you have to close them yourself, or let Python do so when exiting. There's also the complication that you don't want to close stdin/stdout.
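To illustrate that point, here is a small sketch (my addition, not part of the answer) that closes what FileType opened while leaving the standard streams alone:
import sys
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-i', '--infile', type=argparse.FileType('r'), default='-')
parser.add_argument('-o', '--outfile', type=argparse.FileType('w'), default='-')
args = parser.parse_args()

# ... use args.infile and args.outfile here ...

# close only real files; never close stdin/stdout
for f in (args.infile, args.outfile):
    if f not in (sys.stdin, sys.stdout):
        f.close()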
The best argparse questions include a basic script and specific questions on how to correct or improve it. Your specs are reasonably clear, but it would be nice if you gave us more to work with.
To handle the subdirectories option, I would skip the FileType bit. Use argparse to get two lists of strings (or a list and a name), and then do the necessary chdir and/or glob to find and iterate over files. Don't expect argparse to do the actual work; use it to parse the command-line strings. Here is a sketch of such a script, leaving most details for you to fill in.
import argparse
import os
import sys  # for stdin/out
....

def open_output(outfile):
    # function to open a file for writing
    # should handle '-'
    # return a file object

def glob_dir(adir):
    # function to glob a dir
    # return a list of files ready to open

def open_forread(afilename):
    # function to open file for reading
    # be sensitive to '-'

def walkdirs(alist):
    outlist = []
    for name in alist:
        if <name is a file>:
            outlist.append(name)
        elif <name is a dir>:
            glist = glob_dir(name)
            outlist.extend(glist)
        else:
            <error>
    return outlist

def cat(infile, outfile):
    <do your thing here>

def main(args):
    # handle args options
    filelist = walkdirs(args.inlist)
    fout = open_output(args.outfile)
    for name in filelist:
        fin = open_forread(name)
        cat(fin, fout)
        if <fin is not stdin>: fin.close()
    if <fout is not stdout>: fout.close()

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('inlist', nargs='*')
    parser.add_argument('outfile')
    # add options
    args = parser.parse_args()
    main(args)
The parser here requires you to give it an outfile name, even if it is '-'. I could define its nargs='?' to make it optional. But that does not play nicely with the inlist '*'.
Consider
myprog one two three
Is that
namespace(inlist=['one','two','three'], outfile=default)
or
namespace(inlist=['one','two'], outfile='three')
With both a * and ? positional, the identity of the last string is ambiguous - is it the last entry for inlist, or the optional entry for outfile? argparse chooses the former, and never assigns the value to outfile.
With --infile, --outfile definitions, the allocation of these strings is clear.
In a sense this problem is too complex for argparse - there's nothing in it to handle things like directories. In another sense it is too simple. You could just as easily split sys.argv[1:] between inlist and outfile without the help of argparse.
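For completeness, a sketch of that last suggestion, splitting sys.argv[1:] by hand (an illustration, not part of the original answer):
import sys

argv = sys.argv[1:]
if len(argv) < 2:
    sys.exit("usage: prog inFile [inFile ...] outFile")
# every argument except the last is an input; the last one is the output
inlist, outfile = argv[:-1], argv[-1]
print(inlist, outfile)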

Quoted string as optional argument to Python script

I have simple utility script that I use to download files given a URL. It's basically just a wrapper around the Linux binary "aria2c".
Here is the script named getFile:
#!/usr/bin/python
#
# SCRIPT NAME: getFile
# PURPOSE: Download a file using up to 20 concurrent connections.
#
import os
import sys
import re
import subprocess

try:
    fileToGet = sys.argv[1]
    if os.path.exists(fileToGet) and not os.path.exists(fileToGet+'.aria2'):
        print 'Skipping already-retrieved file: ' + fileToGet
    else:
        print 'Downloading file: ' + fileToGet
        subprocess.Popen(["aria2c-1.8.0", "-s", "20", str(fileToGet), "--check-certificate=false"]).wait()  # SSL
except IndexError:
    print 'You must enter a URI.'
So, for example, this command would download a file:
$ getFile http://upload.wikimedia.org/wikipedia/commons/8/8e/Self-portrait_with_Felt_Hat_by_Vincent_van_Gogh.jpg
What I want to do is permit an optional second argument (after the URI) that is a quoted string. This string will be the new filename of the downloaded file. So, after the download finishes, the file is renamed according to the second argument. Using the example above, I would like to be able to enter:
$ getFile http://upload.wikimedia.org/wikipedia/commons/8/8e/Self-portrait_with_Felt_Hat_by_Vincent_van_Gogh.jpg "van-Gogh-painting.jpg"
But I don't know how to take a quoted string as an optional argument. How can I do this?
Just test the length of sys.argv; if it is more than 2 you have an extra argument:
if len(sys.argv) > 2:
    filename = sys.argv[2]
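Building on that, here is a sketch of how the rename could work once the download finishes (the assumption that aria2c saves the file under the last path component of the URL is mine; adjust it if you pass an explicit output name to aria2c):
import os
import sys
import subprocess

fileToGet = sys.argv[1]
newName = sys.argv[2] if len(sys.argv) > 2 else None

subprocess.Popen(["aria2c-1.8.0", "-s", "20", fileToGet, "--check-certificate=false"]).wait()

if newName:
    # aria2c normally names the download after the last part of the URL
    downloaded = fileToGet.rsplit('/', 1)[-1]
    os.rename(downloaded, newName)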
The shell will pass it as a second argument (normally) if you provide spaces between them.
For example, here is test.py:
import sys
for i in sys.argv:
    print(i)
And here is the result:
$ python test.py url "folder_name"
test.py
url
folder_name
The quotes don't matter at all, as they are handled by the shell, not Python. To get it, just take sys.argv[2].
Hope this helps!
