I am not sure why the code below does not work. I get the error
NameError: name 'group1' is not defined.
The code worked fine before I tried to use getopt. I am trying to parse the command-line input so that, e.g., if I run
python -q file1 file2 -r file3 file4
file1 and file2 become the input to my first loop as 'group1'.
import sys
import csv
import vcf
import getopt
# set up the args
try:
    opts, args = getopt.getopt(sys.argv[1:], 'q:r:h', ['query', 'reference', 'help'])
except getopt.GetoptError as err:
    print str(err)
    sys.exit(2)
for opt, arg in opts:
    if opt in ('-h', '--help'):
        print "Usage python -q [query files] -r [reference files]"
        print "-h this help message"
    elif opt in ('-q', '--query'):
        group1 = arg
    elif opt in ('-r', '--reference'):
        group2 = arg
    else:
        print "check your args"
# extract core snps from query file, saving these to the set universal_snps
snps = []
outfile = sys.argv[1]
for variants in group1:
    vcf_reader = vcf.Reader(open(variants))
The problem is that group1 = arg never runs, so when execution later reaches for variants in group1:, the variable is not defined.
This is because the way you call the script does not match the way you defined your options. When you have the line:
opts, args = getopt.getopt(sys.argv[1:], 'q:r:h', ['query', 'reference', 'help'])
each flag takes exactly one value, and the arguments with flags (i.e. -q file1 and -r file3) must be specified before any other arguments, because getopt stops parsing options at the first argument that is not an option. Therefore, if you were to call the script as:
python <scriptName> -q file1 -r file3 file2 file4
both options would be parsed as intended. All the parameters without an associated flag appear at the end of the call (and are retrievable through the args list).
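To make the ordering rule concrete, here is a minimal sketch (Python 3, with the argument lists hard-coded in place of sys.argv[1:]; the variable names are only illustrative) comparing the two call styles:

```python
import getopt

# The call from the question: "file2" is the first non-option argument,
# so getopt stops there and never sees "-r".
opts_bad, rest_bad = getopt.getopt(
    ["-q", "file1", "file2", "-r", "file3", "file4"], "q:r:h"
)
print(opts_bad)   # [('-q', 'file1')]
print(rest_bad)   # ['file2', '-r', 'file3', 'file4']

# Flags first, plain file names last: both options are parsed.
opts_ok, rest_ok = getopt.getopt(
    ["-q", "file1", "-r", "file3", "file2", "file4"], "q:r:h"
)
print(opts_ok)    # [('-q', 'file1'), ('-r', 'file3')]
print(rest_ok)    # ['file2', 'file4']
```

Note that each option still carries a single string, so looping over it directly would iterate over characters rather than file names.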
My problem is adding a language to every single code block in my Markdown files.
I have hundreds of files in nested directories.
The files have this form:
```language
a
```
Normal text
```
b
```
Normal text
```
c
```
Normal text
```language
d
```
and the output for each of these should be:
```ios
a
```
Normal text
```ios
b
```
Normal text
```ios
c
```
Normal text
```ios
d
```
(In this case I needed ios lang from a custom lexer I made)
I'm using Debian 11 and trying with sed, and I found that this regex
(```).*(\n.*)((\n.*)*?)\n```
could help find the blocks, but I can't work out how to use it.
I can use Python for more complex regexes and behaviour.
My Solution
WARNING!! If you have unpaired triple backticks, this code will have unwanted results! Always back up your files first!
I use bash's find to list all the files with absolute paths (for some reason I don't like relative paths, and my laziness told me not to write a recursive Python search :D),
then -exec a Python script with 2 arguments: the file name, and a suffix that is appended to the original file's name so that the original is kept under that name while the new content is written under the original file name.
The regex inside the Python script that I came up with to "add" the "ios" text to each code block (I actually replace the whole block) is:
(```).*(\n.*)((\n.*)*?)\n```
replace with
\1ios\2\3\n```
I really couldn't transform this for sed
import re
import sys, getopt
from shutil import move

def main(argv):
    inputfile = ''
    outputfile = ''
    try:
        opts, args = getopt.getopt(argv,"hi:a:",["ifile=","afile="])
    except getopt.GetoptError:
        print ('pyre.py -i <inputfile> -a <append_string>')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print ('pyre.py -i <inputfile> -a <append_string>')
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-a", "--afile"):
            outputfile = inputfile + arg
    magic(inputfile, outputfile)

def magic(inputfile, outputfile):
    regex = r"(```).*(\n.*)((\n.*)*?)\n```"
    subst = r"\1ios\2\3\n```"
    move(inputfile, outputfile)
    open(inputfile, 'w', encoding="utf-8").write(re.sub(regex, subst, open(outputfile, 'r', encoding="utf-8").read(), 0, re.MULTILINE))
    #print(f"{inputfile} DONE")

if __name__ == "__main__":
    main(sys.argv[1:])
and the actual find command:
find ~+ -name '*.md' -exec python pyre.py -i \{\} -a .new.md \;
Hope this will help someone with my same issue.
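For anyone who would rather avoid the regex, a line-by-line pass is another way to do the same job. This is only a sketch under some assumptions: fences always start at the beginning of a line and are correctly paired, and the function name tag_fences is made up for this example.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable

def tag_fences(text, lang="ios"):
    """Set lang on every opening fence; closing fences are left alone."""
    out = []
    inside = False  # True while we are between an opening and a closing fence
    for line in text.split("\n"):
        if line.startswith(FENCE):
            if not inside:
                line = FENCE + lang  # replace whatever language was there
            inside = not inside
        out.append(line)
    return "\n".join(out)

sample = "\n".join(
    [FENCE + "language", "a", FENCE, "Normal text", FENCE, "b", FENCE]
)
print(tag_fences(sample))
```

The same backup warning applies: an unpaired fence flips the inside/outside state for the rest of the file.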
I am trying to take multiple files as input from the terminal. The number of inputs may vary from at least 1 to many. Here is the input for my program:
F3.py -e <Energy cutoff> -i <inputfiles>
I want the parameter -i to take any number of values, from 1 to many, e.g.
F3.py -e <Energy cutoff> -i file1 file2
F3.py -e <Energy cutoff> -i *.pdb
Right now it takes only the first file and then stops.
This is what I have so far:
def main(argv):
    try:
        opts,args=getopt.getopt(argv,"he:i:")
        for opt,arg in opts:
            if opt=="-h":
                print 'F3.py -e <Energy cutoff> -i <inputfiles>'
                sys.exit()
            elif opt == "-e":
                E_Cut=float(arg)
                print 'minimum energy=',E_Cut
            elif opt == "-i":
                files.append(arg)
                print files
        funtction(files)
    except getopt.GetoptError:
        print 'F3.py -e <Energy cutoff> -i <inputfiles>'
        sys.exit(2)
Any help would be appreciated. Thanks
Try using @larsks's suggestion; the next snippet should work for your use case:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-i', '--input', help='Input values', nargs='+', required=True)
args = parser.parse_args()
print args
kwargs explanation:
nargs='+' collects one or more values into a list, so you can iterate over them using something like: for i in args.input.
required makes this argument mandatory, so you must pass at least one element.
By using the argparse module you also get the -h option, which describes your parameters. So try using:
$ python P3.py -h
usage: a.py [-h] -i INPUT [INPUT ...]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT [INPUT ...], --input INPUT [INPUT ...]
                        Input values
$ python P3.py -i file1 file2 filen
Namespace(input=['file1', 'file2', 'filen'])
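Putting this together with the -e option from the question might look like the sketch below. It is written for Python 3, the long option names --energy and --input are my own choice, and the explicit list passed to parse_args stands in for sys.argv[1:].

```python
import argparse

parser = argparse.ArgumentParser(prog="F3.py")
parser.add_argument("-e", "--energy", type=float, required=True,
                    help="energy cutoff")
parser.add_argument("-i", "--input", nargs="+", required=True,
                    help="one or more input files")

# An explicit list is used here instead of the real command line.
args = parser.parse_args(["-e", "20", "-i", "file1", "file2"])
print(args.energy)   # 20.0
print(args.input)    # ['file1', 'file2']
```

Because the shell expands *.pdb before the script starts, -i *.pdb arrives as an ordinary list of file names and needs no special handling.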
If you insist on using getopt, you will have to join the multiple arguments with a delimiter other than a space, such as a comma, and then modify your code accordingly, like this:
import getopt
import sys
try:
    opts,args=getopt.getopt(sys.argv[1:],"he:i:")
    for opt,arg in opts:
        if opt=="-h":
            print 'F3.py -e <Energy cutoff> -i <inputfiles>'
            sys.exit()
        elif opt == "-e":
            E_Cut=float(arg)
            print 'minimum energy=',E_Cut
        elif opt == "-i":
            files = arg.split(",")
            print files
    #funtction(files)
except getopt.GetoptError:
    print 'F3.py -e <Energy cutoff> -i <inputfiles>'
    sys.exit(2)
When you run this you will get the following output:
>main.py -e 20 -i file1,file2
minimum energy= 20.0
['file1', 'file2']
NOTE
I have commented out your function call and unwrapped your code from the main function; you can restore both in your own code and it will not change the result.
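Another getopt-only possibility, if you are free to change the command line, is to drop -i and pass the files as plain trailing arguments; getopt returns everything after the options as its second result. A minimal Python 3 sketch, where the parse helper is hypothetical and not part of the original code:

```python
import getopt

def parse(argv):
    """Return (energy cutoff, list of files) from an argument list."""
    # Only -h and -e are options; whatever follows them is left in `files`.
    opts, files = getopt.getopt(argv, "he:")
    e_cut = None
    for opt, arg in opts:
        if opt == "-e":
            e_cut = float(arg)
    return e_cut, files

e_cut, files = parse(["-e", "20", "file1", "file2"])
print(e_cut, files)
```

The call would then be F3.py -e <Energy cutoff> file1 file2, and a shell glob such as *.pdb works the same way.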
This is my first attempt at using command-line args other than the quick and dirty sys.argv[] and writing a more 'proper' Python script. For some reason that I cannot figure out, it seems to be objecting to how I'm trying to use the input file from the command line.
The script is meant to take an input file and some numerical indices, and then slice out a subset region of the file. However, I keep getting errors saying that the variable I've assigned the input file to is not defined:
joehealey#7c-d1-c3-89-86-2c:~/Documents/Warwick/PhD/Scripts$ python slice_genbank.py --input PAU_06042014.gbk -o test.gbk -s 3907329 -e 3934427
Traceback (most recent call last):
File "slice_genbank.py", line 70, in <module>
sub_record = record[start:end]
NameError: name 'record' is not defined
Here's the code. Where am I going wrong? (I'm sure it's simple):
#!/usr/bin/python
# This script is designed to take a genbank file and 'slice out'/'subset'
# regions (genes/operons etc.) and produce a separate file.
# Based upon the tutorial at http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc44
# Set up and handle arguments:
from Bio import SeqIO
import getopt

def main(argv):
    record = ''
    start = ''
    end = ''
    try:
        opts, args = getopt.getopt(argv, 'hi:o:s:e:', [
            'help',
            'input=',
            'outfile=',
            'start=',
            'end='
            ]
        )
        if not opts:
            print "No options supplied. Aborting."
            usage()
            sys.exit(2)
    except getopt.GetoptError:
        print "Some issue with commandline args.\n"
        usage()
        sys.exit(2)
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage()
            sys.exit(2)
        elif opt in ("-i", "--input"):
            filename = arg
            record = SeqIO.read(arg, "genbank")
        elif opt in ("-o", "--outfile"):
            outfile = arg
        elif opt in ("-s", "--start"):
            start = arg
        elif opt in ("-e", "--end"):
            end = arg
    print("Slicing " + filename + " from " + str(start) + " to " + str(end))

def usage():
    print(
"""
This script 'slices' entries such as genes or operons out of a genbank,
subsetting them as their own file.
Usage:
python slice_genbank.py -h|--help -i|--input <genbank> -o|--output <genbank> -s|--start <int> -e|--end <int>"
Options:
    -h|--help     Displays this usage message. No options will also do this.
    -i|--input    The genbank file you which to subset a record from.
    -o|--outfile  The file name you wish to give to the new sliced genbank.
    -s|--start    An integer base index to slice the record from.
    -e|--end      An integer base index to slice the record to.
"""
    )

#Do the slicing
sub_record = record[start:end]
SeqIO.write(sub_record, outfile, "genbank")
if __name__ == "__main__":
    main(sys.argv[1:])
It's also possible there's an issue with the SeqIO.write syntax, but I haven't got as far as that yet.
EDIT:
Also forgot to mention that when I use record = SeqIO.read("file.gbk", "genbank") and write the file name directly into the script, it works correctly.
As said in the comments, your variable record is only defined inside the function main() (the same is true for start and end), so it is not visible to the rest of the program.
You can either return the values like this:
def main(argv):
    ...
    ...
    return record, start, end
Your call to main() can then look like this:
record, start, end = main(sys.argv[1:])
Note that the slicing code must then run after this call, not before it.
Alternatively, you can move the slicing code into the main function (as you did).
(Another way is to define the variables in the main program and then use the global keyword in your function; this is, however, not recommended.)
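The return-value approach could be sketched roughly as follows. To keep it runnable without Biopython, a plain string stands in for the record that SeqIO.read would return, and the parse_args helper and the file name are hypothetical. Since getopt hands every value back as a string, the sketch also converts start and end to int before slicing.

```python
import getopt

def parse_args(argv):
    """Parse the options and return them to the caller."""
    opts, _ = getopt.getopt(argv, "i:s:e:", ["input=", "start=", "end="])
    filename, start, end = None, None, None
    for opt, arg in opts:
        if opt in ("-i", "--input"):
            filename = arg
        elif opt in ("-s", "--start"):
            start = int(arg)  # getopt returns strings; slices need integers
        elif opt in ("-e", "--end"):
            end = int(arg)
    return filename, start, end

# The names are bound at module level only after the call returns.
filename, start, end = parse_args(["--input", "example.gbk", "-s", "2", "-e", "5"])
record = "ABCDEFGH"  # stand-in for SeqIO.read(filename, "genbank")
sub_record = record[start:end]
print(sub_record)
```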
I modified the sample code given here:
sample code for getopt
as follows, but it does not work. I am not sure what I am missing. I added a "-j" option to the existing code. Eventually, I want to add as many command-line options as I need.
When I give the input below, it does not print any of the values.
./pyopts.py -i dfdf -j qwqwqw -o ddfdf
Input file is "
J file is "
Output file is "
Can you please let me know what's wrong here?
#!/usr/bin/python
import sys, getopt

def usage():
    print 'test.py -i <inputfile> -j <jfile> -o <outputfile>'

def main(argv):
    inputfile = ''
    jfile = ''
    outputfile = ''
    try:
        opts, args = getopt.getopt(argv,"hij:o:",["ifile=","jfile=","ofile="])
    except getopt.GetoptError:
        usage()
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            usage()
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-j", "--jfile"):
            jfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
    print 'Input file is "', inputfile
    print 'J file is "', jfile
    print 'Output file is "', outputfile

if __name__ == "__main__":
    main(sys.argv[1:])
Your error is omitting a colon following the i option. As stated by the link you supplied:
options that require an argument should be followed by a colon (:).
Therefore, the corrected version of your program should contain the following:
try:
    opts, args = getopt.getopt(argv,"hi:j:o:",["ifile=","jfile=","ofile="])
except getopt.GetoptError:
    usage()
    sys.exit(2)
Executing it with the specified arguments produces the expected output:
~/tmp/so$ ./pyopts.py -i dfdf -j qwqwqw -o ddfdf
Input file is " dfdf
J file is " qwqwqw
Output file is " ddfdf
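To see why the missing colon leaves all three values empty, it may help to compare what getopt returns for the two option strings. A small Python 3 sketch, with the argument list hard-coded and illustrative variable names:

```python
import getopt

argv = ["-i", "dfdf", "-j", "qwqwqw", "-o", "ddfdf"]

# "hij:o:" declares -i as a flag without a value, so "dfdf" is treated as
# the first positional argument and option parsing stops there.
opts_bad, rest_bad = getopt.getopt(argv, "hij:o:")
print(opts_bad)   # [('-i', '')]
print(rest_bad)   # ['dfdf', '-j', 'qwqwqw', '-o', 'ddfdf']

# With "hi:j:o:" every option consumes its value.
opts_ok, rest_ok = getopt.getopt(argv, "hi:j:o:")
print(opts_ok)    # [('-i', 'dfdf'), ('-j', 'qwqwqw'), ('-o', 'ddfdf')]
print(rest_ok)    # []
```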
However, as a comment to your question specifies, you should use argparse rather than getopt:
Note: The getopt module is a parser for command line options whose API is designed to be familiar to users of the C getopt() function. Users who are unfamiliar with the C getopt() function or who would like to write less code and get better help and error messages should consider using the argparse module instead.
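For comparison, the same three options in argparse might look like this. It is a Python 3 sketch rather than a drop-in replacement: the help strings are invented, and the explicit list stands in for sys.argv[1:].

```python
import argparse

parser = argparse.ArgumentParser(prog="pyopts.py")
parser.add_argument("-i", "--ifile", default="", help="input file")
parser.add_argument("-j", "--jfile", default="", help="J file")
parser.add_argument("-o", "--ofile", default="", help="output file")

args = parser.parse_args(["-i", "dfdf", "-j", "qwqwqw", "-o", "ddfdf"])
print('Input file is "', args.ifile)
print('J file is "', args.jfile)
print('Output file is "', args.ofile)
```

Adding another option is then one more add_argument call, and -h is generated automatically.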
"""
Saves a dir listing in a file
Usage: python listfiles.py -d dir -f filename [flags]
Arguments:
    -d, --dir       dir; ls of which will be saved in a file
    -f, --file      filename (if existing will be overwritten)
Flags:
    -h, --help      show this help
    -v, --verbose   be verbose
"""
...
def usage():
    print __doc__

def main(args):
    verbose = False
    srcdir = filename = None
    try:
        opts, args = getopt.getopt(args,
                                   'hvd:f:', ['help', 'verbose', 'dir=', 'file='])
    except getopt.GetoptError:
        usage()
        sys.exit(2)
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            usage()
            sys.exit(0)
        if opt in ('-v', '--verbose'):
            verbose = True
        elif opt in ('-d', '--dir'):
            srcdir = arg
        elif opt in ('-f', '--file'):
            filename = arg
    if srcdir and filename:
        fsock = open(filename, 'w')
        write_dirlist_tosock(srcdir, fsock, verbose)
        fsock.close()
    else:
        usage()
        sys.exit(1)

if __name__ == '__main__':
    main(sys.argv[1:])
I am not sure whether it is Pythonic to use getopt() to also handle mandatory arguments. I would appreciate some suggestions.
The getopt module is aimed at users who are already familiar with the getopt() function in C; Python's standard argument-handling module is argparse.
"Mandatory options" is a contradiction in terms, and is not generally well supported by the various option-parsing libraries. You should consider making mandatory arguments positional arguments rather than options; this agrees much better with common practice.