I'm having two problems with command-line arguments in my program. First, I want to print an error if no arguments are passed to the program. Second, I currently have to pass -n (which stands for 'no argument') to load the files into the program, and I would like to drop that flag so the program runs as python3 program.py file file2 file3 instead of python3 program.py -n file file2 file3. I have commented out my attempt to exit when the only argument is the program file itself (sys.argv[0]).
def main():
    script = sys.argv[0]
    action = sys.argv[1]
    noargfile = sys.argv[1:]
    filenames = sys.argv[2:]
    OutContent = filenames or noargfile
    #Load files with arguments -d & --default
    print("Loading Files....", sys.argv[1:])
    for arg in filenames:
        try:
            myfile = open(arg, "r")
            fileContent = myfile.readlines()
            myfile.close()
            OutContent = OutContent + fileContent
            #if len(sys.argv) == script:
                #print("No Argument")
                #sys.exit(0)
            if action == '--default':
                counter = 0 # set a counter to 0
                for line in OutContent: #for each line in load if the " 200 " is found add 1 to the counter and repeat until done.
                    if re.findall(r"\s\b200\b\s", line):
                        counter += 1
                print("\nTotal of (Status Code) 200 request:", counter)
            elif action == '-d':
                counter = 0 # set a counter to 0
                for line in OutContent: #for each line in load if the " 200 " is found add 1 to the counter and repeat until done.
                    if re.findall(r"\s\b200\b\s", line):
                        counter += 1
                print("\nTotal of (Status Code) 200 request:", counter)
            elif action == '-n':
                menu(arg, OutContent)
        except OSError:
            print("File could not be opened " + filenames)
if __name__ == "__main__":
    main()
I get an index out of range error and I don't understand why:
File "program.py", line 161, in main
action = sys.argv[1]
IndexError: list index out of range
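Add this as the first line of main(). When no arguments are passed, sys.argv contains only the script name, so sys.argv[1] raises the IndexError: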
if len(sys.argv)==1: sys.exit("error here")
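To also drop the -n flag, one option is to treat the first argument as an action only when it is a known flag, and otherwise treat every argument as a filename. The following is a minimal sketch under that assumption; `split_args` and `KNOWN_ACTIONS` are hypothetical names, and the flag set is copied from the question's code.

```python
# Flags taken from the question's code; adjust to the real program.
KNOWN_ACTIONS = {"-d", "--default", "-n"}


def split_args(argv):
    """Return (action, filenames) from an argv-style list.

    argv[0] is the script name. If the first real argument is not a
    known flag, all arguments are treated as filenames and the action
    falls back to "-n".
    """
    args = argv[1:]
    if not args:
        # Same effect as sys.exit("..."): message on stderr, exit status 1
        raise SystemExit("error: no arguments given")
    if args[0] in KNOWN_ACTIONS:
        return args[0], args[1:]
    return "-n", args


# Explicit lists are used here so the sketch runs anywhere;
# the real program would call split_args(sys.argv).
print(split_args(["program.py", "file", "file2", "file3"]))
print(split_args(["program.py", "-d", "file"]))
```

With this split, main() no longer indexes sys.argv[1] directly, so the IndexError cannot occur.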
You shouldn't be doing argument parsing yourself when there are already very good argument parsers available (there are probably a hundred on PyPI, and argparse ships with the standard library).
This little example uses the argparse module. It takes any number of files (at least one) and stores them as a list of strings in args.files.
import argparse
parser = argparse.ArgumentParser(description='Load some files')
parser.add_argument('-f','--files', dest='files', nargs='+', help='<Required> Set flag', required=True)
args = parser.parse_args()
print(args.files)
Usage:
python myscript -f test1.txt test2.txt test3.txt
Here are more details on how to add more functionality, such as help pages or required and optional fields: https://docs.python.org/2/library/argparse.html
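Since the question asks for python3 program.py file file2 file3 with no flag at all, a positional argument may fit better than -f. This is a sketch rather than a drop-in replacement; the -d/--default switch mirrors the question's flags, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Load some files")
    # Positional argument: one or more file names, no flag needed.
    parser.add_argument("files", nargs="+", help="files to load")
    # Optional switch, mirroring -d / --default from the question.
    parser.add_argument("-d", "--default", action="store_true",
                        help="count status-code 200 requests")
    return parser


# An explicit list is parsed here so the sketch runs anywhere;
# a real script would call parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(["file", "file2", "file3"])
print(args.files)    # ['file', 'file2', 'file3']
print(args.default)  # False
```

Because nargs="+" requires at least one value, argparse itself prints a usage error and exits when no file is given, which covers the "no arguments" case as well.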
I have my config encoded here:
#staticmethod
def getConfig(env):
pwd=os.getcwd()
if "win" in (platform.system().lower()):
f = open(pwd+"\config_"+env.lower()+"_data2.json")
else:
f = open(pwd+"/config_"+env.lower()+"_data2.json")
config = json.load(f)
f.close()
return config
#staticmethod
def isWin():
if "win" in (platform.system().lower()):
return True
else:
return False
I have 2 JSON files I want my script to read, but the way it's written above it only reads 1 of them. I want to know how to change it to something like:
f = open(pwd+"\config_"+env.lower()+"_data_f'{}'.json")
so it can read either dataset1.config or dataset2.config. I'm not sure if this is possible, but I want to do that so I can specify which file to run in the command line: python datascript.py -f dataset1.config or python datascript.py -f dataset2.config. Do I assign that entire open() call to a variable?
All you need to do is parse sys.argv to get the argument of the -f flag, then concatenate the strings and pass the result to open(). Try this:
import sys
### ... more code ...
#staticmethod
def getConfig(env):
    pwd = os.getcwd()
    file = None
    try:
        # The filename is the argument that follows -f
        file = sys.argv[sys.argv.index('-f')+1]
    except (ValueError, IndexError):
        # No -f flag, or -f was the last argument: use the default
        file = "data2.json"
    if "win" in (platform.system().lower()):
        f = open(pwd+"\\config_"+env.lower()+"_" + file)
    else:
        f = open(pwd+"/config_"+env.lower()+"_" + file)
    config = json.load(f)
    f.close()
    return config
sys.argv.index('-f') gives the index of -f in the command-line arguments, so the element right after it (index + 1) is the filename. The try-except statement provides a default value if no -f argument is given.
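As a variation, the path can be built with os.path.join, which removes the need for the platform check. That check is fragile anyway: platform.system() returns "Darwin" on macOS, and "darwin" contains "win". The sketch below assumes the same config_<env>_<file> naming as above; `config_path` and its parameters are hypothetical names.

```python
import os


def config_path(env, argv, default="data2.json"):
    """Build the config file path from an argv-style list."""
    try:
        suffix = argv[argv.index("-f") + 1]
    except (ValueError, IndexError):
        # No -f flag, or nothing after it: fall back to the default.
        suffix = default
    # os.path.join picks the right separator for the current platform.
    return os.path.join(os.getcwd(), "config_" + env.lower() + "_" + suffix)


# Explicit lists stand in for sys.argv so the sketch runs anywhere.
print(os.path.basename(config_path("Prod", ["datascript.py", "-f", "data1.json"])))
# config_prod_data1.json
print(os.path.basename(config_path("Prod", ["datascript.py"])))
# config_prod_data2.json
```

The returned path can then be passed to open() inside a with block, so the file is closed even if json.load raises.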
I have certain data in a json file (say, example.json),
example.json
data = {
'name' : 'Williams',
'working': False,
'college': ['NYU','SU','OU'],
'NYU' : {
'student' : True,
'professor': False,
'years' : {
'fresher' : '1',
'sophomore': '2',
'final' : '3'
}
}
}
I wish to write code that accepts arguments on the command line. Suppose the script is saved in a file 'script.py'. Then, in the terminal:
If I enter $ python3 script.py --get name --get NYU.student, it outputs
name=Williams
NYU.student=True
If I enter $ python3 script.py --set name=Tom --set NYU.student=False, it updates the name and NYU.student keys in the dictionary to Tom and False and outputs name=Tom and NYU.student=False on the command line.
I have tried the following code for the python script (i.e. script.py)
script.py
import json
import pprint
import argparse
if __name__== "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--get", help="first command")
parser.add_argument("--set", help="second command")
args=parser.parse_args()
with open('example.json','r') as read_file:
data=json.load(read_file)
if args.set == None:
key = ' '.join(args.get[:])
path = key.split('.')
now = data
for k in path:
if k in now:
now = now[k]
else:
print('Error: Invalid Key')
print(now)
elif args.get == Null:
key, value = ' '.join(args.set[:]).split('=')
path = key.split('.')
now = data
for k in path[:-1]:
if k in now:
now = now[k]
else:
print('Error: Invalid Key')
now[path[-1]] = value
with open('example.json','w') as write_file: #To write the updated data back to the same file
json.dump(data,write_file,indent=2)
However, my script is not working as I expect it to. Kindly help me with the script.
Your code has the following issues:
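When joining the argument values (the two ' '.join(...) calls), you use a space as the separator. Because args.get and args.set are plain strings, this inserts a space between every character, so the key is never found and you get 'Error: Invalid Key'. Removing the space solves the issue.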
key = ''.join(arg[:])
You defined the arguments so that each accepts only one value. Even if you pass --get or --set several times, the script keeps only one value (the last). Adding action="append" to both add_argument calls solves the issue.
parser.add_argument("--get", help="first command", action="append")
parser.add_argument("--set", help="second command", action="append")
Full code:
import json
import pprint
import argparse
if __name__== "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--get", help="first command", action="append")
    parser.add_argument("--set", help="second command", action="append")
    args=parser.parse_args()
    try:
        with open('example.json','r') as read_file:
            data=json.load(read_file)
    except IOError:
        print("ERROR: File not found")
        exit()
    if args.set == None:
        for arg in args.get:
            key = ''.join(arg[:])
            path = key.split('.')
            now = data
            for k in path:
                if k in now:
                    now = now[k]
                else:
                    print('Error: Invalid Key')
            print(f"{arg} = {now}")
    elif args.get == None:
        for arg in args.set:
            key, value = ''.join(arg[:]).split('=')
            path = key.split('.')
            now = data
            for k in path[:-1]:
                if k in now:
                    now = now[k]
                else:
                    print('Error: Invalid Key')
            print(f"{arg}")
            now[path[-1]] = value
        with open('example.json','w') as write_file: #To write the updated data back to the same file
            json.dump(data,write_file,indent=2)
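Two details of the full code may be worth tightening: after printing 'Error: Invalid Key' it carries on with the lookup, and --set stores every value as a string, so False is written back as "False". The sketch below factors the dotted-key walk into two helpers. `get_path`, `set_path` and `parse_value` are hypothetical names, and treating the literals True and False as booleans is an assumption about the desired behaviour.

```python
def get_path(data, dotted):
    """Follow a dotted key such as 'NYU.student' through nested dicts."""
    node = data
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(dotted)
        node = node[key]
    return node


def parse_value(text):
    """Turn the literals True/False into booleans; keep anything else as a string."""
    if text in ("True", "False"):
        return text == "True"
    return text


def set_path(data, dotted, raw_value):
    """Set a dotted key; the parent dictionaries must already exist."""
    *parents, last = dotted.split(".")
    node = get_path(data, ".".join(parents)) if parents else data
    if not isinstance(node, dict):
        raise KeyError(dotted)
    node[last] = parse_value(raw_value)


data = {"name": "Williams", "NYU": {"student": True}}
set_path(data, "name", "Tom")
set_path(data, "NYU.student", "False")
print(get_path(data, "name"))         # Tom
print(get_path(data, "NYU.student"))  # False
```

In the script, each --get value would go through get_path and each --set value would be split once on '=' before calling set_path; catching KeyError in one place replaces the repeated 'Error: Invalid Key' branches.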
Here is the get part of the question; I hope you can continue with the set part of your assignment. Good luck.
python test.py --get name NYU.student
import json
import pprint
import argparse
def match(data: dict, filter: str):
current = data
for f in filter.split("."):
if f not in current:
return False
current = current[f]
return current == True
if __name__== "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--get", nargs="*", help="first command")
args = parser.parse_args()
with open('example.json','r') as f:
data = json.loads(f.read())
if args.get is not None and len(args.get) == 2:
attr_name = args.get[0]
if match(data, args.get[1]):
print("{}={}".format(attr_name, data[attr_name]))
To pass arguments on the command line, use the sys module in Python 3. sys.argv holds the command-line arguments as a list of strings. The first element is always the name of the script, and the subsequent elements are arg1, arg2, and so on.
Hope the following example helps to understand the usage of the sys module.
Example Command :
python filename.py 1 thisisargument2 4
The corresponding code
import sys
# Note that all the command line args will be treated as strings
# Thus type casting will be needed for proper usage
print(sys.argv[0])
print(sys.argv[1])
print(sys.argv[2])
print(sys.argv[3])
Corresponding Output
filename.py
1
thisisargument2
4
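Since every element of sys.argv is a string, numeric arguments have to be converted before doing arithmetic with them. A small sketch of that conversion, using the same arguments as the example command; `parse_numbers` is a hypothetical helper, and an explicit list stands in for sys.argv so the snippet runs without command-line input.

```python
def parse_numbers(argv):
    """Convert argv[1] and argv[3] to int; argv[2] stays a string."""
    first = int(argv[1])   # int() raises ValueError for non-numeric text
    label = argv[2]
    last = int(argv[3])
    return first, label, last


# Same arguments as the example command, passed as an explicit list.
first, label, last = parse_numbers(["filename.py", "1", "thisisargument2", "4"])
print(first + last)  # 5
print(label)         # thisisargument2
```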
Also please make a thorough google search before posting a question on stackoverflow.
I am trying to implement a Python MapReduce job over multiple files in a directory, so that it takes a folder and a string as arguments and lists each file with the frequency of that string within it. The output should look like this:
Filename Output
-------- --------------
x.txt 8
y.txt 12
I have tried to implement it but when I run it with command below:
cat /home/habil/Downloads/hadoop_test/*.txt | python mapper.py "AA" | python reducer.py
It gives me "AA 479", which is the total frequency across all 5 files.
This is my mapper.py
#!/usr/bin/env python
import sys
import textwrap
from os import listdir
from os.path import isfile, join
#Argument of the path
#folderPath = sys.argv[2]
#onlyfiles = [f for f in listdir(sys.argv[2]) if isfile(join(sys.argv[2], f))]
# Get the string sequence from the user
searchWord = sys.argv[1]
# Length of the word
searchWordLength = len(sys.argv[1])
# helper Function
def locations_of_substring(string, substring):
"""Return a list of locations of a substring."""
substring_length = len(substring)
def recurse(locations_found, start):
location = string.find(substring, start)
if location != -1:
return recurse(locations_found + [location], location+substring_length)
else:
return locations_found
return recurse([], 0)
#--- get all lines from stdin ---
for line in sys.stdin:
#--- remove leading and trailing whitespace---
line = line.strip()
temp = locations_of_substring(line, searchWord)
if len(temp) != 0:
for count in temp:
print '%s\t%s' % (line[count:count+searchWordLength], "1")
And below is my reducer:
#!/usr/bin/env python
import sys
# maps words to their counts
word2count = {}
# input comes from STDIN
for line in sys.stdin:
# remove leading and trailing whitespace
line = line.strip()
# parse the input we got from mapper.py
word, count = line.split('\t', 1)
# convert count (currently a string) to int
try:
count = int(count)
except ValueError:
continue
try:
word2count[word] = word2count[word]+count
except:
word2count[word] = count
# write the tuples to stdout
# Note: they are unsorted
for word in word2count.keys():
print '%s\t%s'% ( word, word2count[word])
How can I get the desired result, so that it runs once for each file in the directory and prints separate results? Any help or hint is appreciated. Thanks in advance.
I am trying to create a python script where the user can type in their FASTA file and that file will then be parsed using Biopython. I am struggling to get this to work. The script I have thus far is this:
#!/usr/bin/python3
file_name = input("Insert full file name including the fasta extension: ")
with open(file_name, "r") as inf:
seq = inf.read()
from Bio.SeqIO.FastaIO import SimpleFastaParser
count = 0
total_len = 0
with open(inf) as in_file:
for title, seq in SimpleFastaParser(in_file):
count += 1
total_len += len(seq)
print("%i records with total sequence length %i" % (count, total_len))
I would like the user to be prompted to type in their file name and extension, and that file should then be parsed with Biopython so that the output is printed. I also want to be able to send the printed output to a log file. Any help would be appreciated.
The purpose of the script is to take a fasta file, parse it and trim primers. I know there is an easy method to do this using Biopython entirely, but as per the instructions Biopython can only be used to parse, not trim. Any insight into this would be appreciated as well.
Firstly, you open the fasta file in two places:
The first one stores the contents in seq.
The second one tries to open inf, but inf is the already-closed file object from the first with block, not a file path.
You may want to include a check to make sure a valid file path was used.
Also, this is a good case for using argparse:
#!/usr/bin/python3
import argparse
import os
import sys
from Bio import SeqIO

def main(infile):
    # check that the file exists
    if not os.path.isfile(infile):
        print("file not found")
        sys.exit()
    count = 0
    total_len = 0
    # SeqIO.parse accepts a file name and yields one record per sequence
    for seq_record in SeqIO.parse(infile, "fasta"):
        count += 1
        total_len += len(seq_record.seq)
    print("%i records with total sequence length %i" % (count, total_len))

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='some script to do something with fasta files',
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('-in', '--infile', type=str, default=None, required=True,
                        help='The path and file name of the fasta file')
    args = parser.parse_args()
    infile = args.infile
    main(infile)
If you need to use input:
#!/usr/bin/python3
import os
import sys
from Bio import SeqIO

infile = input("Insert full file name including the fasta extension: ")
# remove any white space
infile = infile.strip()
# check that the file exists
if not os.path.isfile(infile):
    print("file not found")
    sys.exit()
count = 0
total_len = 0
# SeqIO.parse accepts a file name and yields one record per sequence
for seq_record in SeqIO.parse(infile, "fasta"):
    count += 1
    total_len += len(seq_record.seq)
print("%i records with total sequence length %i" % (count, total_len))
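The question also asks how to send the printed output to a log file. One possible approach is the standard logging module with two handlers, one for the screen and one for a file. This is only a sketch: the logger name, the log file location and the placeholder numbers are assumptions, not part of the original script.

```python
import logging
import os
import tempfile


def make_logger(log_path):
    """Return a logger that writes each message to the screen and to log_path."""
    logger = logging.getLogger("fasta_summary")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()                      # avoid duplicate handlers on re-run
    logger.addHandler(logging.StreamHandler())   # screen (stderr by default)
    logger.addHandler(logging.FileHandler(log_path))  # appends to the file
    return logger


# The temp directory is used only so the sketch runs anywhere.
log_path = os.path.join(tempfile.gettempdir(), "fasta_summary.log")
logger = make_logger(log_path)

count, total_len = 3, 120  # placeholder numbers, not real results
logger.info("%i records with total sequence length %i", count, total_len)
```

In the scripts above, the final print(...) call would be replaced by the logger.info(...) call with the real count and total_len.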
Okay, so I wanted to take input dynamically either from STDIN or from a file, depending on the options given on the command line. In the end I came up with this code:
# Process command-line options
# e.g., python3 trowley_FASTAToTab -i INFILE -o OUTFILE -s
try:
opts, args = getopt.getopt(sys.argv[1:], 'i:o:s')
except getopt.GetoptError as err:
# Redirect STDERR to STDOUT (insures screen display)
sys.stdout = sys.stderr
# Print help information
print(str(err))
# Print usage information
usage()
# Exit
sys.exit(2)
# Define our variables
inFile = "" # File to be read, if there is one
outFile = "" # Outfile to write to, if there is one
keepSeq = False # Whether or not to keep the sequence
header = "" # The header line we will mess with
sequence = "" # The sequence were messing with
# Parse command line options
for (opt, arg) in opts:
if opt == "-i":
inFile = str(arg)
elif opt == "-o":
outFile = str(arg)
elif opt == "-s":
keepSeq = True
# Lets open our outfile or put the variable to stdout
if not outFile:
outFile = sys.stdout
else:
outFile = open(outFile, 'w')
for line in sys.stdin:
# Do tons of stuff here
print(tmpHeader, file=outFile)
# Maybe some cleanup here
This has some odd behaviour. If I specify an infile but no outfile, it reads that file, does the processing, and then outputs the result to the screen (which is what it's supposed to do).
If I leave off both the infile and the outfile (so input comes from stdin), then once I press Ctrl-D (end of input), it does nothing and exits the script. The same thing happens when I take input from stdin and write to a file: it just won't do anything.
I ended up fixing it by using:
while (1):
    line = inFile.readline()
    if not line:
        break
    # Do all my stuff here
My question is: why won't the for line in inFile loop work? I get no errors; nothing happens. Is there some unspecified rule I'm breaking?
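For reference, a common way to structure this is to pick the input stream once and then iterate over it the same way in both cases. The sketch below does not diagnose the behaviour described above; it only shows the pattern. `open_input` and `count_headers` are hypothetical names, and counting FASTA header lines is a stand-in for the real processing.

```python
import io
import sys


def open_input(path):
    """Return an open text stream: the named file, or stdin if path is empty."""
    return open(path, "r") if path else sys.stdin


def count_headers(stream):
    """Count FASTA header lines (those starting with '>') in any text stream."""
    return sum(1 for line in stream if line.startswith(">"))


# io.StringIO stands in for stdin or a file so the sketch runs without input.
sample = io.StringIO(">seq1\nACGT\n>seq2\nGGCC\n")
print(count_headers(sample))  # 2
```

In the script above, inFile = open_input(inFile) would be placed after the option parsing, and the loop would iterate over inFile rather than sys.stdin directly.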