I need to create a program called extractGenes.py
The command line needs to take 2 OR 3 parameters:
-s is an optional parameter, or switch, indicating that the user wants the spliced gene sequence (introns removed). The user does not have to provide this (meaning he wants the entire gene sequence), but if he does provide it, then it must be the first parameter
input file (with the genes)
output file (which the program will create to store the FASTA output)
The file contains lines like this:
NM_001003443 chr11 + 5925152 592608098 2 5925152,5925652, 5925404,5926898,
However, I am not sure how to include the -s optional parameter into the starting function.
So I started with:
getGenes(-s, input, output):
    fp = open(input, 'r')
    wp = open(output, "w")
but am unsure as to how to include the -s.
This case is simple enough to use sys.argv directly:
import sys
spliced = False
if '-s' in sys.argv:
    spliced = True
    sys.argv.remove('-s')
infile, outfile = sys.argv[1:]
Alternatively, you can use more powerful tools like argparse and optparse to generate a command-line parser:
import argparse
parser = argparse.ArgumentParser(description='Tool for extracting genes')
parser.add_argument('infile', help='source file with the genes')
parser.add_argument('outfile', help='output file in FASTA format')
parser.add_argument('-s', '--spliced', action='store_true', help='remove introns')
if __name__ == '__main__':
    # Parse a sample command line here; call parser.parse_args() with no arguments to read sys.argv.
    result = parser.parse_args('-s myin myout'.split())
    print(vars(result))
Argparse is a Python library that will take care of optional parameters for you. http://docs.python.org/library/argparse.html#module-argparse
Try something like this:
def getGenes(input, output, s=False):
    if s:
        ...
    else:
        ...
If you call getGenes() with 2 parameters, s will be False:
getGenes(input, output)
If you call getGenes() with 3 parameters, s will be the 3rd parameter, so calling it with any truthy value will run the if branch instead of the else branch.
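To connect the two suggestions above, here is a minimal sketch that parses the command line with argparse and passes the result to a function with a default argument. The names get_genes and main are only for illustration, and the function body is a stand-in, since the actual extraction logic is not shown in this thread.

```python
import argparse


def get_genes(infile, outfile, spliced=False):
    # Stand-in body (assumption): the real gene-extraction logic would go here.
    return (infile, outfile, spliced)


def main(argv=None):
    parser = argparse.ArgumentParser(description='Tool for extracting genes')
    parser.add_argument('-s', '--spliced', action='store_true', help='remove introns')
    parser.add_argument('infile', help='source file with the genes')
    parser.add_argument('outfile', help='output file in FASTA format')
    # argv=None makes argparse read sys.argv[1:]; a list can be passed for testing.
    args = parser.parse_args(argv)
    return get_genes(args.infile, args.outfile, spliced=args.spliced)
```

Calling main() with no arguments from the bottom of extractGenes.py would then handle both "extractGenes.py in out" and "extractGenes.py -s in out".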
I'm trying to define an argparse argument that accepts several file names to be read.
In this example, I'm just trying to print each of the file objects to make sure it's working correctly, but I get the error:
error: unrecognized arguments: f2.txt f3.txt
How can I get it to recognize all of them?
My command in the terminal to run the program and read multiple files:
python program.py f1.txt f2.txt f3.txt
Python script:
import argparse
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('file', nargs='?', type=file)
    args = parser.parse_args()
    for f in args.file:
        print f
if __name__ == '__main__':
    main()
I used nargs='?' because I want to accept any number of files. If I change add_argument to:
parser.add_argument('file', nargs=3)
then I can print them as strings, but I can't get it to work with '?'
If your goal is to read one or more readable files, you can try this:
parser.add_argument('file', type=argparse.FileType('r'), nargs='+')
nargs='+' gathers all command line arguments into a list. There must also be one or more arguments or an error message will be generated.
type=argparse.FileType('r') tries to open each argument as a file for reading. It will generate an error message if argparse cannot open the file. You can use this for checking whether the argument is a valid and readable file.
Alternatively, if your goal is to read zero or more readable files, you can simply replace nargs='+' with nargs='*'. This will give you an empty list if no command line arguments are supplied. Maybe you might want to open stdin if you're not given any files - if so just add default=[sys.stdin] as a parameter to add_argument.
And then to process the files in the list:
args = parser.parse_args()
for f in args.file:
    for line in f:
        # process file...
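Putting the pieces above together, here is a minimal sketch of the zero-or-more variant with the stdin fallback. The helper names build_parser and count_lines are made up for this example; counting lines just stands in for whatever processing you need.

```python
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser()
    # nargs='*' accepts zero or more files; fall back to stdin when none are given.
    parser.add_argument('file', nargs='*', type=argparse.FileType('r'),
                        default=[sys.stdin])
    return parser


def count_lines(files):
    """Return the total number of lines across all open file objects."""
    return sum(1 for f in files for _ in f)
```

In a script you would call build_parser().parse_args() and pass args.file to count_lines (or to your own processing loop).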
More about nargs:
https://docs.python.org/2/library/argparse.html#nargs
More about type: https://docs.python.org/2/library/argparse.html#type
I just had to use nargs='*', which accepts any number of arguments (use nargs='+' if there must be at least one):
parser.add_argument('file',nargs='*')
I am trying to write a python script which accepts optional input parameters plus an input file:
./script --lines 1 file.txt
should take 1 for the number of lines (--lines) and then take file.txt as an input file. However, getopt does not even see "file.txt" since it does not have a parameter name in front of it.
How can I get the filename? I already considered using sys.argv[-1], but this means when I run:
./script --lines 1
then 1 will be taken as the input filename. My script will then throw an error (if no file named 1 exists):
error: file '1' no found
This works, but is not a great solution. Is there a way around this?
Definitely use argparse. It's included in Python 2.7+ and can easily be installed for older versions:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--lines', type=int, default=0, help='number of lines')
parser.add_argument('file', help='name of file')
namespace = parser.parse_args()
print(namespace.lines)
print(namespace.file)
In a call to getopt.getopt:
opts, args = getopt.getopt(...)
the second element is the list of arguments left over after the options have been parsed. In your example, it will be a list containing the single item 'file.txt'.
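For illustration, a small sketch of that getopt call for the --lines example. The function name parse and the default of 0 lines are assumptions for this example, not something from the question.

```python
import getopt


def parse(argv):
    """Return (lines, filenames) from an argument list such as sys.argv[1:]."""
    # No short options; one long option, --lines, that requires a value.
    opts, args = getopt.getopt(argv, '', ['lines='])
    lines = 0
    for name, value in opts:
        if name == '--lines':
            lines = int(value)
    # args holds whatever was left over after the options, e.g. ['file.txt'].
    return lines, args
```

With this, running the script as "./script --lines 1" leaves the filename list empty, so the script can report a missing file instead of treating 1 as the filename.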
I am trying to test some simple code that reads a file line by line with PyCharm.
import sys
for line in sys.stdin:
    name, _ = line.strip().split("\t")
    print name
I have the file I want to input in the same directory: lib.txt
How can I debug my code in PyCharm with the input file?
You can work around this issue if you use the fileinput module rather than trying to read stdin directly.
With fileinput, if the script receives one or more filenames as arguments, it will read from them in order. In your case, replace your code above with:
import fileinput
for line in fileinput.input():
    name, _ = line.strip().split("\t")
    print name
The great thing about fileinput is that it defaults to stdin if no arguments are supplied (or if the argument '-' is supplied).
Now you can create a run configuration and supply the filename of the file you want to use as stdin as the sole argument to your script.
Read more about fileinput in the Python standard library documentation.
I have been trying to find a way to read a file as stdin in PyCharm.
However, most people, including JetBrains, say that there is no way to do it and no support for it, because redirection is a command-line feature unrelated to PyCharm itself.
* https://intellij-support.jetbrains.com/hc/en-us/community/posts/206588305-How-to-redirect-standard-input-output-inside-PyCharm-
Still, reading a file as stdin is essential for me, since it makes it easier to feed inputs when solving programming problems from HackerRank or acmicpc.
I found a simple way: reassign sys.stdin to an open file, and input() will read from that file!
import sys
sys.stdin = open('input.in', 'r')
sys.stdout = open('output.out', 'w')
print(input())
print(input())
input.in example:
hello world
This is not the world ever I have known
output.out example:
hello world
This is not the world ever I have known
You need to create a custom run configuration and then add your file as an argument in the "Script Parameters" box. See PyCharm's online help for a step-by-step guide.
However, even if you do that (as you have discovered), your program won't work since you aren't parsing the command line arguments.
You need to instead use argparse:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("filename", help="The filename to be processed")
args = parser.parse_args()
if args.filename:
    # Open the file named on the command line (args.filename, not a bare filename variable)
    with open(args.filename) as f:
        for line in f:
            name, _ = line.strip().split('\t')
            print(name)
For flexibility, you can write your python script to always read from stdin and then use command redirection to read from a file:
$ python myscript.py < file.txt
However, as far as I can tell, you cannot use redirection from PyCharm as Run Configuration does not allow it.
Alternatively, you can accept the file name as a command-line argument:
$ python myscript.py file.txt
There are several ways to deal with this. I think argparse is overkill for this situation. Alternatively, you can access command-line arguments directly with sys.argv:
import sys
filename = sys.argv[1]
with open(filename) as f:
    for line in f:
        name, _ = line.strip().split('\t')
        print(name)
For robust code, you can check that the correct number of arguments are given.
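A minimal sketch of that argument-count check, assuming the script takes exactly one filename. The helper names get_filename and main and the usage message are made up for this example.

```python
import sys


def get_filename(argv):
    """Return the single filename argument, or None if the count is wrong."""
    # argv[0] is the script name, so exactly one real argument means len(argv) == 2.
    if len(argv) != 2:
        return None
    return argv[1]


def main(argv):
    filename = get_filename(argv)
    if filename is None:
        sys.stderr.write('Usage: %s filename\n' % argv[0])
        return 1
    with open(filename) as f:
        for line in f:
            name, _ = line.strip().split('\t')
            print(name)
    return 0
```

At the bottom of the script you would then call sys.exit(main(sys.argv)).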
Here's my hack for Google Code Jam today, wish me luck. The idea is to comment out monkey() before submitting:
def monkey():
    print('Warning, monkey patching')
    global input
    input = iter(open('in.txt')).next
monkey()
T = int(input())
for caseNum in range(1, T + 1):
    N, L = list(map(int, input().split()))
    nums = list(map(int, input().split()))
Edit for Python 3:
def monkey():
    print('Warning, monkey patching')
    global input
    it = iter(open('in.txt'))
    input = lambda : next(it)
monkey()
I have this code:
#! /usr/bin/python
import sys, string
def findAll(search, fh):
    count = 0
    for line in fh:
        count += 1
        if line.find(search) != -1:
            print "%3d: %s"%(count, line.rstrip())
    return count
search = raw_input("Enter string to be found: ")
filename = raw_input("Enter filename: ")
fh = open(filename, "rU")
findAll(search, fh)
My professor recommended I rewrite this code to incorporate "improved usage".
I'm confused as to how, but she recommended that
I modify the program by commenting out the raw_input() statements, then adding statements to check if the program is invoked with fewer than 2 arguments and, if so, to print 'Usage: findstring.py string filename'. The code takes strings and locates them in a file.
I should use the filename command-line argument from sys.argv to open the file and be prepared for an input/output error (IOError), using a try-except block to encode what to do whether opening the file works or not.
If the opening fails, I print 'Error: cannot open findstring.py', where findstring.py is also considered the text file.
To be honest... I was so busy writing down her suggestions that I had no idea how to do many of the things she recommended. Can someone help improve this code? I'm confused and I don't know how to do this. My prof did say that the code would run, but I don't know how to modify it.
For improved usage, try using the argparse module. It makes taking command-line options easier.
http://docs.python.org/library/argparse.html#module-argparse
A code sample from the above link reads:
import argparse
parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')
args = parser.parse_args()
print(args.accumulate(args.integers))
Now think about how you might modify this sample for your assignment. You need to take strings (search term, filename) instead of integers.
For the try/except block, remember that the code to handle an error goes in the except portion of the block. That is, you might consider showing an error message in the except block.
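If you would rather stay close to the sys.argv approach described in the question, here is one possible sketch of the usage check and the try/except block. It is only one reading of the assignment; the names find_all and main and the exact messages are assumptions.

```python
import sys


def find_all(search, fh):
    """Print each matching line with its line number; return the lines read."""
    count = 0
    for line in fh:
        count += 1
        if search in line:
            print("%3d: %s" % (count, line.rstrip()))
    return count


def main(argv):
    # argv[0] is the script name, so two real arguments means len(argv) == 3.
    if len(argv) < 3:
        print('Usage: findstring.py string filename')
        return 1
    search, filename = argv[1], argv[2]
    try:
        fh = open(filename)
    except IOError:
        # Opening failed: report it instead of crashing with a traceback.
        print('Error: cannot open %s' % filename)
        return 1
    with fh:
        find_all(search, fh)
    return 0
```

The script would end with sys.exit(main(sys.argv)), replacing the two raw_input() calls.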
Currently, I have a script which does the following. If I have text file with the lines:
<<Name>> is at <<Location>>.
<<Name>> is feeling <<Emotion>>.
The script will take in this input file as a command line argument and prompt the user for the variables:
Name? Bob
Location? work
Emotion? frustrated
Note that name is only asked once. The script also takes in an output file as an argument and will place in the file the following.
Bob is at work.
Bob is feeling frustrated.
Now I am trying to extend the script so that I can input the variables from the command line (as if I already knew what it was going to ask). So the command would be like (in this case):
python script.py infile outfile Bob work frustrated
And it would generate the same output. Ideally, the extension should prompt the user for remaining variables if there are more remaining after those put in the command line. So if I run the script as:
python script.py infile outfile Bob work
The script would still prompt:
Emotion?
Excess variables in the command line would be ignored. I am pretty new to Python, so while this seems pretty simple, I haven't had success updating my current script with this add-on. Attached is the script:
import argparse
from os import path
import re
replacements = {}
pattern = '<<([^>]*)>>'
def user_replace(match):
    ## Pull from replacements dict or prompt
    placeholder = match.group(1)
    if placeholder in replacements:
        return replacements[placeholder]
    ## .setdefault(key, value) returns the value if present, else sets it then returns
    return replacements.setdefault(placeholder, raw_input('%s? ' % placeholder))
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('infile', type=argparse.FileType('r'))
    parser.add_argument('outfile', type=argparse.FileType('w'))
    args = parser.parse_args()
    matcher = re.compile(pattern)
    for line in args.infile:
        new_line = matcher.sub(user_replace, line)
        args.outfile.write(new_line)
    args.infile.close()
    args.outfile.close()
if __name__ == '__main__':
    main()
Edit: The input file example above is arbitrary. In actuality, the input file could have any number of variables, repeated any number of times.
OK, if you want to dynamically generate the options:
import argparse
import re
# pattern is the '<<([^>]*)>>' regex from the original script
parser = argparse.ArgumentParser()
parser.add_argument('infile', type=argparse.FileType('r'))
parser.add_argument('outfile', type=argparse.FileType('w'))
required, extra = parser.parse_known_args()
infile, outfile = required.infile, required.outfile
# Collect each placeholder once; adding the same option twice makes argparse raise an error
args = set(re.findall(pattern, infile.read()))
infile.seek(0)
parser = argparse.ArgumentParser()
for arg in args:
    parser.add_argument('--' + arg.lower())
replacements = vars(parser.parse_args(extra))
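This will give you a dictionary of all the arguments, keyed by the lowercased placeholder name. Each value will be a string, or None if that option was not given on the command line. Then you just do
def user_replace(match):
    """Pull from replacements dict or prompt"""
    placeholder = match.group(1)
    key = placeholder.lower()
    if replacements.get(key) is None:
        # Not supplied on the command line: prompt once and remember the answer
        replacements[key] = raw_input('%s? ' % placeholder)
    return replacements[key]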
Edit: I've edited the code at the top to set the arguments. This way, the arguments are optional, and you can have any number. Their names will still be name, location, and emotion in replacements.
The user can now do:
python script.py infile outfile --name Bob --emotion 'extremely frustrated'
leaving out the ones they want to be prompted for, and enclosing strings with spaces in quotes.
Edit 2: Edited the top part so it dynamically gets the args from the textfile.
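If you want the purely positional form from the question (python script.py infile outfile Bob work frustrated), here is a rough sketch of an alternative, not the approach in the answer above. It assumes the supplied values are matched to placeholders in order of first appearance; the function name fill and the ask callback are made up for this example.

```python
import re

PATTERN = re.compile('<<([^>]*)>>')


def fill(text, values, ask):
    """Replace <<Name>> placeholders in text.

    values: answers supplied up front, matched to placeholders in order
            of first appearance; any excess values are ignored.
    ask:    callable used to prompt for a placeholder with no supplied value.
    """
    pending = list(values)
    replacements = {}

    def user_replace(match):
        placeholder = match.group(1)
        if placeholder not in replacements:
            if pending:
                # Use the next value from the command line.
                replacements[placeholder] = pending.pop(0)
            else:
                # Nothing left: fall back to prompting the user.
                replacements[placeholder] = ask('%s? ' % placeholder)
        return replacements[placeholder]

    return PATTERN.sub(user_replace, text)
```

With argparse you could collect the extra values with parser.add_argument('values', nargs='*') after infile and outfile, then call fill(args.infile.read(), args.values, raw_input) (or input on Python 3) and write the result to the output file.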