python3 stdin/file object with For Loop - python

Okay, so I wanted to take input either from STDIN or from a file, depending on the options given on the command line. In the end I came up with this code:
# Process command-line options
# e.g., python3 trowley_FASTAToTab -i INFILE -o OUTFILE -s
try:
    opts, args = getopt.getopt(sys.argv[1:], 'i:o:s')
except getopt.GetoptError as err:
    # Redirect STDOUT to STDERR (ensures screen display)
    sys.stdout = sys.stderr
    # Print help information
    print(str(err))
    # Print usage information
    usage()
    # Exit
    sys.exit(2)
# Define our variables
inFile = ""      # File to be read, if there is one
outFile = ""     # Outfile to write to, if there is one
keepSeq = False  # Whether or not to keep the sequence
header = ""      # The header line we will mess with
sequence = ""    # The sequence we're messing with
# Parse command line options
for (opt, arg) in opts:
    if opt == "-i":
        inFile = str(arg)
    elif opt == "-o":
        outFile = str(arg)
    elif opt == "-s":
        keepSeq = True
# Let's open our outfile or point the variable at stdout
if not outFile:
    outFile = sys.stdout
else:
    outFile = open(outFile, 'w')
for line in sys.stdin:
    # Do tons of stuff here
    print(tmpHeader, file=outFile)
# Maybe some cleanup here
This has some odd behaviour. If I specify an infile but no outfile, it reads that file, does the processing, and prints the result to the screen (which is what it's supposed to do).
If I leave off both the infile and the outfile (so input comes from stdin), then once I press Ctrl-D (end of input) it does nothing and the script exits. The same thing happens when I take the input from stdin and write to a file: nothing is written.
I ended up fixing it by using:
while (1):
    line = inFile.readline()
    if not line:
        break
    # Do all my stuff here
My question is: why won't for line in inFile work? I get no errors; nothing happens at all. Is there some unspecified rule I'm breaking?
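One common way to get the behaviour described above is to bind a single variable to either sys.stdin or an opened file and iterate over that one variable. Note that the snippet as posted always loops over sys.stdin and never opens inFile. The following is only a minimal sketch under that reading; open_input and copy_lines are hypothetical names that do not appear in the original post.

```python
import sys


def open_input(path):
    """Return an open text stream: the named file, or sys.stdin if path is empty."""
    return open(path) if path else sys.stdin


def copy_lines(in_stream, out_stream):
    """Iterate over any file-like object line by line and echo each line.

    Returns the number of lines processed.
    """
    count = 0
    for line in in_stream:
        # Real processing would go here; this sketch just echoes the line.
        print(line.rstrip("\n"), file=out_stream)
        count += 1
    return count
```

With this shape the main script would call something like copy_lines(open_input(inFile), outFile), so the same for loop serves both the file case and the stdin case.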

Related

How to open a file passed in the command line arguments

I have my config encoded here:
@staticmethod
def getConfig(env):
    pwd=os.getcwd()
    if "win" in (platform.system().lower()):
        f = open(pwd+"\config_"+env.lower()+"_data2.json")
    else:
        f = open(pwd+"/config_"+env.lower()+"_data2.json")
    config = json.load(f)
    f.close()
    return config

@staticmethod
def isWin():
    if "win" in (platform.system().lower()):
        return True
    else:
        return False
I have 2 JSON files I want my script to read, but the way it's written above it only reads 1 of them. I want to know how to change it to something like:
f = open(pwd+"\config_"+env.lower()+"_data_f'{}'.json")
so it can read either dataset1.config or dataset2.config. I'm not sure if this is possible, but I want to do that so I can specify which file to run in the command line: python datascript.py -f dataset1.config or python datascript.py -f dataset2.config. Do I assign that entire open() call to a variable?
All you need to do is parse sys.argv to get the argument of the -f flag, then concatenate the strings and pass the result to open(). Try this:
import sys
### ... more code ...
@staticmethod
def getConfig(env):
    pwd = os.getcwd()
    try:
        # The filename is the argument that follows -f
        file = sys.argv[sys.argv.index('-f') + 1]
    except (ValueError, IndexError):
        # No -f flag, or -f given without a value: fall back to the default
        file = "data2.json"
    # os.path.join picks the right path separator on every platform
    path = os.path.join(pwd, "config_" + env.lower() + "_" + file)
    with open(path) as f:
        config = json.load(f)
    return config
sys.argv.index('-f') gives the index of -f in the command-line arguments, so the element right after it is taken as the filename. The try-except statement provides a default value if no -f argument is given.
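The same -f handling can also be written with the standard argparse module, which supplies the default and a help message for free. This is a sketch rather than the answer's method; parse_args, config_path, and the base_dir parameter are hypothetical names introduced here, and the config_<env>_<file> naming is copied from the code above.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse -f/--file; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(description="Load a config file")
    parser.add_argument("-f", "--file", default="data2.json",
                        help="config file suffix (default: data2.json)")
    return parser.parse_args(argv)


def config_path(env, suffix, base_dir=None):
    """Build <base_dir>/config_<env>_<suffix> with the platform's separator."""
    base_dir = base_dir or os.getcwd()
    return os.path.join(base_dir, "config_{}_{}".format(env.lower(), suffix))
```

getConfig could then open config_path(env, parse_args().file) and pass the file object to json.load as before.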

Check for empty argument Python

I'm having two problems with arguments in my program. First, I want to print an error if no arguments are passed to the program. Second, I currently have to pass -n (which stands for 'no argument') to load files without any other option; I want the program to run as python3 program.py file file2 file3 instead of python3 program.py -n file file2 file3. I have commented out my attempt to check whether the only argument is the program file itself ([0]) and exit in that case.
def main():
    script = sys.argv[0]
    action = sys.argv[1]
    noargfile = sys.argv[1:]
    filenames = sys.argv[2:]
    OutContent = filenames or noargfile
    #Load files with arguments -d & --default
    print("Loading Files....", sys.argv[1:])
    for arg in filenames:
        try:
            myfile = open(arg, "r")
            fileContent = myfile.readlines()
            myfile.close()
            OutContent = OutContent + fileContent
            #if len(sys.argv) == script:
                #print("No Argument")
                #sys.exit(0)
            if action == '--default':
                counter = 0 # set a counter to 0
                for line in OutContent: #for each line in load if the " 200 " is found add 1 to the counter and repeat until done.
                    if re.findall(r"\s\b200\b\s", line):
                        counter += 1
                print("\nTotal of (Status Code) 200 request:", counter)
            elif action == '-d':
                counter = 0 # set a counter to 0
                for line in OutContent: #for each line in load if the " 200 " is found add 1 to the counter and repeat until done.
                    if re.findall(r"\s\b200\b\s", line):
                        counter += 1
                print("\nTotal of (Status Code) 200 request:", counter)
            elif action == '-n':
                menu(arg, OutContent)
        except OSError:
            print("File could not be opened " + filenames)

if __name__ == "__main__":
    main()
I get an index out of range error, and I don't understand why:
  File "program.py", line 161, in main
    action = sys.argv[1]
IndexError: list index out of range
Add this as the first line of the function main, so the script exits with a message before sys.argv[1] is accessed:
if len(sys.argv)==1: sys.exit("error here")
You shouldn't be doing argument parsing yourself when there are already very good argument parsers out there (there are probably 100 on PyPI).
This little example uses the argparse module. It takes any number of files and stores them as a list of strings in args.files:
import argparse
parser = argparse.ArgumentParser(description='Load some files')
parser.add_argument('-f','--files', dest='files', nargs='+', help='<Required> Set flag', required=True)
args = parser.parse_args()
print(args.files)
Usage:
python myscript -f test1.txt test2.txt test3.txt
Here are more details on how to add more functionality, such as help pages or required and optional fields: https://docs.python.org/2/library/argparse.html
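Since the question asks for python3 program.py file file2 file3 with no flag in front of the filenames, argparse positional arguments may fit better than -f. The sketch below is one way to do it, not the answer's code; build_parser is a hypothetical name, and the -d/--default option is taken from the question.

```python
import argparse


def build_parser():
    """Parser that takes filenames positionally and -d/--default as an option."""
    parser = argparse.ArgumentParser(
        description="Count status-200 requests in log files")
    parser.add_argument("-d", "--default", action="store_true",
                        help="print the total of status code 200 requests")
    # nargs="+" requires at least one filename; with none, argparse prints
    # a usage error and exits with status 2.
    parser.add_argument("files", nargs="+", help="one or more log files to load")
    return parser
```

Calling build_parser().parse_args() in main then gives args.files as a list and args.default as a boolean, and the "no arguments" error is handled by argparse itself.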

Read a line from a file in python

I have one file named mcelog.conf and I am reading this file in my code. Contents of the file are
no-syslog = yes # (or no to disable)
logfile = /tmp/logfile
The program reads mcelog.conf and checks the no-syslog tag. If no-syslog = yes, the program has to look for the logfile tag and read its value. Can anyone tell me how to get the value /tmp/logfile?
with open('/etc/mcelog/mcelog.conf', 'r+') as fp:
    for line in fp:
        if re.search("no-syslog =", line) and re.search("= no", line):
            memoryErrors = readLogFile("/var/log/messages")
            mcelogPathFound = True
            break
        elif re.search("no-syslog =", line) and re.search("= yes", line):
            continue
        elif re.search("logfile =", line):
            memoryErrors = readLogFile(line) # Here I want to pass the value "/tmp/logfile" but currently "logfile = /tmp/logfile" is getting passed
            mcelogPathFound = True
            break
fp.close()
You can just split the line to get the value you want, then strip the trailing newline:
line.split(' = ')[1].strip()
However, you might want to look at the documentation for the configparser module.
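The split approach can be wrapped in a small helper that also drops the trailing comment shown in the sample file. This is a sketch only: parse_conf_line is a hypothetical name, and it assumes a '#' never appears inside a value.

```python
def parse_conf_line(line):
    """Split 'key = value  # comment' into (key, value).

    Returns None for blank lines, comment-only lines, and lines without '='.
    """
    # Drop everything after the first '#', then surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if not line or "=" not in line:
        return None
    key, value = line.split("=", 1)
    return key.strip(), value.strip()
```

For the line logfile = /tmp/logfile this yields the pair ("logfile", "/tmp/logfile"), so the value can be passed to readLogFile directly.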
Change the code to:
with open('/etc/mcelog/mcelog.conf', 'r+') as fp:
    for line in fp:
        if re.search("no-syslog =", line) and re.search("= no", line):
            memoryErrors = readLogFile("/var/log/messages")
            mcelogPathFound = True
            break
        elif re.search("no-syslog =", line) and re.search("= yes", line):
            continue
        elif re.search("logfile =", line):
            # Pass only the value after "=", e.g. "/tmp/logfile"
            memoryErrors = readLogFile(line.split("=")[1].strip())
            mcelogPathFound = True
            break
# No fp.close() needed: the with block closes the file
This works because you want only part of the line rather than the whole thing, so I split it on the "=" sign and then stripped the result to remove surrounding whitespace.
I liked the suggestion of the configparser module, so here is an example of that (Python 3).
For the given input, it will output reading /tmp/logfile
import configparser, itertools

# Inline comments are off by default in Python 3, so enable '#' explicitly;
# otherwise the value of no-syslog would include "# (or no to disable)"
config = configparser.ConfigParser(inline_comment_prefixes=('#',))
filename = "/tmp/mcelog.conf"

def readLogFile(filename):
    if filename:
        print("reading", filename)
    else:
        raise ValueError("unable to read file")

section = 'global'
with open(filename) as fp:
    # The file has no section header, so prepend one before parsing
    config.read_file(itertools.chain(['[{}]'.format(section)], fp), source = filename)

logfile = None
no_syslog = config[section]['no-syslog']
if no_syslog == 'yes':
    # no-syslog = yes: use the logfile entry, as the question asks
    logfile = config[section]['logfile']
elif no_syslog == 'no':
    logfile = "/var/log/messages"
if logfile:
    mcelogPathFound = True
    memoryErrors = readLogFile(logfile)

I have a list of addresses which I need to nslookup and send to CSV

I am trying to run nslookup for the addresses in my adrese.txt file and I would like to save the results as .csv. Currently my biggest problem is that it runs nslookup for only one address and not all of them. It just exits with 0, and the output file contains only one address. I am new to Python and have no idea how to fix it. Replacing .txt with .csv for the output file would be nice too.
Edit: getting the addresses from the text file works; the second part is the problem, and I don't know why.
import subprocess
f = open("adrese.txt")
next = f.read()
ip=[]
while next != "":
    ip.append(next)
    next = f.read()
file_ = open('nslookup.txt', 'w')
for i in ip:
    process = subprocess.Popen(["nslookup", i], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = process.communicate()[0]
    file_.write(output)
file_.close()
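The reason is that while next != "" does not do what you want: f.read() with no argument returns the entire file in one call, so the loop runs only once and ip ends up holding a single string that contains every address.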
Instead, consider this:
import subprocess
with open('adrese.txt') as i, open('nslookup.txt', 'w') as o:
    for line in i:
        if line.strip(): # skips empty lines
            proc = subprocess.Popen(["nslookup", line.strip()],
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.PIPE,
                                    universal_newlines=True) # text output, not bytes
            o.write('{}\n'.format(proc.communicate()[0]))
print('Done')
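You are not actually looping through all the entries in your adrese.txt. Iterate over the file line by line instead:
ip = []
f = open("adrese.txt")
for line in f:
    if line.strip():
        # strip the trailing newline so nslookup gets a clean hostname
        ip.append(line.strip())
f.close()
file_ = open('nslookup.txt', 'w')
for i in ip:
    # universal_newlines=True makes the output text rather than bytes
    process = subprocess.Popen(["nslookup", i], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    output = process.communicate()[0]
    file_.write(output)
file_.close()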
You can use check_call and redirect the stdout directly to a file:
import subprocess
with open('adrese.txt') as f, open('nslookup.txt', 'w') as out:
    for line in map(str.rstrip, f):
        if line: # skips empty lines
            subprocess.check_call(["nslookup", line],
                                  stdout=out)
You never use stderr, so there is no point in capturing it. If nslookup exits with a non-zero status, check_call raises CalledProcessError, which you can catch:
import subprocess
with open('adrese.txt') as f, open('nslookup.txt', 'w') as out:
    for line in map(str.rstrip, f):
        if line: # skips empty lines
            try:
                subprocess.check_call(["nslookup", line],
                                      stdout=out)
            except subprocess.CalledProcessError:
                pass
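None of the snippets above produce the CSV file the question asks for. As a rough sketch of one way to get there, the code below swaps the nslookup command for the standard library's socket.gethostbyname and writes rows with the csv module. This is a different technique from the answers, not a drop-in replacement: write_lookup_csv, the resolve parameter, and the host/address column names are all assumptions made for illustration.

```python
import csv
import socket


def write_lookup_csv(hosts, out, resolve=socket.gethostbyname):
    """Write one 'host,address' row per non-empty entry in hosts.

    resolve is any callable mapping a hostname to an address string;
    a failed lookup (OSError) is recorded as an empty address.
    """
    writer = csv.writer(out)
    writer.writerow(["host", "address"])
    for host in hosts:
        host = host.strip()
        if not host:
            continue  # skip empty lines
        try:
            address = resolve(host)
        except OSError:
            address = ""
        writer.writerow([host, address])
```

A call might look like opening adrese.txt for reading and nslookup.csv for writing (with newline='' as the csv module recommends), then passing both file objects to write_lookup_csv.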

Mafft only creating one file with Python

So I'm working on a project to align a sequence ID and its code. I was given a barcode file, which contains tags for DNA sequences, e.g. TTAGG. There are several tags (ATTAC, ACCAT, etc.), which get removed from the sequences in a sequence file and paired with a sequence ID.
Example:
sequence file --> SEQ01 TTAGGAACCCAAA
barcode file --> TTAGG
The output I want removes the barcode from the sequence and uses it to name a new FASTA-format file.
Example:
testfile.TTAGG which when opened should have
>SEQ01
AACCCAAA
There are several of these files. I want to take each of the files that I create and run it through mafft, but when I run my script, mafft only processes one file. The files I mentioned above come out OK, but mafft only runs on the last file created.
Here's my script:
#!/usr/bin/python
import sys
import os
fname = sys.argv[1]
barcodefname = sys.argv[2]
barcodefile = open(barcodefname, "r")
for barcode in barcodefile:
    barcode = barcode.strip()
    outfname = "%s.%s" % (fname, barcode)
    outf = open(outfname, "w+")
    handle = open(fname, "r")
    mafftname = outfname + ".mafft"
    for line in handle:
        newline = line.split()
        seq = newline[0]
        brc = newline[1]
        potential_barcode = brc[:len(barcode)]
        if potential_barcode == barcode:
            outseq = brc[len(barcode):]
            barcodeseq = ">%s\n%s\n" % (seq,outseq)
            outf.write(barcodeseq)
    handle.close()
    outf.close()
cmd = "mafft %s > %s" % (outfname, mafftname)
os.system(cmd)
barcodefile.close()
I hope that was clear enough! Please help! I've tried changing my indentation and adjusting when I close the file. Most of the time it won't make the .mafft file at all; sometimes it does but doesn't put anything in it; but mostly it only works on the last file created.
Example:
The beginning of the code creates files such as:
testfile.ATTAC
testfile.AGGAC
testfile.TTAGG
Then, when it runs mafft, it only creates
testfile.TTAGG.mafft (with the correct input)
I have tried closing the outf file and then opening it again, at which point it tells me I'm coercing it.
I've also changed the outf file to write-only, which doesn't change anything.
The reason mafft only aligns the last file is that its execution is outside the loop.
As your code stands, you create an input file name variable (outfname) in each iteration of the loop, but this variable is always overwritten in the next iteration. Therefore, when your code eventually reaches the mafft execution command, the outfname variable will contain the last file name of the loop.
To correct this, simply insert the mafft execution command inside the loop:
#!/usr/bin/python
import sys
import os
fname = sys.argv[1]
barcodefname = sys.argv[2]
barcodefile = open(barcodefname, "r")
for barcode in barcodefile:
    barcode = barcode.strip()
    outfname = "%s.%s" % (fname, barcode)
    outf = open(outfname, "w+")
    handle = open(fname, "r")
    mafftname = outfname + ".mafft"
    for line in handle:
        newline = line.split()
        seq = newline[0]
        brc = newline[1]
        potential_barcode = brc[:len(barcode)]
        if potential_barcode == barcode:
            outseq = brc[len(barcode):]
            barcodeseq = ">%s\n%s\n" % (seq,outseq)
            outf.write(barcodeseq)
    handle.close()
    outf.close()
    # Run mafft once per barcode file, after that file has been closed
    cmd = "mafft %s > %s" % (outfname, mafftname)
    os.system(cmd)
barcodefile.close()
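As an alternative to building a shell command string for os.system, the per-file call can go through the subprocess module with the output file passed as stdout. This is a sketch, not part of the answer above; run_to_file is a hypothetical helper name, and it assumes mafft writes its alignment to standard output, as the > redirect in the original command implies.

```python
import subprocess


def run_to_file(cmd, out_path):
    """Run cmd (a list of arguments) and write its standard output to out_path.

    Returns the command's exit status.
    """
    with open(out_path, "w") as out:
        return subprocess.call(cmd, stdout=out)
```

Inside the barcode loop, after outf.close(), the call would be run_to_file(["mafft", outfname], mafftname). Passing the arguments as a list avoids shell quoting problems if a filename contains spaces.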
