passing variables to bowtie from python

I want to pass an input fasta file, stored in a variable, say inp_a, from python to bowtie, and write the output into another file, out_a. I want to use
os.system ('bowtie [options] inp_a out_a')
Can you help me out?

Your question asks for two things, as far as I can tell: writing data to disk, and calling an external program from within Python. Without more detailed requirements, here's what I would write:
import subprocess

data_for_bowtie = "some genome data, lol"

# FASTA is a plain-text format, so write in text mode.
with open("input.fasta", "w") as input_file:
    input_file.write(data_for_bowtie)

subprocess.call(["bowtie", "input.fasta", "output.something"])
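If the paths already live in variables, as in your question, they can go straight into the argument list. A minimal sketch (the exact bowtie arguments are still a guess, per the caveats below):
import subprocess

inp_a = "input.fasta"       # input path held in a variable, as in the question
out_a = "output.something"  # desired output path

# Passing a list of arguments avoids the shell-quoting pitfalls of os.system.
subprocess.call(["bowtie", inp_a, out_a])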
There are some fine details here which I have assumed. I'm assuming that you mean bowtie, the read aligner. Since FASTA is a plain-text, human-readable format, the file is opened in text mode ("w" rather than "wb"). And I'm guessing at how to call bowtie on the command line, so check its documentation for the exact arguments (a real invocation typically also needs the name of an index).
Hopefully, that provides a starting point. Good luck!

Related

Reading results of gurobi optimisation ("results.sol") in new python script

I am trying to run a rolling horizon optimisation where I have multiple optimisation scripts, each generating their own results. Instead of printing results to screen at every interval, I want to write each of the results using model.write("results.sol") - and then read them back into a results processing script (separate python script).
I have tried read("results.sol") in Python, but the file format is not recognised. Is there any way to read/process the .sol file format that Gurobi outputs? It would seem bizarre if you could not read the .sol file at some later point and generate plots etc.
Maybe I have missed something blindingly obvious.
This is hard to answer without seeing your code, as we have to guess what you are doing.
But well...
When you use
model.write("out.sol")
Gurobi will use its own format to write it (which format is written is inferred from the file suffix).
This can easily be read by:
model.read("out.sol")
If you used
x = read("out.sol")
you are using python's basic IO tools, and of course python won't interpret that file with respect to the format. Furthermore, reading like that is text mode (and maybe binary is required; not sure).
General rule: if you wrote the solution using a class-method of class model, then read using a class-method of class model too.
The usage above is normally used to reinstate some state of your model (e.g. a MIP start). If you want to plot the solution, you will have to do further work. In this case, using python's IO tools might be a good idea, and you should respect the format described in Gurobi's file-format documentation. The file can be read as CSV or manually (and, contrary to my remark earlier: it is text mode, not binary).
So assuming the example from the link is in file gur.sol:
import csv

with open('gur.sol', newline='\n') as csvfile:
    # collapse the two-space delimiter to a single space so csv can cope
    reader = csv.reader((line.replace('  ', ' ') for line in csvfile), delimiter=' ')
    next(reader)  # skip header
    sol = {}
    for var, value in reader:
        sol[var] = float(value)
print(sol)
Output:
{'z': 0.2, 'x': 1.0, 'y': 0.5}
Remarks:
The code is ugly because python's csv module has some limitations.
The delimiter is two spaces in this format, and we need to hack the code to read it (as only one character is allowed for the delimiter in this function).
The code might be tailored to Python 3 (which is what I'm using; the next() call will probably differ in py2).
pandas would be much, much better for this purpose (a huge tool with a very good CSV reader, read_csv).
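A minimal sketch of that pandas route (assuming the same gur.sol layout as above, with Gurobi's '#'-prefixed header line and whitespace-separated columns):
import pandas as pd

# sep=r'\s+' tolerates the double-space delimiter; comment='#' skips the header
df = pd.read_csv('gur.sol', sep=r'\s+', comment='#', header=None,
                 names=['var', 'value'])
sol = dict(zip(df['var'], df['value']))
print(sol)  # e.g. {'x': 1.0, 'y': 0.5, 'z': 0.2}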

STDIN file read query [duplicate]

This question already has answers here: How do I read from stdin? (25 answers)
Closed 6 years ago.
I am doing a small project in which I have to read a file from STDIN. I am not sure what that means. When I asked the professor, he told me there is no need to open and close the file like we generally do:
sFile = open("file.txt", 'r')
I don't have to pass the file as an argument. I am kind of confused about what he wants.
stdin takes input from different sources, depending on how the script is invoked.
Given a very simple bit of code for illustration (let's call it script.py):
import sys
text = sys.stdin.read()
print text
You can either pipe the input file into your script, like so:
$ more file.txt | python script.py
In this case, the output of the first part of the pipeline - the content of the file - is assigned to our variable (here text), which gets printed out eventually.
When called without any piped input, like so:
$ python script.py
it lets you type the input interactively, much like the input function, and assigns what you type to the variable (note that this input "window" stays open until you explicitly close it, which is usually done with Ctrl+D).
If you import sys, then sys.stdin will be the 'file' you want, which you can use like any other file (e.g. sys.stdin.read()), and you don't have to close it. stdin means "standard input".
It might be helpful to read through this post, which seems similar to yours.
With the approach below, the file can be named on the command line after the python script, as in python script.py input_file; this input_file would be the file containing whatever data you are working on. If no file name is given, standard input itself is read. So, you're probably wondering how to read it. There are a couple of options. The one suggested in the thread linked above goes as follows:
import fileinput

for line in fileinput.input():
    # read data from the named file, or from stdin if no file was given
    print line,
There are other ways, of course, but I think I'll leave you to it. Check the linked post for more information.
Depending on the context of your assignment, stdin may be sent into the script automatically, or you may have to supply it manually as detailed above.

subprocess, Popen, and stdin: Seeking practical advice on automating user input to .exe

Despite my obviously beginner Python skills, I've got a script that pulls a line of data from a 2,000-row CSV file, reads key parameters, and outputs a buffer CSV file organized as an N-by-2 rectangle; it then uses the subprocess module to call the external program POVCALLC.EXE, which takes a CSV file organized that way as input. The relevant portion of the code is shown below. I THINK that subprocess or one of its methods should allow me to interact with the external program, but am not quite sure how - or indeed whether this is the module I need.
In particular, when POVCALLC.EXE starts, it first asks for the input file, which in this case is buffer.csv. It then asks for several additional parameters, including the name of an output file, which come from outside the snippet below. It then starts computing results, and then asks for further user input, including several carriage returns. Obviously, I would prefer to automate this interaction for the 2,000 rows in the original CSV.
Am I on the right track with subprocess, or should I be looking elsewhere to automate this interaction with the external executable?
Many thanks in advance!
# Begin inner loop to fetch Lorenz curve data for each survey
# (assumes csv and subprocess were imported earlier in the script)
for i in range(int(L_points_number)):
    index = 3 * i
    line = []
    P = L_points[index]
    line.append(P)
    L = L_points[index + 1]
    line.append(L)
    with open('buffer.csv', 'a', newline='') as buffer:
        writer = csv.writer(buffer, delimiter=',')
        P = 1
        line.append(P)
        L = 1
        line.append(L)
        writer.writerow(line)
    subprocess.call('povcallc.exe')
    # TODO: CALL povcallc and compute results
    # TODO: USE Regex to interpret results and append them to
    # the output file
If your program expects these arguments on the standard input (e.g. after running POVCALLC you type csv filenames into the console), you could use subprocess.Popen() (see https://docs.python.org/3/library/subprocess.html#subprocess.Popen) with stdin redirection (stdin=subprocess.PIPE), and use the returned object to send data to stdin.
It would look something like this:
my_proc = subprocess.Popen('povcallc.exe', stdin=subprocess.PIPE,
                           universal_newlines=True)  # text mode, so a str can be sent
my_proc.communicate(input="my_filename_which_is_expected_by_the_program.csv\n")
You can also use the tuple returned by communicate() to check the program's stdout and stderr, provided you also redirect them with stdout=subprocess.PIPE and stderr=subprocess.PIPE (see the link to the docs for more).
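Since POVCALLC asks for several things in sequence (input file, output file, carriage returns), one hedged approach is to feed all the answers up front in a single string, one per line. A sketch, assuming the prompts arrive in the order described in the question; the output file name here is made up:
import subprocess

# One answer per prompt, in the order the program asks; a newline acts as
# pressing Enter, and an empty entry is a bare carriage return.
responses = "\n".join([
    "buffer.csv",     # input file prompt
    "results_1.csv",  # output file prompt (hypothetical name)
    "",               # bare carriage return
    "",               # another bare carriage return
]) + "\n"

proc = subprocess.Popen(
    ["povcallc.exe"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # exchange text rather than bytes
)
out, err = proc.communicate(input=responses)
print(out)  # the results can then be parsed from stdout, e.g. with a regex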

modify value of variable in python program from batch file

I have a python file (example.py) to run, which contains 3 variables.
I have three .txt files (var1.txt, var2.txt, var3.txt) which contain the values of each of these 3 variables.
I want to write a batch file (bfile.bat) that
retrieves the three values from the text files,
passes these values to the variables in the python file,
and then runs it.
Any help would be appreciated!
-added-
Thank you.
The thing is, there is no particular reason for the three files. A coworker of mine wrote this program in python (which I am new to); I'll say this program's name is "program.py". She also made a file to show how it works, by setting the values of the variables used in program.py; that file is "example.py". This example.py contains the values of the variables which are used in program.py. But I want to use many values for these variables, so I wanted to make a batch file for this - which apparently is not a good idea.
So the example.py looks something like this:
SOUND = 'sound.wav'
TIMESTEP = 50
END = 500
program(SOUND, TIMESTEP, END)
and I want to change the values of SOUND, TIMESTEP, END.
Please help me!
Thank you!
You should consider using input parameters in your python file, making use of sys.argv. Then use shell pipelining, or command-line arguments, to pass your information to your python script. However, I do not recommend modifying the python file from the shell by writing to it.
[Also, why is it not possible to read the files with python?]
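On that note, a minimal sketch that skips the batch file entirely and reads the three text files directly in Python (assuming each .txt file holds a single value; the file names are taken from the question, and read_value is a made-up helper):
# hypothetical helper: read one value from a one-line text file
def read_value(path):
    with open(path) as f:
        return f.read().strip()

SOUND = read_value('var1.txt')          # e.g. 'sound.wav'
TIMESTEP = int(read_value('var2.txt'))  # e.g. 50
END = int(read_value('var3.txt'))       # e.g. 500

# then call the function exactly as example.py does:
# program(SOUND, TIMESTEP, END)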
Edit: Regarding the new information.
Since what you already have is a python script, you are in luck, as you now have multiple ways of dealing with this. First of all, a python script can import another python script, or just parts of it. So what you can do is from program import <function>, or import program and then use it. Now you can write your own python script, using her function!
You can simply create a list of your parameters and values, for example a list of tuples:
import program

# Of course you can also generate this list, depending on which combinations
# of parameters you want to run :)
# It is hardcoded here for simplicity!
myparams = [('sound1.wav', 30, 50), ('sound2.wav', 20, 100), ...]

for (p1, p2, p3) in myparams:
    program.function(p1, p2, p3)
This will use function() from your program.py.
You could also import program and write your own script around sys.argv, so you could run it from the command line with parameters: python myprogram.py p1 p2 p3. A sketch of that follows.
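A minimal sketch of such a wrapper (myprogram.py is a made-up name, and function is assumed to be the callable exposed by program.py, as above):
# myprogram.py -- hypothetical command-line wrapper around program.py
import sys
import program

sound = sys.argv[1]
timestep = int(sys.argv[2])  # command-line arguments arrive as strings
end = int(sys.argv[3])
program.function(sound, timestep, end)
It could then be run as python myprogram.py sound.wav 50 500.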
However, looking at the question, you seem to be working on Windows, and writing a simple python script is probably better/easier/more convenient than a batch file. But that depends on your experience in the matter :)
Cheers!
It sounds like passing arguments would be an easier way to accomplish what you want. So instead of using files to hold the variables, try:
import sys

numArgs = len(sys.argv)
# You want 3 variables, and the 0th index will be '[file].py'.
if numArgs >= 4:
    SOUND = sys.argv[1]
    TIMESTEP = int(sys.argv[2])  # arguments arrive as strings, so cast the numbers
    END = int(sys.argv[3])
else:
    print('Not enough arguments.')
Then you can simply run:
python [file].py arg1 arg2 arg3

Python: Read huge number of lines from stdin

I'm trying to read a huge number of lines from standard input with python.
more hugefile.txt | python readstdin.py
The problem is that the program freezes as soon as I've read just a single line.
print sys.stdin.read(8)
exit(1)
This prints the first 8 bytes, but then I expect it to terminate; it never does. I think it's not really reading just the first bytes, but trying to read the whole file into memory.
Same problem with sys.stdin.readline().
What I really want, of course, is to read all the lines, but with a buffer so I don't run out of memory.
I'm using python 2.6
This should work efficiently in a modern Python:
import sys

for line in sys.stdin:
    # do something...
    print line,
You can then run the script like this:
python readstdin.py < hugefile.txt
Back in the day, you had to use xreadlines to get efficient line-at-a-time IO over huge files - the docs now recommend for line in file instead.
Of course, this is of assistance only if you're actually working on the lines one at a time. If you're just reading big binary blobs to pass on to something else, then some other mechanism might be just as efficient.
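For that blob-style case, a minimal sketch of bounded-memory chunked reading (the 64 KiB chunk size is an arbitrary choice):
import sys

CHUNK_SIZE = 64 * 1024  # arbitrary buffer size; tune as needed

while True:
    chunk = sys.stdin.read(CHUNK_SIZE)
    if not chunk:
        break  # EOF reached
    # pass the chunk on to whatever consumes it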
