Using subprocess and .format - python

So I have this code so far:
from subprocess import call
a, b = 10000000, 10000100
call('samtools faidx file.fa chr22:{}-{}'.format(a, b), shell = True)
but when I run it, the numbers assigned to a and b do not seem to be substituted into the {} placeholders as format should do.
Am I using format wrong here, or is my code itself wrong?
(file.fa is a file that holds a DNA sequence for chromosome 22)
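For what it is worth, str.format does substitute the values here, which can be checked by building the string separately before handing it to subprocess. A minimal sketch, assuming samtools is on PATH; the names region and cmd are introduced only for illustration, and the actual call is left commented out so the snippet runs without samtools installed:

```python
from subprocess import call

a, b = 10000000, 10000100

# Build the region string on its own so it can be inspected.
region = 'chr22:{}-{}'.format(a, b)
print(region)  # chr22:10000000-10000100

# Passing a list of arguments avoids shell=True and shell quoting surprises.
cmd = ['samtools', 'faidx', 'file.fa', region]
# call(cmd)  # uncomment where samtools and file.fa are available
```

If the printed region looks right, the problem is more likely in how the command is run than in format itself.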

Related

Python store output as a variable after running an executable command

"TMalign..." is an executable file that I used to get data. How could I store the output into a variable so that I could extract target values from the output. The executable file is compiled from a long .cpp, so I do not think I could call the variable names from there.
import sys,os
os.system("./TMalign 3w4u.pdb 6bb5.pdb -u 139") #some command I have
The output looks like this, and I need to extract the TM-score values:
*********************************************************************
* TM-align (Version 20190822): protein structure alignment *
* References: Y Zhang, J Skolnick. Nucl Acids Res 33, 2302-9 (2005) *
* Please email comments and suggestions to yangzhanglab#umich.edu *
*********************************************************************
Name of Chain_1: 3w4u.pdb (to be superimposed onto Chain_2)
Name of Chain_2: 6bb5.pdb
Length of Chain_1: 141 residues
Length of Chain_2: 139 residues
Aligned length= 139, RMSD= 1.07, Seq_ID=n_identical/n_aligned= 0.590
TM-score= 0.94726 (if normalized by length of Chain_1, i.e., LN=141, d0=4.42)
TM-score= 0.96044 (if normalized by length of Chain_2, i.e., LN=139, d0=4.38)
TM-score= 0.96044 (if normalized by user-specified LN=139.00 and d0=4.38)
(You should use TM-score normalized by length of the reference structure)
(":" denotes residue pairs of d < 5.0 Angstrom, "." denotes other aligned residues)
SLTKTERTIIVSMWAKISTQADTIGTETLERLFLSHPQTKTYFPHFDLHPGSAQLRAHGSKVVAAVGDAVKSIDDIGGALSKLSELHAYILRVDPVNFKLLSHCLLVTLAARFPADFTAEAHAAWDKFLSVVSSVLTEKYR
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::. .
-LSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSK-Y
Total CPU time is 0.03 seconds
Thanks for help!
You should look toward the following approach:
import re
from subprocess import check_output
ret = check_output(['./TMalign', '3w4u.pdb', '6bb5.pdb', '-u', '139'])
tm_scores = []
# check_output returns bytes; decode before splitting into lines
for line in ret.decode().splitlines():
    if re.match(r'^TM-score=', line):
        score = line.split()[1:2]  # Extract the value
        tm_scores.extend(score)  # Saving only values
# tm_scores now contains: ['0.94726', '0.96044', '0.96044']
Although it is somewhat elaborate, it is a flexible and tunable solution. Note that if it will be used alongside other code, it would be better to wrap it in a function.
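Wrapping the parsing in a function could look roughly like the sketch below. The names extract_tm_scores and run_tmalign are hypothetical, and the regular expression assumes the "TM-score= value" layout shown in the output above. The parsing is kept separate from the subprocess call so it can be tried on a saved sample without the executable:

```python
import re
import subprocess

# Matches lines such as "TM-score= 0.94726 (if normalized by ...)"
TM_SCORE_RE = re.compile(r"^TM-score=\s*([0-9.]+)", re.MULTILINE)

def extract_tm_scores(text):
    """Return every TM-score found in the program output as a float."""
    return [float(value) for value in TM_SCORE_RE.findall(text)]

def run_tmalign(args):
    """Run ./TMalign with the given argument list and parse its output."""
    output = subprocess.check_output(["./TMalign", *args], text=True)
    return extract_tm_scores(output)

# Two lines copied from the output above, used as a stand-in for a real run.
sample = (
    "TM-score= 0.94726 (if normalized by length of Chain_1, i.e., LN=141, d0=4.42)\n"
    "TM-score= 0.96044 (if normalized by length of Chain_2, i.e., LN=139, d0=4.38)\n"
)
scores = extract_tm_scores(sample)
```

With the executable present, run_tmalign(["3w4u.pdb", "6bb5.pdb", "-u", "139"]) would be the equivalent of the command in the question.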
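My approach isn't as smart: I just redirect the output to a file and process the file afterwards.
import os
cmd = './TMalign 3w4u.pdb 6bb5.pdb -u 139'
# The leading space matters: "139>>" would be parsed by the shell as a file descriptor number
os.system(cmd + " >> 1.txt")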

Print out a specific part of the output result in python

I have a function that does something and displays a line of output that mixes integers and strings.
However, I just want to print the last part of the output, which is the number 5 that comes after the dots.
The number 5 is the value of the OID, and it can be either 5 (ON) or 6 (OFF).
Is there any way I can pick out that value in a print statement or an if condition?
Here is the function:
import subprocess, sys
p = subprocess.Popen(["powershell.exe",
                      "snmpwalk -v1 -c public 192.168.178.213 .1.3.6.1.4.1.9986.3.22.1.6.1.1.15"],
                     stdout=sys.stdout)
p.communicate()
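One possible approach is to capture the output instead of sending it to sys.stdout, then take the last whitespace-separated token of the line. The exact snmpwalk output is not shown above, so the sample line below is an assumption, as are the helper names last_integer and read_oid_value and the STATES mapping:

```python
import subprocess

# Mapping described in the question: 5 means ON, 6 means OFF.
STATES = {5: "ON", 6: "OFF"}

def last_integer(line):
    """Return the last whitespace-separated token of a line as an int."""
    return int(line.strip().rsplit(None, 1)[-1])

def read_oid_value(command):
    """Run the command, capture stdout, and parse the last non-empty line."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    return last_integer(lines[-1])

# Assumed output format; the real line may differ.
sample_line = "SNMPv2-SMI::enterprises.9986.3.22.1.6.1.1.15 = INTEGER: 5"
value = last_integer(sample_line)
state = STATES.get(value, "UNKNOWN")
```

The value can then be used directly in an if condition, for example if value == 5.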

Stdin redirection from Python

Say I have a program called some_binary that can read data as:
some_binary < input
where input is usually a file in disk. I would like to send input to some_binary from Python without writing to disk.
For example input is typically a file with the following contents:
0 0.2
0 0.4
1 0.2
0 0.3
0 0.5
1 0.7
To simulate something like that in Python I have:
import numpy as np
# Random binary numbers
first_column = np.random.random_integers(0,1, (6,))
# Random numbers between 0 and 1
second_column = np.random.random((6,))
How can I feed the concatenation of first_column and second_column to some_binary as if I was calling some_binary < input from the command line, and collect stdout in a string?
I have the following:
import subprocess

def run_shell_command(cmd, my_input, cwd=None):
    # Parameters with defaults must come after those without
    retVal = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stdin=my_input, cwd=cwd)
    retVal = retVal.stdout.read().strip('\n')
    return retVal
But I am not sure I am heading in the right direction.
Yes, you are heading in the right direction.
You can use Python's subprocess.check_output() function, which is a convenience wrapper around subprocess.Popen(). Using Popen directly takes more work: for example, you typically need to call communicate() on the object it returns to send the input and collect the output.
Something like
output = subprocess.check_output([cmd], stdin = my_input)
should work in your case, as long as my_input is an open file object (or file descriptor) rather than a string.
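If the data only exists in memory as a string, one option in Python 3.7+ is the input argument of subprocess.run, which writes the string to the child's stdin. A minimal sketch: run_with_input is a hypothetical helper, the column values are copied from the example file in the question, and a Python one-liner that echoes stdin stands in for some_binary so the snippet is runnable anywhere:

```python
import subprocess
import sys

def run_with_input(cmd, text, cwd=None):
    """Run cmd (a list of arguments), feed text to its stdin, return stdout."""
    result = subprocess.run(cmd, input=text, capture_output=True,
                            text=True, cwd=cwd, check=True)
    return result.stdout.strip()

# Build the two-column input in memory instead of writing a file.
first_column = [0, 0, 1, 0, 0, 1]
second_column = [0.2, 0.4, 0.2, 0.3, 0.5, 0.7]
payload = "\n".join(f"{a} {b}" for a, b in zip(first_column, second_column)) + "\n"

# Stand-in for some_binary: echoes whatever arrives on stdin.
echo_cmd = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
output = run_with_input(echo_cmd, payload)
```

Replacing echo_cmd with ["some_binary"] would give the equivalent of some_binary < input, with the NumPy columns formatted into payload the same way.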

byte reverse AB CD to CD AB with python

I have a .bin file, and I want to simply byte reverse the hex data. Say, for instance, at offset 0x10 it reads AD DE DE C0; I want it to read DE AD C0 DE.
I know there is a simple way to do this, but I am a beginner just learning Python and am trying to make a few simple programs to help me through my daily tasks. I would like to convert the whole file this way, not just 0x10.
I will be converting at start offset 0x000000 and blocksize/length is 1000000.
Here is my code; maybe you can tell me what to do. I am sure I am just not getting it, as I am new to programming and Python. If you could help me I would very much appreciate it.
def main():
    infile = open("file.bin", "rb")
    new_pos = int("0x000000", 16)
    chunk = int("1000000", 16)
    data = infile.read(chunk)
    reverse(data)
def reverse(data):
    output(data)
def output(data):
    with open("reversed", "wb") as outfile:
        outfile.write(data)
main()
As you can see from the reverse function, I have tried many different suggestions, and it either passes the file through untouched or throws errors. I know reverse is empty now, but I have tried all kinds of things. I just need reverse to convert AB CD to CD AB.
Thanks for any input.
EDIT: The file is 16 MB and I want to reverse the byte order of the whole file.
In Python 3.4 you can use this:
>>> data = b'\xAD\xDE\xDE\xC0'
>>> swap_data = bytearray(data)
>>> swap_data.reverse()
the result is
bytearray(b'\xc0\xde\xde\xad')
In Python 2, the binary file gets read as a string, so string slicing should easily handle the swapping of adjacent bytes:
>>> original = '\xAD\xDE\xDE\xC0'
>>> ''.join([c for t in zip(original[1::2], original[::2]) for c in t])
'\xde\xad\xc0\xde'
In Python 3, the binary file gets read as bytes. Only a small modification is needed to build another array of bytes:
>>> original = b'\xAD\xDE\xDE\xC0'
>>> bytes([c for t in zip(original[1::2], original[::2]) for c in t])
b'\xde\xad\xc0\xde'
You could also use the < and > endianness format codes in the struct module to achieve the same result:
>>> import struct
>>> struct.pack('<2h', *struct.unpack('>2h', original))
b'\xde\xad\xc0\xde'
Happy byte swapping :-)
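For a whole file, the same pair swap can be done with slice assignment on a bytearray, which avoids building a Python list of every byte. This is a sketch rather than a tuned implementation; swap_byte_pairs is a hypothetical name, and it assumes the data has an even number of bytes:

```python
def swap_byte_pairs(data):
    """Swap every pair of adjacent bytes: AB CD -> BA DC."""
    if len(data) % 2:
        raise ValueError("data length must be even to swap byte pairs")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]  # odd-positioned bytes move to even positions
    swapped[1::2] = data[0::2]  # even-positioned bytes move to odd positions
    return bytes(swapped)

result = swap_byte_pairs(b'\xAD\xDE\xDE\xC0')
print(result)  # b'\xde\xad\xc0\xde'
```

In the question's code, reverse(data) could then call output(swap_byte_pairs(data)), with the file names "file.bin" and "reversed" unchanged.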
data = b'\xAD\xDE\xDE\xC0'
reversed_data = data[::-1]
print(reversed_data)
# b'\xc0\xde\xde\xad'
Python3
bytes(reversed(b'\xAD\xDE\xDE\xC0'))
# b'\xc0\xde\xde\xad'
Python has a list operator to reverse the values of a list --> nameOfList[::-1]
So, I might store the hex values as string and put them into a list then try something like:
def reverseList(aList):
    rev = aList[::-1]
    outString = ""
    for el in rev:
        outString += el + " "
    return outString

call python with system() in R to run a python script emulating the python console

I want to pass a chunk of Python code to Python in R with something like system('python ...'), and I'm wondering if there is an easy way to emulate the python console in this case. For example, suppose the code is "print 'hello world'", how can I get the output like this in R?
>>> print 'hello world'
hello world
This only shows the output:
> system("python -c 'print \"hello world\"'")
hello world
Thanks!
BTW, I asked in r-help but have not got a response yet (if I do, I'll post the answer here).
Do you mean something like this?
export NUM=10
R -q -e "rnorm($NUM)"
You might also like to check out littler - http://dirk.eddelbuettel.com/code/littler.html
UPDATED
Following your comment below, I think I am beginning to understand your question better. You are asking about running python inside the R shell.
So here's an example:-
# code in a file named myfirstpythonfile.py
a = 1
b = 19
c = 3
mylist = [a, b, c]
for item in mylist:
    print item
In your R shell, therefore, do this:
> system('python myfirstpythonfile.py')
1
19
3
Essentially, you can simply call python /path/to/your/python/file.py to execute a block of python code.
In my case, I can simply call python myfirstpythonfile.py assuming that I launched my R shell in the same directory (path) my python file resides.
FURTHER UPDATED
And if you really want to print out the source code, here's a brute force method that might be possible. In your R shell:-
> system('python -c "import sys; sys.stdout.write(file(\'myfirstpythonfile.py\', \'r\').read());"; python myfirstpythonfile.py')
a = 1
b = 19
c = 3
mylist = [a, b, c]
for item in mylist:
    print item
1
19
3
AND FURTHER FURTHER UPDATED :-)
So if the purpose is to print the Python code before it is executed, we can use the Python trace module (reference: http://docs.python.org/library/trace.html). On the command line, we use the -m option to call a Python module, followed by the options for that module.
So for my example above, it would be:-
$ python -m trace --trace myfirstpythonfile.py
--- modulename: myfirstpythonfile, funcname: <module>
myfirstpythonfile.py(1): a = 1
myfirstpythonfile.py(2): b = 19
myfirstpythonfile.py(3): c = 3
myfirstpythonfile.py(4): mylist = [a, b, c]
myfirstpythonfile.py(5): for item in mylist:
myfirstpythonfile.py(6): print item
1
myfirstpythonfile.py(5): for item in mylist:
myfirstpythonfile.py(6): print item
19
myfirstpythonfile.py(5): for item in mylist:
myfirstpythonfile.py(6): print item
3
myfirstpythonfile.py(5): for item in mylist:
--- modulename: trace, funcname: _unsettrace
trace.py(80): sys.settrace(None)
As we can see, this prints each line of Python code as it is traced, and any output that line produces appears on stdout immediately after it.
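Another way to get the ">>>" prompt layout from the question is Python's standard code module, which can feed source lines to an interactive interpreter one at a time. The sketch below is written for Python 3 (so print is a function, unlike the Python 2 examples above), and emulate_console is a hypothetical helper name. As at a real prompt, an indented block must be followed by a blank line before the next top-level statement:

```python
import code
import contextlib
import io

def emulate_console(source):
    """Echo each source line behind a prompt, run it, and return the transcript."""
    console = code.InteractiveConsole()
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        more = False
        for line in source.splitlines():
            # "... " marks a continuation line, ">>> " a new statement.
            print(("... " if more else ">>> ") + line)
            more = console.push(line)
        if more:
            console.push("")  # a blank line closes a pending block
    return buffer.getvalue()

transcript = emulate_console("print('hello world')")
print(transcript, end="")
```

Saved as a script that reads its source from a file or stdin, this could be called from R with system() in the same way as the other examples.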
The system command has an argument called intern, which defaults to FALSE. Set it to TRUE and the output that was previously only printed will be returned, so it can be stored in a variable.
Now run your system command with this option and you should get the output directly in your variable, like this:
tmp <- system("python -c 'print \"hello world\"'",intern=T)
My workaround for this problem is to define my own functions that paste in parameters, write out a temporary .py file, and then execute the Python file via a system call. Here is an example that calls ArcGIS's Euclidean Distance function:
py.EucDistance = function(poly_path,poly_name,snap_raster,out_raster_path_name,maximum_distance,mask){
  py_path = 'G:/Faculty/Mann/EucDistance_temp.py'
  poly_path_name = paste(poly_path,poly_name, sep='')
  fileConn<-file(paste(py_path))
  writeLines(c(
    paste('import arcpy'),
    paste('from arcpy import env'),
    paste('from arcpy.sa import *'),
    paste('arcpy.CheckOutExtension("spatial")'),
    paste('out_raster_path_name = "',out_raster_path_name,'"',sep=""),
    paste('snap_raster = "',snap_raster,'"',sep=""),
    paste('cellsize =arcpy.GetRasterProperties_management(snap_raster,"CELLSIZEX")'),
    paste('mask = "',mask,'"',sep=""),
    paste('maximum_distance = "',maximum_distance,'"',sep=""),
    paste('sr = arcpy.Describe(snap_raster).spatialReference'),
    paste('arcpy.env.overwriteOutput = True'),
    paste('arcpy.env.snapRaster = "',snap_raster,'"',sep=""),
    paste('arcpy.env.mask = mask'),
    paste('arcpy.env.scratchWorkspace ="G:/Faculty/Mann/Historic_BCM/Aggregated1080/Scratch.gdb"'),
    paste('arcpy.env.outputCoordinateSystem = sr'),
    # get spatial reference for raster and force output to that
    paste('sr = arcpy.Describe(snap_raster).spatialReference'),
    paste('py_projection = sr.exportToString()'),
    paste('arcpy.env.extent = snap_raster'),
    paste('poly_name = "',poly_name,'"',sep=""),
    paste('poly_path_name = "',poly_path_name,'"',sep=""),
    paste('holder = EucDistance(poly_path_name, maximum_distance, cellsize, "")'),
    paste('holder = SetNull(holder < -9999, holder)'),
    paste('holder.save(out_raster_path_name) ')
  ), fileConn, sep = "\n")
  close(fileConn)
  system(paste('C:\\Python27\\ArcGIS10.1\\python.exe', py_path))
}
