I have two text files which I'm trying to work with in Python 2.7.7, structured as in these examples:
sequence_file.txt:
MKRPGGAGGGGGSPSLVTMANSSDDGYGGVGMEAEGDVEEEMMACGGGGE
positions.txt:
10
7
4
What I want to do is insert a # symbol into the sequence at every position indicated in positions.txt:
MKR#PGG#AGGG#GGSPSLVTMANSSDDGYGGVGMEAEGDVEEEMMACGGGGE
At the moment, my code is as follows:
# Open sequence file, remove newlines:
with open("sequence_file.txt", "r") as seqfile:
    seqstring = seqfile.read().replace('\n', '').replace('\r', '')
# Turn sequence into list
seqlist = list(seqstring)
# Open positions.txt, and use each line as a parameter for the insert() function.
with open("positions.txt") as positions:
    for line in positions:
        insertpoint = line.rstrip('\n')
        seqlist.insert(insertpoint, '#')
The last block of that code is where it falls down. I'm trying to have it read the first line, trim the newline character (\n) and then use that line as a variable (insertpoint) in the insert() command. However, whenever I try this it tells me:
Traceback (most recent call last):
  File "<pyshell#8>", line 4, in <module>
    seqlist.insert(insertpoint, '#')
TypeError: an integer is required
If I test it out and try 'print insertpoint' it produces the number correctly, and so my interpretation of the error is that when I use the insert() command it is reading 'insertpoint' as text rather than the variable that was just set.
Can anyone suggest what might be going wrong with this?
What happens is that str.rstrip() returns a string, but insert() expects an integer.
Solution: Convert that string into an integer:
insertpoint = int(line.rstrip('\n'))
Note: When you print insertpoint it is shown without the '' but it is a string. You can check this by printing its type:
print(type(insertpoint)) # <type 'str'>
It appears you might need to put int() around insertpoint:
seqlist.insert(int(insertpoint), '#')
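Putting the pieces together, here is a minimal sketch of the whole task, written for Python 3. The function name insert_markers and the shortened sample sequence are illustrative assumptions, not part of the original code. It also sorts the positions in descending order, so that an earlier insertion cannot shift the index of a later one:

```python
def insert_markers(sequence, position_lines, marker="#"):
    """Insert marker into sequence at each position given as a text line."""
    chars = list(sequence)
    # Lines read from a file are strings, so convert each one with int().
    positions = [int(line.strip()) for line in position_lines if line.strip()]
    # Work from the highest index down so earlier inserts do not shift later ones.
    for pos in sorted(positions, reverse=True):
        chars.insert(pos, marker)
    return "".join(chars)


# Shortened sample sequence; the list stands in for the lines of positions.txt.
result = insert_markers("MKRPGGAGGG", ["10\n", "7\n", "4\n"])
print(result)  # MKRP#GGA#GGG#
```

Note that list.insert(i, x) places x before the element currently at index i, so position 4 puts the marker after the first four characters.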
I want to print a specific part of the output from a subprocess
Here is my code:
from subprocess import check_output
output = check_output(['python3', 'code.py']).decode('ascii')
print(output)
The output is:
Tom
John
How can I print just Tom or just John instead of both of them?
I have tried print(output[0]) to print Tom but I get only T.
You have a single string, so you can use any string method on it.
You can split it to create a list of lines:
lines = output.split('\n')
And then display only the first line:
print(lines[0])
Let's take a look at the steps you've already done:
You call check_output() and it returns the output as bytes;
Then you call bytes.decode(), which returns a str.
As a result you get a multi-line string. You tried to access the first line using index 0, but you got the first character instead of the first line. That happened because indexing a string returns the character at that index.
To get the first line you should split your multi-line string into lines (convert the str to a list of str). There's a built-in method, str.splitlines(), which does what you need.
So, to upgrade your code we need to add one more line before your print() call:
output_lines = output.splitlines()
After that you can access a line by index:
print(output_lines[0])
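The two approaches above differ slightly when the output ends with a newline, which is usual for printed output. A small sketch, using a literal string as a stand-in for the decoded subprocess output (the sample text is an assumption based on the output shown in the question):

```python
# Stand-in for check_output(...).decode('ascii'); the trailing newline is typical.
output = "Tom\nJohn\n"

by_split = output.split("\n")        # keeps an empty string after the last newline
by_splitlines = output.splitlines()  # drops it

print(by_split)          # ['Tom', 'John', '']
print(by_splitlines)     # ['Tom', 'John']
print(by_splitlines[0])  # Tom
print(output[0])         # T  (indexing the string gives one character)
```

Either way the first line is at index 0; the difference only matters when counting lines or taking the last one.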
#!/usr/bin/env python
import os,subprocess,re
# Run df -h and save its output to a temporary file
f=open("/var/tmp/disks_out","w")
proc=subprocess.Popen(['df', '-h'],stdout=subprocess.PIPE,stderr=subprocess.PIPE,shell=False)
out,err=proc.communicate()
for line in out:
    f.write(line)
f.close()
# Read the file back and collect every device name matching c...s0
f1=open("/var/tmp/disks_out","r")
disks=[]
for line in f1:
    m=re.search(r'(c.*s0)',line)
    if m:
        disk=m.group(1)
        disks.append(disk)
# Drop the trailing slice number from the first disk, then append 0..4
disks = disks[0][:-1]
slices =[disks+str(i) for i in range(5)]
print(slices)
and the output I am getting is below:
['c0t5000CCA025A29894d0s0', 'c0t5000CCA025A29894d0s1', 'c0t5000CCA025A29894d0s3', 'c0t5000CCA025A29894d0s4', 'c0t5000CCA025A29894d0s5', 'c0t5000CCA025A29894d0s6']
But I want to get output similar to:
c0t5000CCA025A29894d0s1,c0t5000CCA025A29894d0s2,c0t5000CCA025A29894d0s3
If you want to get it with commas:
print(','.join(slices))
or rather for python 2.7:
print ','.join(slices)
What you printed was a list, so Python displayed it as one. The join() method joins the elements of an iterable, using the string it is called on as the separator; see the str.join documentation for more (for Python 2.7, as in your tag, although it seems you are using Python 3 here).
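As a small illustration of the difference between printing a list and printing a joined string, here is a sketch in Python 3. The base device name is copied from the question; building the list with range(1, 4) is an assumption chosen to reproduce the desired output:

```python
base = "c0t5000CCA025A29894d0s"
# str(i) converts each number separately before concatenation.
slices = [base + str(i) for i in range(1, 4)]

print(slices)        # list representation, with brackets and quotes
joined = ",".join(slices)
print(joined)        # c0t5000CCA025A29894d0s1,c0t5000CCA025A29894d0s2,c0t5000CCA025A29894d0s3
```

join() requires every element to be a string, which is already the case here.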
I have a list named my = ['cbs is down', 'abnormal']
and I have opened a file in read mode.
Now I want to check whether any of the strings in the list appears in that file and, if so, perform the if action:
fopen = open("test.txt","r")
my =['cbs is down', 'abnormal']
for line in fopen:
    if my in line:
        print ("down")
When I execute it, I get the following:
Traceback (most recent call last):
  File "E:/python/fileread.py", line 4, in <module>
    if my in line:
TypeError: 'in <string>' requires string as left operand, not list
This should work things out:
if any(i in line for i in my):
    ...
Basically you are going through my and checking whether any of its elements is present in line.
fopen = open("test.txt","r")
my =['cbs is down', 'abnormal']
for line in fopen:
    for x in my:
        if x in line:
            print ("down")
Sample input
Some text cbs is down
Yes, abnormal
not in my list
cbs is down
Output
down
down
down
The reason for your error:
The in operator as used in:
if my in line: ...
   ^       ^
   |_ left | hand side
           |
           |_ right hand side
for a string operand on the right side (i.e. line) requires a corresponding string operand on the left hand side. This operand consistency check is implemented by the str.__contains__ method, where the call to __contains__ is made from the string on the right hand side (see the CPython implementation). This is the same as:
if line.__contains__(my): ...
You're however passing a list, my, instead of a string.
An easy way to resolve this is to check whether any of the items in the list is contained in the current line, using the builtin any function:
for line in fopen:
    if any(item in line for item in my):
        ...
Or since you have just two items use the or operator (pun unintended) which short-circuits in the same way as any:
for line in fopen:
    if 'cbs is down' in line or 'abnormal' in line:
        ...
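To see the any() version run end to end, here is a self-contained sketch for Python 3. It uses io.StringIO as a stand-in for the opened test.txt (an assumption, so that the example needs no file on disk), with sample lines taken from the sample input shown earlier:

```python
import io

my = ['cbs is down', 'abnormal']

# In-memory stand-in for open("test.txt", "r").
fopen = io.StringIO(
    "Some text cbs is down\n"
    "Yes, abnormal\n"
    "not in my list\n"
    "cbs is down\n"
)

matches = []
for line in fopen:
    # any() stops at the first term found, so each line is reported at most once.
    if any(item in line for item in my):
        matches.append(line.rstrip("\n"))
        print("down")

print(matches)  # ['Some text cbs is down', 'Yes, abnormal', 'cbs is down']
```

Unlike a nested loop over my, this reports a line once even when it contains several of the terms.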
You could also join the terms in my to a regular expression like \b(cbs is down|abnormal)\b and use re.findall or re.search to find the terms. This way, you can also enclose the pattern in word-boundaries \b...\b so it does not match parts of longer words, and you also see which term was matched, and where.
>>> import re
>>> my = ['cbs is down', 'abnormal']
>>> line = "notacbs is downright abnormal"
>>> p = re.compile(r"\b(" + "|".join(map(re.escape, my)) + r")\b")
>>> p.findall(line)
['abnormal']
>>> p.search(line).span()
(21, 29)
I'm using Python 3 and I need to parse a line like this
-1 0 1 0 , -1 0 0 1
I want to split this into two lists using Fraction so that I can also parse entries like
1/2 17/12 , 1 0 1 1
My program uses a structure like this
from sys import stdin
...
functions'n'stuff
...
for line in stdin:
and I'm trying to do
for line in stdin:
    X = [str(elem) for elem in line.split(" , ")]
    num = [Fraction(elem) for elem in X[0].split()]
    den = [Fraction(elem) for elem in X[1].split()]
but all I get is a list index out of range error:
    den = [Fraction(elem) for elem in X[1].split()]
IndexError: list index out of range
I don't get it. I get a string from line. I split that string into two strings at " , " and should get one list X containing two strings. These I split at the whitespace into two separate lists while converting each element into Fraction. What am I missing?
I also tried adding X[-1] = X[-1].strip() to get rid of \n that I get from ending the line.
The problem is that your file has a line without a " , " in it, so the split doesn't return 2 elements.
I'd use split(',') instead, and then use strip to remove the leading and trailing blanks. Note that str(...) is redundant, split already returns strings.
X = [elem.strip() for elem in line.split(",")]
You might also have a blank line at the end of the file, which would still only produce one result for split, so you should have a way to handle that case.
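One way to handle both cases is to check the number of parts before indexing. A minimal sketch for Python 3, where the helper name parse_line and the choice to return None for unusable lines are assumptions made for illustration:

```python
from fractions import Fraction


def parse_line(line):
    """Return (num, den) as lists of Fraction, or None if the line is unusable."""
    # Split on the comma alone, then strip blanks and the trailing newline.
    parts = [part.strip() for part in line.split(",")]
    # Blank lines and lines without exactly one comma do not yield two non-empty parts.
    if len(parts) != 2 or not all(parts):
        return None
    num = [Fraction(token) for token in parts[0].split()]
    den = [Fraction(token) for token in parts[1].split()]
    return num, den


parsed = parse_line("1/2 17/12 , 1 0 1 1\n")
blank = parse_line("\n")
print(parsed)
print(blank)  # None
```

In the main loop, a line for which parse_line returns None can then be skipped or reported instead of raising IndexError.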
With valid input, your code actually works.
You probably got an invalid line, perhaps one with extra spaces or an empty line. So, as the first thing inside the loop, print line. Then you can see right above the error message which line caused the problem.
Or maybe you're not using stdin right. Write the input lines in a file, make sure you only have valid lines (especially no empty lines). Then feed it into your script:
python myscript.py < test.txt
How about this one:
pairs = [line.split(",") for line in stdin]
num = [[Fraction(x) for x in pair[0].split()] for pair in pairs if len(pair) == 2]
den = [[Fraction(x) for x in pair[1].split()] for pair in pairs if len(pair) == 2]
I am trying to parse a particular text file. I want to open the text file and, line by line, check whether a particular string is present (in the following example, the number 01 in the curly brackets), then manipulate a particular substring by reversing part of it or keeping it the same. Here's that example, with one line arbitrarily named "go" (other lines in the full file have a similar format but contain {01}, {00}, etc.):
>>> go = 'USC_45774-1111-0 <hkxhk> {10} ; 78'
>>> go = go.replace(go[22:24],go[23:21:-1])
>>> go
'USC_45774-1111-0 <khxkh> {10} ; 78'
I am trying to manipulate the first "hk" (go[22:24]) by replacing it with the same letters reversed (go[23:21:-1]). What I want to see is khxhk but, as you can see, both occurrences are reversed, giving khxkh.
I am also having a problem with executing the specific if statement for each line. Many lines that don't have {01} are being manipulated as if they did:
with open('c:/LG 1A.txt', 'r') as rfp:
    with open('C:/output5.txt', 'w') as wfp:
        for line in rfp.readlines():
            if "{01}" or "{-1}" in line:
                line = line.replace(line[25:27],line[26:24:-1])
                line = line.replace("<"," ")
                line = line.replace(">"," ")
                line = line.replace("x"," ")
                wfp.write(line)
            elif "{10}" or "{1-}" in line:
                line = line.replace(line[22:24],line[23:21:-1])
                line = line.replace("<"," ")
                line = line.replace(">"," ")
                line = line.replace("x"," ")
                wfp.write(line)
            elif "{11}" in line:
                line = line.replace(line[22:27],line[26:21:-1])
                line = line.replace("<"," ")
                line = line.replace(">"," ")
                line = line.replace("x"," ")
                wfp.write(line)
    wfp.close()
Am I missing something simple?
The string replace method does not replace characters by position, it replaces them by what characters they are.
>>> 'apple aardvark'.replace('a', '!')
'!pple !!rdv!rk'
So in your first case, you are telling to replace "hk" with "kh". It doesn't "know" that you want to only replace one of the occurrences; it just knows you want to replace "hk" with "kh", so it replaces all occurrences.
You can use the count argument to replace to specify that you only want to replace the first occurrence:
>>> go = 'USC_45774-1111-0 <hkxhk> {10} ; 78'
>>> go.replace(go[22:24],go[23:21:-1],1)
'USC_45774-1111-0 <khxhk> {10} ; 78'
Note, though, that this will always replace the first occurrence, not necessarily the occurrence at the position in the string you specified. In this case I guess that's what you want, but it may not work directly for other similar tasks. (That is, there is no way to use this method as-is to replace the second occurrence or the third occurrence; you can only replace the first, or the first two, or the first three, etc. To replace the second or third occurrence you'd need to do a bit more.)
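If the goal is to change the characters at a given position regardless of what they are, slicing and concatenation can be used instead of replace(). A minimal sketch for Python 3; the helper name reverse_slice is hypothetical, and the indices are computed from the position of "<" rather than hard-coded, which is an assumption about the line layout:

```python
def reverse_slice(text, start, stop):
    """Return text with the characters in text[start:stop] reversed in place."""
    return text[:start] + text[start:stop][::-1] + text[stop:]


go = 'USC_45774-1111-0 <hkxhk> {10} ; 78'
start = go.index("<") + 1  # first character inside the angle brackets

first = reverse_slice(go, start, start + 2)       # reverse only the first pair
second = reverse_slice(go, start + 3, start + 5)  # reverse only the second pair

print(first)   # USC_45774-1111-0 <khxhk> {10} ; 78
print(second)  # USC_45774-1111-0 <hkxkh> {10} ; 78
```

Because strings are immutable, each call builds a new string from the three pieces; the original go is unchanged.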
As for the second part of your question, you are misunderstanding what if "{01}" or "{-1}" in line means. It means, in layman's terms, if "{01}" or if "{-1}" in line. Since if "{01}" is always true (i.e., the string "{01}" is not a false value), the whole condition is always true. What you want is if "{01}" in line or "{-1}" in line.
I don't know what it is about Python, but your problem is one that gets posted here at least a couple times every day.
if "{01}" or "{-1}" in line:
This doesn't do what you think it does. It asks, "is "{01}" true"? Because it's a non-zero-length string, it is. Because or short-circuits, the rest of the condition is not tested because the first argument is true. Therefore the body of your if statement is always executed.
In other words, Python evaluates as if you'd written this:
if ("{01}") or ("{-1}" in line):
You want something like:
if "{01}" in line or "{-1}" in line:
Or if you have a lot of similar conditions:
if any(x in line for x in ("{01}", "{-1}")):
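The difference between the three forms can be checked directly. A short sketch for Python 3, using the sample line from the question (which contains {10}, so neither {01} nor {-1} is present):

```python
line = 'USC_45774-1111-0 <hkxhk> {10} ; 78'

# Parsed as ("{01}") or ("{-1}" in line): a non-empty string is truthy.
buggy = bool("{01}" or "{-1}" in line)

# Each membership test is written out separately.
fixed = "{01}" in line or "{-1}" in line

# Equivalent form that scales to many tags.
with_any = any(tag in line for tag in ("{01}", "{-1}"))

print(buggy, fixed, with_any)  # True False False
```

The buggy condition is true for every line, which is why lines without {01} were being modified as well.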
You can use the count argument of replace():
'USC_45774-1111-0 <hkxhk> {10} ; 78'.replace("hk","kh",1)
For your second question, you need to change the condition to:
if "{01}" in line or "{-1}" in line:
    ...