Biopython SeqIO: AttributeError: 'str' object has no attribute 'id' - python

I am trying to filter out sequences using SeqIO but I am getting this error.
Traceback (most recent call last):
File "paralog_warning_filter.py", line 61, in <module>
.
.
.
SeqIO.write(desired_proteins, "filtered.fasta","fasta")
AttributeError: 'str' object has no attribute 'id'
I checked other similar questions but still couldn't understand what is wrong with my script.
Here is the relevant part of the script I am trying:
fh = open('lineageV_paralog_warning_genes.fasta')
for s_record in SeqIO.parse(fh, 'fasta'):
    name = s_record.id
    seq = s_record.seq
    for i in paralogs_in_all:
        if name.endswith(i):
            desired_proteins = seq
            output_file = SeqIO.write(desired_proteins, "filtered.fasta", "fasta")
            output_file
fh.close()
I have a separate paralogs_in_all list, which is the source of the IDs. When I print name, I get proper ID strings in this format: >coronopifolia_tair_real-AT2G35040.1#10.
Can you help me understand my problem? Thanks in advance.

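SeqIO.write expects an iterable of SeqRecord objects, but desired_proteins is a single Seq; iterating over it yields one-character strings, which have no id attribute. Try this and let us know (I can't test your code):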
from Bio.SeqRecord import SeqRecord
from Bio import SeqIO
......
.......
desired_proteins = []
with open('lineageV_paralog_warning_genes.fasta') as fh:
    for s_record in SeqIO.parse(fh, 'fasta'):
        name = s_record.id
        seq = s_record.seq  # already a Seq object, so no Seq(seq) wrapper is needed
        for i in paralogs_in_all:
            if name.endswith(i):
                # description="" removes the <unknown description> that would otherwise be written
                desired_proteins.append(SeqRecord(seq, id=name, description=""))
                break  # stop after the first matching suffix so a record is not added twice
# Write once, after the loop: SeqIO.write overwrites the file on every call
output_file = SeqIO.write(desired_proteins, "filtered.fasta", "fasta")

Related

Getting "TypeError: 'NoneType' object is not subscriptable" with pattern.search

I was working on a Python program that writes out the IP addresses from an error log file. The code is done, but whenever I run it on my error log I get:
TypeError: 'NoneType' object is not subscriptable
My code is below in case anyone wants to see exactly what I did. If anyone can spot my mistake, that would be great.
import re
with open(r'C:\Users\Admin\Desktop/Test error file.txt') as fh:
    file = fh.readlines()
pattern = re.compile(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})')
lst = []
for line in file:
    lst.append(pattern.search(line)[0])
print(lst)
pattern.search returns None if no match is found. To account for this, try:
for line in file:
    match = pattern.search(line)
    if match is not None:
        lst.append(match[0])
    else:
        lst.append(None)
Here I appended a None value if there was no match. You can change the logic to do what you need if there is no match.
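If the goal is only a list of the addresses that were found, one option is to skip non-matching lines instead of storing None. The following is a minimal sketch of that variant; the sample_lines list and the extract_ips helper are hypothetical names introduced for illustration, and the log lines are made up rather than taken from the original file.

```python
import re

# Hypothetical sample lines standing in for the error log
sample_lines = [
    "[error] client 192.168.0.12 denied by server configuration",
    "[notice] caught SIGTERM, shutting down",
    "[error] client 10.0.0.7 file does not exist",
]

pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def extract_ips(lines):
    """Return the first IP-like token of every line that contains one."""
    ips = []
    for line in lines:
        match = pattern.search(line)
        if match:  # search() gave a match object, not None
            ips.append(match[0])
    return ips


ips = extract_ips(sample_lines)
print(ips)
```

Lines without an address simply contribute nothing, so the result can be written out without any further None checks.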

Get value from a dictionary into a JSON file

I need to get all the bodyHtml and authorId values from the file that appears here: https://drive.google.com/file/d/10EGOAWsw3G5-ETUryYX7__JPOfNwUsL6/view?usp=sharing
I have tried several approaches, but I always get the error TypeError: list indices must be integers, not str.
This is my latest code:
# -*- coding: utf-8 -*-
import json
import requests
import datetime
data = json.loads(open('file.json').read())
coments = data['headDocument']['content']['id']
for comment in data['headDocument']['content']['content']['bodyHtml']:
    info = comment
    print(info)
and get this error:
Traceback (most recent call last):
File "coments.py", line 16, in <module>
for comment in data['headDocument']['content']['content']['bodyHtml']:
TypeError: list indices must be integers, not str
Can anyone help with this problem?
data['headDocument']['content'] is a list, so you cannot index it with a string key; you need to loop over its items, like this:
for item in data['headDocument']['content']:
    print(item['content']['bodyHtml'])
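The same loop can collect both fields the question asks for. This is only a sketch: the JSON below is a hypothetical stand-in for file.json, and it assumes authorId sits next to bodyHtml inside each item's content object, which the real file may or may not do. The collect_fields helper is likewise an illustrative name.

```python
import json

# Hypothetical stand-in for file.json; the real file's layout may differ
raw = '''
{"headDocument": {"content": [
    {"content": {"bodyHtml": "<p>first</p>", "authorId": "u1"}},
    {"content": {"bodyHtml": "<p>second</p>"}}
]}}
'''
data = json.loads(raw)


def collect_fields(document, keys=("bodyHtml", "authorId")):
    """Collect the requested keys from every item of headDocument.content."""
    rows = []
    for item in document["headDocument"]["content"]:
        inner = item.get("content", {})
        # .get() returns None for a missing key instead of raising KeyError
        rows.append({key: inner.get(key) for key in keys})
    return rows


rows = collect_fields(data)
for row in rows:
    print(row)
```

Using .get() keeps the loop running when an item lacks one of the keys, which is common in exported comment data.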

Regexpr in python

for printJobString in logfile:
    userRegex = re.search('(\suser:\s)(.+?)(\sprinter:\s)', printJobString)
    if userRegex:
        userString = userRegex.group(2)
        pagesInt = int(re.search('(\spages:\s)(.+?)(\scode:\s)', printJobString).group(2))
Above is my code. When I run this program I get:
Traceback (most recent call last):
File "C:\Users\brandon\Desktop\project3\project3\pages.py", line 45, in <module>
log2hist("log") # version 2.
File "C:\Users\brandon\Desktop\project3\project3\pages.py", line 29, in log2hist
pagesInt = int(re.search('(\spages:\s)(.+?)(\scode:\s)', printJobString).group(2))
AttributeError: 'NoneType' object has no attribute 'group'
I know this error means the search is returning None, but I'm not sure how to handle this case. Any help would be appreciated; I'm very new to Python and still learning the basics.
I am writing a program that should print out the number of pages each user has printed.
180.186.109.129 code: k n h user: luis printer: core 2 pages: 32
is a sample input line. My Python file should create a data file that has one line for each user, containing the total number of pages that user printed.
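This happens because your regular expression does not match anything, so re.search returns None.
In your sample line, code: comes before pages: and nothing follows the page count, so the pattern '(\spages:\s)(.+?)(\scode:\s)' can never match it.
Use an if statement to check that the result is not None before you call group: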
for printJobString in logfile:
    userRegex = re.search(r'(\suser:\s)(.+?)(\sprinter:\s)', printJobString)
    if userRegex:
        userString = userRegex.group(2)
        pagesMatch = re.search(r'(\spages:\s)(.+?)(\scode:\s)', printJobString)
        if pagesMatch:
            pagesInt = int(pagesMatch.group(2))
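In the sample line from the question, the page count is the last field and code: appears near the start, so a pattern that expects code: after pages: will not match it. Below is a minimal sketch of the per-user total, assuming every log line follows the layout of that sample ("user: <name> printer: <anything> pages: <digits>"). The pages_per_user helper and all lines except the first are hypothetical.

```python
import re
from collections import defaultdict

# First line is the sample from the question; the others are hypothetical
logfile = [
    "180.186.109.129 code: k n h user: luis printer: core 2 pages: 32",
    "180.186.109.130 code: a b c user: ana printer: core 1 pages: 5",
    "180.186.109.129 code: k n h user: luis printer: core 2 pages: 8",
    "a malformed line without the expected fields",
]

# Assumed layout: "user: <name> printer: <anything> pages: <digits>"
line_pattern = re.compile(r'\suser:\s(.+?)\sprinter:\s.*\spages:\s(\d+)')


def pages_per_user(lines):
    """Sum the page counts per user, skipping lines that do not match."""
    totals = defaultdict(int)
    for line in lines:
        match = line_pattern.search(line)
        if match is None:
            continue  # nothing to extract from this line
        user, pages = match.group(1), int(match.group(2))
        totals[user] += pages
    return dict(totals)


totals = pages_per_user(logfile)
print(totals)
```

Matching the user and the page count with a single pattern means there is only one None check per line, and a defaultdict removes the need to test whether a user has been seen before.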

python list substring

I am trying to read the variables from newreg.py (e.g. state, district, dcode, and so on: a long list that in turn picks up data from a web form) into insertNew.py.
I have currently read the whole file into a list named 'lines'. How do I now filter each variable (state, district, etc.; roughly 50-55 variables) out of the list 'lines'? The list also contains HTML code, because I read the whole web page into it.
Is there a better, more efficient way to do it?
Once I can read each variable, I need to concatenate the values (converted to strings) and insert them into MongoDB.
Lastly, once the data has been inserted into the DB, the 'home.py' page opens.
I am giving these details so that the complete picture is available for any proposed solution. I hope I have kept it both simple and complete.
I want to loop over the list (sample below) and extract the variable names (the part before the '=' sign). The following is in 'newreg.py':
state = form.getvalue('state','ERROR')
district = form.getvalue('district','ERROR')
dcode = form.getvalue('Dcode','ERROR')
I read the file into a list:
fp = open('/home/dev/wsgi-scripts/newreg.py','r')
lines = fp.readlines()
so that I can create a dictionary to insert into MongoDB, e.g.
info = {'state' : state , 'district' : district, . . . . }
Here each entry is {key : value}, where the value is the corresponding variable from the list above.
Thanks
But I get an error when I run
print getattr(newreg, 'state')
The error is:
>>> print getattr(newreg, 'state')
Traceback (most recent call last):
File "<stdin>", line 1, in module
AttributeError: 'module' object has no attribute 'state'
I also tried
>>> print newreg.state
Traceback (most recent call last):
File "<stdin>", line 1, in module
AttributeError: 'module' object has no attribute 'state'
This is how I added the module
>>> import os,sys
>>> sys.path.append('/home/dev/wsgi-scripts/')
>>> import newreg
>>> newreg_vars = dir(newreg)
>>> print newreg_vars
['Connection', 'Handler', '__builtins__', '__doc__', '__file__', '__name__',
'__package__', 'application', 'cgi', 'datetime', 'os', 'sys', 'time']
Handler in the list above is a class defined in the following file:
#!/usr/bin/env python
import os, sys
import cgi
from pymongo import Connection
import datetime
import time
class Handler:
    def do(self, environ, start_response):
        form = cgi.FieldStorage(fp=environ['wsgi.input'],
                                environ=environ)
        state = form.getvalue('state','<font color="#FF0000">ERROR</font>')
        district = form.getvalue('district','<font color="#FF0000">ERROR</font>')
        dcode = form.getvalue('Dcode','<font color="#FF0000">ERROR</font>')
I am assuming you want to copy the variables from one Python module to another at runtime.
import newreg
newreg_vars = dir(newreg)
print newreg_vars
will print all of the attributes of the module "newreg".
To read the variables from the module:
print getattr(newreg, 'state')
print getattr(newreg, 'district')
print getattr(newreg, 'dcode')
or if you know the names of the attributes:
print newreg.state
print newreg.district
print newreg.dcode
To change the attributes into strings, use a list comprehension (or a generator):
newreg_strings = [str(item) for item in newreg_vars]
This will save you lots of effort, as you will not have to parse "newreg" as a text file with re.
As a side note: Type conversion is not concatenation (although concatenation may involve type conversion in some other programming languages).
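The AttributeError in the question follows from where the names are assigned: state, district and dcode are local variables inside Handler.do, so they exist only while that method runs and never become attributes of the newreg module. One possible way around this is to build the dictionary inside the handler, where the form values are available. The sketch below is written for Python 3 and makes several assumptions: FIELD_NAMES and build_info are hypothetical names, a plain dict stands in for the parsed web form, and the values are placeholders.

```python
def do(form):
    # 'state' is local to do(); it disappears when the function returns
    state = form.get("state", "ERROR")
    return state


do({"state": "S1"})
# The local never becomes a module-level name, so getattr(module, 'state') fails
state_is_module_level = "state" in globals()

# Hypothetical list of form field names; the real form has about 50 of them
FIELD_NAMES = ("state", "district", "Dcode")


def build_info(form, field_names=FIELD_NAMES, default="ERROR"):
    """Build the dictionary to insert, with one string value per form field."""
    return {name: str(form.get(name, default)) for name in field_names}


# A plain dict stands in for the parsed web form in this sketch
form = {"state": "S1", "district": "D1"}
info = build_info(form)
print(state_is_module_level)
print(info)
```

With cgi.FieldStorage, form.getvalue(name, default) would take the place of form.get here. Driving the lookup from a list of field names avoids both parsing newreg.py as text and maintaining 50 separate assignments.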

renderContents in beautifulsoup (python)

The code I'm trying to get working is:
h = str(heading)
# '<h1>Heading</h1>'
heading.renderContents()
I get this error:
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
print h.renderContents()
AttributeError: 'str' object has no attribute 'renderContents'
Any ideas?
I have a string with HTML tags that I need to clean. If there is a different way of doing that, please suggest it.
Your error message and your code sample don't line up. You say you're calling:
heading.renderContents()
But your error message says you're calling:
print h.renderContents()
Which suggests that perhaps you have a bug in your code, trying to call renderContents() on a string object that doesn't define that method.
In any case, it would help if you checked what type of object heading is to make sure it's really a BeautifulSoup instance. This works for me with BeautifulSoup 3.2.0:
from BeautifulSoup import BeautifulSoup
heading = BeautifulSoup('<h1>heading</h1>')
repr(heading)
# '<h1>heading</h1>'
print heading.renderContents()
# <h1>heading</h1>
print str(heading)
# <h1>heading</h1>
h = str(heading)
print h
# <h1>heading</h1>
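Since the question also asks for a different way to clean a string that contains HTML tags, here is a minimal sketch that uses only the standard library's html.parser instead of BeautifulSoup. It is written for Python 3 (the thread itself uses Python 2), and TagStripper and strip_tags are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are ignored
        self.parts.append(data)


def strip_tags(html_text):
    """Return html_text with all tags removed."""
    stripper = TagStripper()
    stripper.feed(html_text)
    stripper.close()  # flush any text still buffered by the parser
    return "".join(stripper.parts)


cleaned = strip_tags("<h1>Heading</h1>")
print(cleaned)
```

This works directly on a str, so it sidesteps the original error of calling a BeautifulSoup method on a string. It only drops the tags; it does not sanitize untrusted HTML.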
