Biopython Array Addition Error (Open for all) - python

Okay. Let me explain the things first. I have used a specific module named Biopython in this code. I am explaining the necessary details to solve the problem if you are not accustomed with the module.
The code is:
#!/usr/bin/python
from Bio.PDB.PDBParser import PDBParser
import numpy as np
parser=PDBParser(PERMISSIVE=1)
structure_id="mode_7"
filename="mode_7.pdb"
structure=parser.get_structure(structure_id, filename)
model1=structure[0]
s=(124,3)
newc=np.zeros(s,dtype=np.float32)
coord=[]
#for chain1 in model1.get_list():
# for residue1 in chain1.get_list():
# ca1=residue1["CA"]
# coord1=ca1.get_coord()
# newc.append(coord1)
for i in range(0,29):
model=structure[i]
for chain in model.get_list():
for residue in chain.get_list():
ca=residue["CA"]
coord.append(ca.get_coord())
newc=np.add(newc,coord)
print newc
print "END"
PDB file is the protein data bank file. The file I'm working with can be downloaded from https://drive.google.com/open?id=0B8oUhqYoEX6YVFJBTGlNZGNBdlk
If you remove the hashes from the first for loop, you'll find that get_coord() returns a (124,3) array with dtype float32. Likewise, the next for loop is supposed to return the same.
It gives out a strange error:
Traceback (most recent call last):
File "./average.py", line 27, in <module>
newc=np.add(newc,coord)
ValueError: operands could not be broadcast together with shapes (124,3) (248,3)
I am absolutely clueless how it manages to make a 248,3 array. I just want to add the array coord over itself. I tried with another modification of the code:
#!/usr/bin/python
from Bio.PDB.PDBParser import PDBParser
import numpy as np
parser=PDBParser(PERMISSIVE=1)
structure_id="mode_7"
filename="mode_7.pdb"
structure=parser.get_structure(structure_id, filename)
model1=structure[0]
s=(124,3)
newc=np.zeros(s,dtype=np.float32)
coord=[]
newc2=[]
#for chain1 in model1.get_list():
# for residue1 in chain1.get_list():
# ca1=residue1["CA"]
# coord1=ca1.get_coord()
# newc.append(coord1)
for i in range(0,29):
model=structure[i]
for chain in model.get_list():
for residue in chain.get_list():
ca=residue["CA"]
coord.append(ca.get_coord())
newc2=np.add(newc,coord)
print newc
print "END"
It gives out the same error. Can you help???

I'm not sure I fully understand what you're doing, but it looks like you need to reset the coords list at the start of every iteration:
for i in range(0,29):
coords = []
model=structure[i]
for chain in model.get_list():
for residue in chain.get_list():
ca=residue["CA"]
coord.append(ca.get_coord())
newc=np.add(newc,coord)
If you keep appending without clearing the list you add 124 items to coords at every iteration of the outer loop. The exception you see is likely raised during the second iteration.

Related

I am trying to get inverse of a matrix from a file in python, but the format is in string and even after changing it into float I am unable to do it

I am trying to write a code which reads this file and gives the inverse of square root of each term in a matrix. This is the file I am using:
1.659999999999999963e-04
3.970000000000000005e-04
-8.014499999999999402e-02
-2.274299999999999933e-02
-7.559999999999999880e-03
-3.156229999999999869e-01
5.650100000000000261e-02
2.350100000000000106e-02
-4.383999999999999876e-03
-4.878299999999999997e-02
1.207599999999999993e-02
-5.254199999999999843e-02
1.123500000000000019e-02
1.614240000000000119e-01
1.954900000000000040e-02
-2.614100000000000104e-02
1.534899999999999980e-02
5.446000000000000320e-03
-6.210299999999999848e-02
-9.615000000000000283e-03
1.687800000000000064e-02
6.460999999999999729e-03
-9.490999999999999437e-03
1.676700000000000065e-02
-2.308000000000000156e-03
-1.412399999999999940e-02
8.978899999999999382e-02
1.848960000000000048e-01
5.956000000000000356e-03
-5.592300000000000049e-02
1.114599999999999966e-02
-5.689600000000000213e-02
-6.731000000000000004e-03
2.572999999999999940e-02
1.512000000000000106e-03
-3.237999999999999993e-03
-4.068999999999999700e-03
-1.234000000000000071e-03
2.378109999999999946e-01
-1.128000000000000096e-03
-3.534999999999999948e-03
-4.550000000000000008e-04
1.479999999999999925e-04
5.220000000000000031e-04
3.718099999999999877e-02
1.104580000000000006e-01
1.965000000000000167e-03
4.266999999999999960e-03
-5.140999999999999737e-03
1.648640000000000105e-01
1.776220000000000021e-01
1.922000000000000097e-03
3.250600000000000017e-02
4.402899999999999869e-02
-8.430999999999999259e-03
4.409999999999999858e-04
1.389999999999999905e-04
1.374209999999999876e-01
-2.431860000000000133e-01
-1.727000000000000019e-03
-2.280000000000000126e-04
8.100000000000000375e-05
-7.480999999999999803e-03
8.000000000000000654e-05
-3.939999999999999817e-04
1.441000000000000007e-03
-7.290000000000000473e-04
-3.663000000000000284e-02
-1.657999999999999969e-03
-8.369999999999999619e-04
-6.904999999999999680e-03
1.593100000000000072e-02
-3.393000000000000183e-03
1.495999999999999934e-03
-7.368999999999999682e-03
1.436199999999999977e-02
-1.319700000000000040e-02
-4.557000000000000287e-03
8.123700000000000365e-02
2.447399999999999923e-02
-1.295199999999999997e-02
-8.722100000000000686e-02
-5.232999999999999804e-03
-1.255940000000000112e-01
1.291999999999999963e-03
-1.382999999999999898e-03
4.989999999999999644e-03
1.508000000000000009e-03
2.304399999999999851e-02
2.819400000000000031e-02
3.119999999999999944e-04
-8.781999999999999876e-03
6.794500000000000539e-02
6.198999999999999649e-03
-2.058879999999999877e-01
9.219999999999999680e-04
-1.618800000000000100e-02
-3.415860000000000007e-01
-1.660999999999999933e-03
-4.889999999999999599e-04
1.759999999999999954e-04
-3.763999999999999985e-03
-6.566600000000000215e-02
-7.680000000000000195e-04
-1.231799999999999978e-01
7.047999999999999578e-03
1.425000000000000051e-02
-2.900799999999999906e-02
1.187499999999999944e-01
-1.449199999999999933e-01
-1.106999999999999911e-03
-1.557999999999999923e-03
-2.236999999999999839e-03
-7.270699999999999386e-02
-5.140000000000000254e-04
2.246999999999999865e-03
-1.778949999999999976e-01
1.669599999999999904e-02
-1.277799999999999943e-02
-2.379040000000000044e-01
-2.207999999999999893e-03
1.925000000000000062e-03
7.750100000000000044e-02
-5.004100000000000215e-02
1.704999999999999918e-03
3.272400000000000309e-02
1.957499999999999865e-02
-1.514620000000000133e-01
-3.288999999999999823e-03
-3.605699999999999877e-02
3.648999999999999900e-03
3.459799999999999681e-02
-1.859999999999999945e-04
3.300000000000000253e-05
3.000000000000000076e-06
9.999999999999999547e-07
4.800000000000000122e-05
1.361999999999999929e-03
-6.057300000000000184e-02
5.689999999999999529e-04
-5.000000000000000409e-06
2.984699999999999853e-02
6.999999999999999387e-05
4.600000000000000004e-05
-1.294499999999999991e-02
-2.318000000000000182e-03
-0.000000000000000000e+00
1.858200000000000129e-01
-5.969959999999999711e-01
6.000000000000000152e-06
The code I have tried to write is this:
from ast import Num
from cmath import sqrt
import math
import numpy as np
from numpy import append
f1= open('diagonal.txt', 'r')
k=f1.readlines()
#def mkstr(s):
# str1=" "
# return (str1.join(s))
#print(float(mkstr(k)))
#for j in f1:
#inverse_root(j)
#def mkstr2():
# f1=open("diagonal.txt", "r")
# k=f1.readlines()
# fl=[float(item) for item in k]
# inverse_root(fl)
#for j in mkstr2():
#inverse_root(j)
f=[float(x) for x in k]
sa=np.array(f)
ca=sa.astype(np.float)
def inverse_root(j):
return [1/(sqrt(j))]
j=[inverse_root(ca)]
I am getting the output like this:
DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ca=sa.astype(np.float)
Traceback (most recent call last):
File "/home/gian-2018/prateek/python/read_file/27/5/rmwpd.py", line 34, in <module>
j=[inverse_root(ca)]
File "/home/gian-2018/prateek/python/read_file/27/5/rmwpd.py", line 33, in inverse_root
return [1/(sqrt(j))]
TypeError: only length-1 arrays can be converted to Python scalars
So to get inverse of sqrt for each term from the matrix file: I tried to convert the list of these strings into float list, but even after I get error for trying to get list of these numbers into the formula of inverse sqrt instead of using a real number.
Now I am trying to create the inverse formula suitable for an array which means creating a new array which converts each term in the previous matrix into their inverse sqrt and saves the value into the new array.
By any way possible please either tell me how to convert this list into real numbers or convert each term of this matrix into its inverse sqrt and save it into new matrix.
Or just tell me how to implement this task in any way possible if you have understood it.

TypeError: 'method' object cannot be interpreted as an integer

This should be quick and easy, I'm just trying to convert some VBA into python and I believe I don't understand how loops work here.
Basically I'm trying to count howmany series there are in a chart, then iterate through those series with for iseries in range(1, nseries):
And I end up getting the following error:
Traceback (most recent call last): File "xx.py", line 10, in
for iseries in range(1, nseries): TypeError: 'method' object cannot be interpreted as an integer
Full script below. The print statements is my attempt at trying to see if the loop worked correctly and counted the correct amount of series/points. That also seems to not work as nothing has been printed, so maybe this is the issue?:
from pptx import Presentation
prs = Presentation('Test.pptx')
for slide in prs.slides:
for shape in slide.shapes:
if not shape.has_chart:
continue
nseries = shape.chart.series.count
print('Series number:', nseries)
for iseries in range(1, nseries):
series = shape.chart.series(iseries)
npoint = series.points.count
print('Point number:', npoint)
prs.save('test3.pptx')
This is likely because count is a function, not an attribute. Try changing the line to:
nseries = shape.chart.series.count()
A better way to loop through the series, however, would be to do it directly instead of using an index:
for series in shape.chart.series:
# do something with series

I need to FFT wav files?

I saw this question and answer about using fft on wav files and tried to implement it like this:
import matplotlib.pyplot as plt
from scipy.io import wavfile # get the api
from scipy.fftpack import fft
from pylab import *
import sys
def f(filename):
fs, data = wavfile.read(filename) # load the data
a = data.T[0] # this is a two channel soundtrack, I get the first track
b=[(ele/2**8.)*2-1 for ele in a] # this is 8-bit track, b is now normalized on [-1,1)
c = fft(b) # create a list of complex number
d = len(c)/2 # you only need half of the fft list
plt.plot(abs(c[:(d-1)]),'r')
savefig(filename+'.png',bbox_inches='tight')
files = sys.argv[1:]
for ele in files:
f(ele)
quit()
But whenever I call it:
$ python fft.py 0.0/4515-11057-0058.flac.wav-16000.wav
I get the error:
Traceback (most recent call last):
File "fft.py", line 18, in <module>
f(ele)
File "fft.py", line 10, in f
b=[(ele/2**8.)*2-1 for ele in a] # this is 8-bit track, b is now normalized on [-1,1)
TypeError: 'numpy.int16' object is not iterable
How can I create a script that generates frequency distributions for each file in the list of arguments?
Your error message states that you are trying to iterate over an integer (a). When you define a via
a = data.T[0]
you grab the first value of data.T. Since your data files are single channel, you are taking the first value of the first channel (an integer). Changing this to
a = data.T
will fix your problem.

Python appending a list in a for loop with numpy array data

I am writing a program that will append a list with a single element pulled from a 2 dimensional numpy array. So far, I have:
# For loop to get correlation data of selected (x,y) pixel for all bands
zdata = []
for n in d.bands:
cor_xy = np.array(d.bands[n])
zdata.append(cor_xy[y,x])
Every time I run my program, I get the following error:
Traceback (most recent call last):
File "/home/sdelgadi/scr/plot_pixel_data.py", line 36, in <module>
cor_xy = np.array(d.bands[n])
TypeError: only integer arrays with one element can be converted to an index
My method works when I try it from the python interpreter without using a loop, i.e.
>>> zdata = []
>>> a = np.array(d.bands[0])
>>> zdata.append(a[y,x])
>>> a = np.array(d.bands[1])
>>> zdata.append(a[y,x])
>>> print(zdata)
[0.59056658, 0.58640128]
What is different about creating a for loop and doing this manually, and how can I get my loop to stop causing errors?
You're treating n as if it's an index into d.bands when it's an element of d.bands
zdata = []
for n in d.bands:
cor_xy = np.array(n)
zdata.append(cor_xy[y,x])
You say a = np.array(d.bands[0]) works. The first n should be exactly the same thing as d.bands[0]. If so then np.array(n) is all you need.

receiving "ValueError: setting an array element with a sequence."

I've been having some problems with this code, trying to end up with an inner product of two 1-D arrays. The code of interest looks like this:
def find_percents(i):
percents=[]
median=1.5/(6+2*int(i/12))
b=2*median
m=b/(7+2*int(i/12))
for j in xrange (1,6+2*int(i/12)):
percents.append(float((b-m*j)))
percentlist=numpy.asarray(percents, dtype=float)
#print percentlist
total=sum(percentlist)
return total, percentlist
def playerlister(i):
players=[]
for i in xrange(i+1,i+6+2*int(i/12)):
position=sheet.cell(i,2)
points=sheet.cell(i,24)
if re.findall('RB', str(position.value)):
vbd=points.value-rbs[24]
players.append(vbd)
else:
pass
playerlist=numpy.asarray(players, dtype=float)
return playerlist
def others(i,percentlist,playerlist,total):
alternatives=[]
playerlist=playerlister(i)
percentlist=find_percents(i)
players=numpy.dot(playerlist,percentlist)
I am receiving the following error in response to the very last line of this attached code:
ValueError: setting an array element with a sequence.
In most other examples of this error, I have found the error to be because of incorrect data types in the arrays percentlist and playerlist, but mine should be float type. If it helps at all, I call these functions a little later in the program, like so:
for i in xrange(1,30):
total, percentlist= find_percents(i)
playerlist= playerlister(i)
print type(playerlist[i])
draft_score= others(i,percentlist,playerlist,total)
Can anyone help me figure out why I am setting an array element with a sequence? Please let me know if any more information might be helpful! Also for clarity, the playerlister is making use of the xlrd module to extract data from a spreadsheet, but the data are numerical and testing has shown that that both lists have a type of numpy.float64.
The shape and contents of each of these for one iteration of i is
<type 'numpy.float64'>
(5,)
[ 73.7 -94.4 140.9 44.8 130.9]
(5,)
[ 0.42857143 0.35714286 0.28571429 0.21428571 0.14285714]
Your function find_percents returns a two-element tuple.
When you call it in others, you are binding that tuple to the variable named percentlist, which you then try to use in a dot-product.
My guess is that by writing this in others it is fixed:
def others(i,percentlist,playerlist,total):
playerlist = playerlister(i)
_, percentlist = find_percents(i)
players = numpy.dot(playerlist,percentlist)
provided of course playerlist and percentlist always have the same number of elements (which we can't check because of the missing spreadsheet).
To verify, the following gives you the exact error message and the minimum of code needed to reproduce it:
>>> import numpy as np
>>> a = np.arange(5)
>>> np.dot(a, (2, a))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.

Categories