I am trying to use Tanimoto similarity to compare molecular fingerprints using RDKit. I am trying to compare the two items in list 1 with the one item in list 2. However, I am getting an error. I do not understand it because I do not have anything named "Mol" in my code. Does anyone have any advice? Thank you
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator
from rdkit.Chem import DataStructs
mol1 = ('CCO', 'CCOO')
mol2 = ('CC')
fii = Chem.MolFromSmiles(mol2)
fpgen1 = rdFingerprintGenerator.GetMorganGenerator(radius=2)
fps1 = [fpgen1.GetFingerprint(m) for m in fii]
for m in mol1:
    fi = Chem.MolFromSmiles(m)
    fpgen2 = rdFingerprintGenerator.GetMorganGenerator(radius=2)
    fps2 = [fpgen2.GetFingerprint(m) for m in fi]
    for x in fsp2:
        t = DataStructs.TanimotoSimilarity(fps1, fps2(x))
        print(t)
ERROR:
fps1 = [fpgen1.GetFingerprint(m) for m in fii]
TypeError: 'Mol' object is not iterable
The Mol object is the name of the RDKit class that Chem.MolFromSmiles returns; it is not one of your variable names.
The error says that a Mol object is not iterable because it is a single molecule, not a list of molecules.
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator
from rdkit.Chem import DataStructs
smiles1 = ('CCO', 'CCOO')
smiles2 = ('CC',)
mols1 = [Chem.MolFromSmiles(smi) for smi in smiles1]
mols2 = [Chem.MolFromSmiles(smi) for smi in smiles2]
# you only need to instantiate the generator once, you can use it for both lists
fpgen = rdFingerprintGenerator.GetMorganGenerator(radius=2)
fps1 = [fpgen.GetFingerprint(m) for m in mols1]
fps2 = [fpgen.GetFingerprint(m) for m in mols2]
# if you only care about the single entry in fps2 you can just index it
for n, fp in enumerate(fps1):
    t = DataStructs.TanimotoSimilarity(fp, fps2[0])
    print(n, t)
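For intuition, the Tanimoto similarity that DataStructs.TanimotoSimilarity computes on two bit-vector fingerprints is just the number of shared "on" bits divided by the total number of "on" bits. A minimal pure-Python sketch of the formula (not RDKit code), using sets of bit indices:

```python
def tanimoto(bits_a, bits_b):
    # Tanimoto (Jaccard) coefficient: |A & B| / |A | B|
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # two empty fingerprints are conventionally identical
    return len(a & b) / len(union)

print(tanimoto({1, 4, 7}, {1, 4, 9}))  # 2 shared bits out of 4 total -> 0.5
```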
I am new to ABAQUS scripting and I am trying to calculate micromotion using COPEN, CSLIP1 and CSLIP2. I came up with the code below:
from abaqusConstants import *
from odbAccess import *
from odbMaterial import *
from odbSection import *
from math import *
from copy import deepcopy
from caeModules import *
from driverUtils import executeOnCaeStartup
from numpy import fabs as fabs
import numpy as np
from types import IntType
odb = session.openOdb(name='E:\PDP02.odb', readOnly=FALSE)
odb = session.odbs['E:\PDP02.odb']
print odb.rootAssembly.instances.keys()
grout_instance = odb.rootAssembly.instances['PROX-1#PROXIMAL-1']
keys = odb.steps.keys()
for key in keys:
    step = odb.steps[key]
    for frame in step.frames:
        print frame.description
        Copen = frame.fieldOutputs['COPEN']
        Cslip1 = frame.fieldOutputs['CSLIP1']
        Cslip2 = frame.fieldOutputs['CSLIP2']
        Micromotion = sqrt(power(Copen, 2) + power(Cslip1, 2) + power(Cslip2, 2))
        # Micromotion = sqrt(power(Cslip2, 2))
        # float(Micromotion)
        frame.FieldOutput(name='Micromotion', description='Average Micromotion', field=Micromotion)
odb.update()
odb.save()
After executing the code, I get the following error message: "OdiError: Expression evaluates to an overflow or underflow". Please help me understand this error message and how to rectify it. I am happy to provide the .inp and .odb files for reference and verification.
Simply put, overflow and underflow happen when we assign a value that is outside the range of the variable's declared data type. If the (absolute) value is too big, we call it overflow; if it is too small, we call it underflow.
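A small NumPy sketch of the same idea, using hypothetical float32 values (not taken from your .odb):

```python
import numpy as np

# float32 overflows above roughly 3.4e38 and underflows below roughly 1.4e-45
big = np.float32(3.0e38)
with np.errstate(over='ignore'):
    overflowed = big * np.float32(2.0)  # out of float32 range -> inf (overflow)

tiny = np.float32(1e-45)
with np.errstate(under='ignore'):
    underflowed = tiny * np.float32(1e-5)  # too small to represent -> 0.0 (underflow)

print(overflowed, underflowed)  # inf 0.0
```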
import networkx as nx
import matplotlib as plt
def plot_deg_dist(G):
    all_degrees = nx.degree(G).values()
    unique_degrees = list(set(all_degrees))
    count_of_degrees = []
    for i in unique_degrees:
        x = all_degrees.count(i)
        count_of_degrees.append(x)
    plt.pyplot.plot(unique_degrees, count_of_degrees)
    plt.pyplot.show()

G = nx.read_pajek('karate.paj')
plot_deg_dist(G)
Output:
Traceback (most recent call last):
File "datasets.py", line 19, in
plot_deg_dist(G)
File "datasets.py", line 5, in plot_deg_dist
all_degrees = nx.degree(G).values()
AttributeError: 'MultiDegreeView' object has no attribute 'values'
Instead of nx.degree(G).values(), according to the docs you can use:
list(G.degree(list(G.nodes)))
Try converting it to a dictionary first:
all_degrees = dict(nx.degree(G)).values()
Try with a list comprehension:
import matplotlib.pyplot as plt

def plot_deg_dist(G):
    all_degrees = [degree for node, degree in G.degree()]
    unique_degrees = list(set(all_degrees))
    count_of_degrees = []
    for degree in unique_degrees:
        count = all_degrees.count(degree)
        count_of_degrees.append(count)
    plt.plot(unique_degrees, count_of_degrees, 'ro-')
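As a side note, the unique-then-count loop above can be collapsed with collections.Counter, which maps each degree to its frequency directly. A sketch on a hand-made degree list, so no networkx is needed:

```python
from collections import Counter

all_degrees = [1, 2, 2, 3, 2, 1]  # e.g. the output of the comprehension above
degree_counts = Counter(all_degrees)
unique_degrees = sorted(degree_counts)                          # [1, 2, 3]
count_of_degrees = [degree_counts[d] for d in unique_degrees]   # [2, 3, 1]
```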
I saw this question and answer about using fft on wav files and tried to implement it like this:
import matplotlib.pyplot as plt
from scipy.io import wavfile # get the api
from scipy.fftpack import fft
from pylab import *
import sys
def f(filename):
    fs, data = wavfile.read(filename) # load the data
    a = data.T[0] # this is a two channel soundtrack, I get the first track
    b = [(ele/2**8.)*2-1 for ele in a] # this is 8-bit track, b is now normalized on [-1,1)
    c = fft(b) # create a list of complex number
    d = len(c)/2 # you only need half of the fft list
    plt.plot(abs(c[:(d-1)]), 'r')
    savefig(filename+'.png', bbox_inches='tight')

files = sys.argv[1:]
for ele in files:
    f(ele)
quit()
But whenever I call it:
$ python fft.py 0.0/4515-11057-0058.flac.wav-16000.wav
I get the error:
Traceback (most recent call last):
File "fft.py", line 18, in <module>
f(ele)
File "fft.py", line 10, in f
b=[(ele/2**8.)*2-1 for ele in a] # this is 8-bit track, b is now normalized on [-1,1)
TypeError: 'numpy.int16' object is not iterable
How can I create a script that generates frequency distributions for each file in the list of arguments?
Your error message shows that you are trying to iterate over a single integer (a). When you define a via
a = data.T[0]
you grab the first value of data.T. Since your data file is single channel, you are taking the first sample of the only channel (an integer). Changing this to
a = data.T
will fix your problem.
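A more defensive sketch of that fix, handling both mono and stereo arrays explicitly (the normalisation assumes 8-bit samples, as in the original comment):

```python
import numpy as np

def first_track(data):
    # mono WAVs load as a 1-D array; stereo as (n_samples, n_channels)
    if data.ndim == 1:
        return data
    return data[:, 0]

def normalize_8bit(track):
    # map 8-bit samples [0, 255] onto [-1, 1)
    return (np.asarray(track) / 2**8.) * 2 - 1

mono = np.array([0, 128, 255], dtype=np.int16)
stereo = np.array([[0, 1], [128, 129]], dtype=np.int16)
print(first_track(mono).tolist())    # [0, 128, 255]
print(first_track(stereo).tolist())  # [0, 128]
```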
I have the following parent class and method:
import SubImage
import numpy as np
from scipy import misc
import random
class Image():
    # Class constructor
    def __init__(self):
        self.__image = np.empty(0)
        self.__rows = 0
        self.__cols = 0
        self.__rows_pixels = 0
        self.__cols_pixels = 0
        self.__rows_quotient = 0.0
        self.__cols_quotient = 0.0
        self.__create_image()
        self.__subimages = np.empty((self.__rows, self.__cols))

    def __create_subimages(self):
        i = 0
        j = 0
        while i != self.__rows_quotient * self.__rows:
            print(i + j)
            sub_image = SubImage(self.__image[i:i + self.__rows_quotient, j:j + self.__cols_quotient], i + j)
            if j == self.__cols_quotient * (self.__cols - 1):
                j = 0
                i += self.__rows_quotient
            else:
                j += self.__cols_quotient
And the following subclass which is supposed to be a child from the class above:
import Image
class SubImage(Image):
    def __init__(self, image, position):
        self.__position = position
        self.__image = image
My problem is that when creating a SubImage instance in the __create_subimages method I get the following error:
File "/home/mitolete/PycharmProjects/myprojectSubImage.py", line 3, in <module>
class SubImage(Image):
TypeError: Error when calling the metaclass bases
module.__init__() takes at most 2 arguments (3 given)
I don't get why it says I'm giving 3 arguments; I'm giving 2, which are the subimage (a numpy array) and an integer.
Why is this?
Regards and thanks.
Your main problem is the way you import Image and SubImage into each other.
SubImage should be imported this way:
from myprojectSubImage import SubImage
Image should be imported this way:
from FILENAME import Image
That being said, the mutual import seems like bad practice. You should probably either merge the Image and SubImage files, or move the create_subimages function to another file.
If you're importing SubImage from another file, i.e. a module, you will have to reference that in the import. In this case, assuming SubImage is in a file called SubImage.py, the import should be
from SubImage import SubImage
so that SubImage now refers to the class SubImage in SubImage.py. This is also the case for Image in Image.py.
However, I don't think there's a need to do this given how closely related the two classes are. I'd put them in the same file and avoid the circular import.
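A minimal sketch of the single-file layout (hypothetical module name images.py, trimmed to just the inheritance skeleton):

```python
# images.py -- both classes live in one module, so no circular import is needed
import numpy as np

class Image:
    def __init__(self):
        self._image = np.empty(0)

    def create_subimages(self):
        # SubImage is defined later in this same file, which is fine:
        # the name is only looked up when this method is actually called
        return [SubImage(self._image, 0)]

class SubImage(Image):
    def __init__(self, image, position):
        self._image = image
        self._position = position

subs = Image().create_subimages()
print(type(subs[0]).__name__)  # SubImage
```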
I'm unable to make the following code work, though I don't see this error when working purely in R.
from rpy2.robjects.packages import importr
from rpy2 import robjects
import numpy as np
forecast = importr('forecast')
ts = robjects.r['ts']
y = np.random.randn(50)
X = np.random.randn(50)
y = ts(robjects.FloatVector(y), start=robjects.IntVector((2004, 1)), frequency=12)
X = ts(robjects.FloatVector(X), start=robjects.IntVector((2004, 1)), frequency=12)
forecast.Arima(y, xreg=X, order=robjects.IntVector((1, 0, 0)))
It's especially confusing considering the following code works fine
forecast.auto_arima(y, xreg=X)
I see the following traceback no matter what I give for X, using numpy interface or not. Any ideas?
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-20-b781220efb93> in <module>()
13 X = ts(robjects.FloatVector(X), start=robjects.IntVector((2004, 1)), frequency=12)
14
---> 15 forecast.Arima(y, xreg=X, order=robjects.IntVector((1, 0, 0)))
/home/skipper/.local/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
84 v = kwargs.pop(k)
85 kwargs[r_k] = v
---> 86 return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
/home/skipper/.local/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
33 for k, v in kwargs.iteritems():
34 new_kwargs[k] = conversion.py2ri(v)
---> 35 res = super(Function, self).__call__(*new_args, **new_kwargs)
36 res = conversion.ri2py(res)
37 return res
RRuntimeError: Error in `colnames<-`(`*tmp*`, value = if (ncol(xreg) == 1) nmxreg else paste(nmxreg, :
length of 'dimnames' [2] not equal to array extent
Edit:
The problem is that the following lines of code do not evaluate to a column name, which seems to be the expectation on the R side.
sub = robjects.r['substitute']
deparse = robjects.r['deparse']
deparse(sub(X))
I don't know well enough what the expectations of this code should be in R, but I can't find an RPy2 object that passes this check by returning something of length == 1. This really looks like a bug to me.
R> length(deparse(substitute((rep(.2, 1000)))))
[1] 1
But in Rpy2
[~/]
[94]: robjects.r.length(robjects.r.deparse(robjects.r.substitute(robjects.r('rep(.2, 1000)'))))
[94]:
<IntVector - Python:0x7ce1560 / R:0x80adc28>
[ 78]
This is one manifestation (see this other related issue for example) of the same underlying issue: R expressions are evaluated lazily and can be manipulated within R and this leads to idioms that do not translate well (in Python expression are evaluated immediately, and one has to move to the AST to manipulate code).
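To illustrate the contrast: Python has no equivalent of R's lazy substitute(), so to treat code as data you go through the ast module explicitly. A minimal sketch, unrelated to rpy2's internals (requires Python 3.9+ for ast.unparse):

```python
import ast

# parse the source text into a syntax tree instead of evaluating it
tree = ast.parse("rep(.2, 1000)", mode="eval")

# the tree can be inspected or rewritten, then turned back into source
print(ast.unparse(tree))  # rep(0.2, 1000)
```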
An answer to the second part of your question: in R, substitute(rep(.2, 1000)) is passing the unevaluated expression rep(.2, 1000) to substitute(). Doing in rpy2
substitute('rep(.2, 1000)')
is passing a string; the R equivalent would be
substitute("rep(.2, 1000)")
The following gets you close to R's deparse(substitute()):
from rpy2.robjects.packages import importr
base = importr('base')
from rpy2 import rinterface
# expression
e = rinterface.parse('rep(.2, 1000)')
dse = base.deparse(base.substitute(e))
>>> len(dse)
1
>>> print(dse) # not identical to R
"expression(rep(0.2, 1000))"
Currently, one way to work around this is to bind R objects to R symbols (preferably in a dedicated environment rather than in the GlobalEnv), and use the symbols in an R call written as a string:
from rpy2.robjects import Environment, reval
env = Environment()
for k, v in (('y', y), ('xreg', X), ('order', robjects.IntVector((1, 0, 0)))):
    env[k] = v
# make an expression (note: the R name is forecast::Arima, and the
# xreg argument must reference the symbol bound in the environment)
expr = rinterface.parse("forecast::Arima(y, xreg=xreg, order=order)")
# evaluate in the environment
res = reval(expr, envir=env)
This is not something I am happy about as a solution, but I have never found the time to work on a better solution.
edit: With rpy2-2.4.0 it becomes possible to use R symbols and do the following:
RSymbol = robjects.rinterface.SexpSymbol
pairlist = (('x', RSymbol('y')),
            ('xreg', RSymbol('xreg')),
            ('order', RSymbol('order')))
res = forecast.Arima.rcall(pairlist, env)
This is not yet the most intuitive interface. May be something using a context manager would be better.
There is a way to simply pass your variables to R without substitution and return the results back to Python. You can find a simple example here: https://stackoverflow.com/a/55900840/5350311 . I guess it is clearer what you are passing to R and what you will get back in return, especially if you are working with for loops and a large number of variables.