Python-Weka-Wrapper3 removing attributes from arff file error - python

I have an ARFF file and I need to remove the first 5 attributes from it (without manually deleting them). I tried to use Python-Weka-Wrapper3, which exposes Weka's filtering options, but I get an error with the following code:
import weka.filters as Filter
remove = Filter(classname="weka.filters.unsupervised.attribute.Remove", options=["-R", "1,2,3,4,5"])
The error that I receive is the following:
Traceback (most recent call last):
File "/home/user/Desktop/file_loading.py", line 16, in <module>
removing = Filter(classname="weka.filters.unsupervised.attribute.Remove", options=["-R", "last"])
TypeError: 'module' object is not callable
What could be the reason for this error? I would also appreciate it if anyone knows an alternative way to remove attributes from an ARFF file using Python.

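You are calling the module object instead of the class. The statement import weka.filters as Filter binds the name Filter to the weka.filters module itself, not to the Filter class defined inside it.
Try using: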
from weka.filters import Filter
remove = Filter(classname="weka.filters.unsupervised.attribute.Remove", options=["-R", "1,2,3,4,5"])
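The same mistake can be reproduced with the standard library alone, which may make the distinction easier to see. The sketch below is only an analogy: it uses the datetime module in place of weka.filters, and the alias Datetime is chosen purely to mirror the import in the question.

```python
# Analogy using the standard library: the datetime module contains a class
# that is also named datetime, much like weka.filters contains Filter.
import datetime as Datetime  # binds the name to the MODULE, not the class

try:
    Datetime(2020, 1, 1)  # calling a module raises TypeError
except TypeError as exc:
    module_call_error = str(exc)

from datetime import datetime  # binds the name to the CLASS

new_year = datetime(2020, 1, 1)  # calling the class works

print(module_call_error)
print(new_year.year)
```

The first print shows the same kind of "not callable" message as in the question; the second shows that the class imported with from ... import can be called normally.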

Related

Python loadarff fails for string attributes

I am trying to load an arff file using Python's 'loadarff' function from scipy.io.arff. The file has string attributes and it is giving the following error.
>>> data,meta = arff.loadarff(fpath)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/home/eex608/conda3_envs/PyT3/lib/python3.6/site-packages/scipy/io/arff/arffread.py", line 805, in loadarff
return _loadarff(ofile)
File "/data/home/eex608/conda3_envs/PyT3/lib/python3.6/site-packages/scipy/io/arff/arffread.py", line 838, in _loadarff
raise NotImplementedError("String attributes not supported yet, sorry")
NotImplementedError: String attributes not supported yet, sorry
How can I read the ARFF file successfully?
SciPy's loadarff converts the contents of an ARFF file into a NumPy array, and it does not support string attributes.
As of 2020, you can use the liac-arff package instead.
import arff
# Use a context manager so the file is closed after loading.
with open('your_document.arff', 'r') as f:
    data = arff.load(f)
However, make sure your ARFF document does not contain inline comments after meaningful text on the same line.
That is, avoid lines like this:
#ATTRIBUTE class {F,A,L,LF,MN,O,PE,SC,SE,US,FT,PO} %Check and make sure that FT and PO should be there
Delete the comment or move it to its own line.
I hit this problem in one document, and it took some time to figure out what was wrong.
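If the file is large, trailing comments can be stripped programmatically before parsing. The following is a minimal sketch under stated assumptions: declaration lines begin with @ (as in standard ARFF), a % outside single or double quotes on such a line starts a comment, and lines are passed without their trailing newline. The helper name strip_inline_comment is made up for this example and is not part of liac-arff.

```python
def strip_inline_comment(line):
    """Return a declaration line without its trailing '%' comment.

    Hypothetical helper, not part of liac-arff. Only lines starting with
    '@' are touched; a '%' inside quotes is left alone. Escaped quotes
    inside quoted names are not handled in this sketch.
    """
    if not line.lstrip().startswith("@"):
        return line  # data rows and full-line comments are left unchanged
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(line):
        if quote:
            if ch == quote:
                quote = None  # closing quote found
        elif ch in "'\"":
            quote = ch  # opening quote found
        elif ch == "%":
            return line[:i].rstrip()  # drop the comment and trailing spaces
    return line


header = "@ATTRIBUTE class {F,A,L} %Check the values"
print(strip_inline_comment(header))
```

Applying this to every line of the file (for example via str.splitlines()) and writing the result to a new file keeps the original untouched in case the assumption about % does not hold for your data.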

Why does a 'WriteOnlyWorksheet' object have no attribute 'cell'?

import openpyxl
wb=openpyxl.Workbook("multiplication.xlsx")
wb.create_sheet()
sheet=wb.get_active_sheet()
sheet.cell(column=6, row=4).value= 5
wb.save("multiplication.xlsx")
When I try to write to the cell, I receive this error.
Traceback (most recent call last):
File "/Users/bjg/Desktop/excel2.py", line 8, in <module>
sheet.cell(column=6, row=4).value= 5
AttributeError: 'WriteOnlyWorksheet' object has no attribute 'cell'
I was wondering if anybody knew why this was the case?
From the write-only mode docs:
In a write-only workbook, rows can only be added with append(). It is not possible to write (or read) cells at arbitrary locations with cell() or iter_rows().
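The first positional parameter of Workbook() is write_only, so passing the filename string there (a truthy value) puts the workbook into write-only mode. Instead of doing: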
wb=openpyxl.Workbook("multiplication.xlsx")
just do:
wb=openpyxl.Workbook()
then save at the end with:
wb.save("multiplication.xlsx")

Cannot perform operations using rpy2 in Python: "TypeError: argument 1 must be a str, not int"

So I'm trying to get to grips with using the rpy2 module (I am familiar with R but new to Python). Following this tutorial, I first load the library and assign it to the variable 'r' using:
import rpy2
import rpy2.robjects as robjects
r = robjects.r
then I try to perform a simple operation to confirm everything is working:
print(r[2+2])
but I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python34\lib\site-packages\rpy2\robjects\__init__.py", line 248, in __getitem__
res = _globalenv.get(item)
TypeError: argument 1 must be str, not int
I'm sure it's just something stupid I'm doing wrong, but any advice would be much appreciated. I'm using python3.4.2 (64bit), rpy2-2.5.6 (64bit) on a Windows 7 machine (64bit).
You should use print(r(2+2)) instead of print(r[2+2]).
When you use r[2+2] you are asking r for the item whose key is 4 (the result of 2+2). As the traceback shows, r's item lookup expects a string name, not an integer.
OK, I think I have figured it out. For R to evaluate the expression inside the parentheses, the expression must be passed as a quoted string, e.g.
r("2+2")
This is what was confusing me because this looks like I'm providing a string.
Oddly, I can't print the result (4) using:
print(r("2+2"))
as this raises:
Traceback (most recent call last):
File "<pyshell#31>", line 1, in <module>
print(r("2+2"))
File "C:\Python34\lib\site-packages\rpy2\robjects\robject.py", line 49, in __str__
s = str.join(os.linesep, s)
TypeError: sequence item 0: expected str instance, bytes found
Instead I just print the result using:
answer = r("2+2")
answer[0]
Because R is vector-based, the result is a one-element vector, so you have to index its first position; otherwise you get:
answer = r("2+2")
answer
<FloatVector - Python:0x0000000005836EC8 / R:0x00000000047A51A0>
[4.000000]
Thanks for your help
Hefin
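The difference between r[...] and r(...) comes down to which special method Python invokes. The toy class below is a stand-in for illustration only (it is not rpy2, and the name Probe is made up). It simply records what it receives, which shows that Python computes 2+2 itself before the object sees anything, unless the expression is quoted.

```python
class Probe:
    """Toy stand-in (not rpy2): reports how it was accessed and with what."""

    def __getitem__(self, key):
        # Triggered by square brackets: probe[...]
        return ("getitem", key)

    def __call__(self, arg):
        # Triggered by parentheses: probe(...)
        return ("call", arg)


probe = Probe()

print(probe[2 + 2])   # square brackets: Python passes the integer 4 as a key
print(probe(2 + 2))   # parentheses: still the integer 4, computed by Python
print(probe("2+2"))   # quoted: the text "2+2" arrives unevaluated
```

With rpy2, the quoted form is what lets R, rather than Python, evaluate the expression.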

Python package: Bioservices, error using UniChem() command

I was following the tutorial on the webpage:
http://pythonhosted.org/bioservices/compound_tutorial.html
Everything worked well until I reached the following command:
uni = UniChem()
and then I received the error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "P:\Anaconda\lib\site-packages\bioservices\unichem.py", line 84, in __init__
maxid_service = int(self.get_all_src_ids()[-1]['src_id'])
TypeError: list indices must be integers, not str
As a minimum working example:
from bioservices import *
uni = UniChem()
and then I receive the error. I understand the error (for the most part), but I don't know how to fix it. So my question is: how do I fix the function or work around it?
The overall aim is to map a list of 1000 drug names (and hopefully more in the near future) to ChEMBL IDs.
The error you saw is probably because the UniChem service was down for maintenance, or took too long to respond, when you tried to connect. As a result the service was never initialized, hence the error message you got.
I've just tried (bioservices 1.2.6)
from bioservices import *
uni = UniChem()
and it worked. The following request also worked:
>>> mapping = uni.get_mapping("kegg_ligand", "chembl")
'CHEMBL278315'

Problems using PIL 1.1.7

The trivial
import Image
im = Image.OPEN('C:\abc.bmp')
results in the following exception
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
im = Image.OPEN('C:\Documents and Settings\umair.ahmed\My Documents\My Pictures\avanza.bmp')
TypeError: 'dict' object is not callable
Not sure if I am missing something; kindly help.
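Use:
Image.open()
Python names are case-sensitive. Image.OPEN is a dictionary that PIL uses internally to register file-format openers, which is why calling it raises 'dict' object is not callable.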
I don't think the error message came from the code you posted, because the file names are different. Also, you should not pass 'C:\abc.bmp' to open(); use either 'C:/abc.bmp' or r'C:\abc.bmp', because the backslash is an escape character in Python string literals.
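The effect of the escape character is easy to check without PIL. In this short sketch the variable names are only illustrative; it compares the three ways of writing the path from the question.

```python
plain = 'C:\abc.bmp'    # '\a' is read as one character (the ASCII bell)
raw = r'C:\abc.bmp'     # raw string: the backslash is kept literally
forward = 'C:/abc.bmp'  # forward slashes need no escaping

print(len(plain), len(raw))  # the plain literal is one character shorter
print('\a' in plain)         # the bell character ended up in the path
print('\\' in raw)           # the raw string still contains a backslash
```

The plain literal silently turns into a different path, which is why the raw-string or forward-slash form is the safer choice.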
