How to solve "ECitMatch() got multiple values for argument 'bdata'"? - python

I am new to the bioservices Python package. I want to use it to retrieve the PMIDs for two citations from the information given below. This is the code I have tried:
from bioservices import EUtils
s = EUtils()
print(s.ECitMatch("pubmed",retmode="xml", bdata="proc+natl+acad+sci+u+s+a|1991|88|3248|mann+bj|Art1|%0Dscience|1987|235|182|palmenberg+ac|Art2|"))
But it raises an error:
"TypeError: ECitMatch() got multiple values for argument 'bdata'".
Could anyone help me to solve that problem?

I think the issue is the unnamed argument ("pubmed"). If you look at the source code, you can see that the first parameter is bdata. Called the way you do it, "pubmed" is bound to bdata positionally and then bdata is passed again by keyword, so Python receives two values for the same parameter, hence the error.
You can reproduce it with this minimal example:
def dummy(a, b):
    return a, b

dummy(10, a=3)
will return
TypeError: dummy() got multiple values for argument 'a'
If you remove "pubmed", the error disappears. However, the output is still incomplete:
from bioservices import EUtils
s = EUtils()
print(s.ECitMatch("proc+natl+acad+sci+u+s+a|1991|88|3248|mann+bj|Art1|%0Dscience|1987|235|182|palmenberg+ac|Art2|"))
returns
'proc+natl+acad+sci+u+s+a|1991|88|3248|mann+bj|Art1|2014248\n'
so only the first publication is taken into account. You can get the results for both by using the correct carriage return character \r:
print(s.ECitMatch(bdata="proc+natl+acad+sci+u+s+a|1991|88|3248|mann+bj|Art1|\rscience|1987|235|182|palmenberg+ac|Art2|"))
will return
proc+natl+acad+sci+u+s+a|1991|88|3248|mann+bj|Art1|2014248
science|1987|235|182|palmenberg+ac|Art2|3026048
I think you have to specify neither retmode nor the database (pubmed); if you look at the source code I linked above, you can see:
query = "ecitmatch.cgi?db=pubmed&retmode=xml"
so it seems that it always uses pubmed and xml.
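If you have several citations, it may be easier to assemble the bdata string in code than to type it by hand. The helper below is only a sketch: build_bdata is a hypothetical name (it is not part of bioservices), and it assumes the field order shown in the question (journal, year, volume, first page, author, key), with records separated by \r.

```python
def build_bdata(citations):
    """Join citation tuples into one ECitMatch-style query string.

    Hypothetical helper, not part of bioservices. Each citation is a tuple of
    (journal, year, volume, first_page, author, key), as in the question.
    """
    records = []
    for citation in citations:
        # Spaces inside a field are written as '+', as in the question.
        fields = [str(field).strip().replace(" ", "+") for field in citation]
        # Each record ends with a trailing '|'.
        records.append("|".join(fields) + "|")
    # Records are separated by a carriage return.
    return "\r".join(records)


citations = [
    ("proc natl acad sci u s a", 1991, 88, 3248, "mann bj", "Art1"),
    ("science", 1987, 235, 182, "palmenberg ac", "Art2"),
]
bdata = build_bdata(citations)
print(repr(bdata))
```

The resulting string can then be passed as the single argument, for example s.ECitMatch(bdata).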

Two issues here: a syntax problem and a bug.
The correct syntax is:
from bioservices import EUtils
s = EUtils()
query = "proc+natl+acad+sci+u+s+a|1991|88|3248|mann+bj|Art1|%0Dscience|1987|235|182|palmenberg+ac|Art2|"
print(s.ECitMatch(query))
Indeed, the underlying service behind ECitMatch has only one database (pubmed) and one format (xml); hence, those two parameters are not available: they are hard-coded. Therefore, only one argument is required: your query.
As for the second issue, as pointed out above and reported on the bioservices issues page, your query would return only one publication. This was an issue with the special character %0D (used in place of a carriage return) not being interpreted correctly by the URL request. The carriage return character (either \n, \r or %0d) is now taken into account in the latest version on GitHub, or in version 1.7.5 from the PyPI website.
Thanks to willigot for filing the issue on the bioservices page and bringing it to my attention.
Disclaimer: I'm the main author of bioservices.

Related

Return PLS-00306 During login in with python

I am working on a crawler in Python to grab some data from a company-internal website. But when I posted all the data, it returned PLS-00306 wrong number or type of arguments in call to PM_USER_LOGIN_SP
ORA-06550: line 1, column 7
PL/SQL: Statement ignored
I checked my Firefox inspector again and again, and all my request data were right. When I removed or changed some of the request data, it returned a different error code.
Can someone help me figure out what the problem is?
Oracle procedure PM_USER_LOGIN_SP has one or more parameters, each of them having its own data type. When calling that procedure, you must match number and data type of each of them.
For example, if it expects 3 parameters, you can't pass only 2 (or 4) of them; that is the "wrong number of arguments" part of the message.
If parameter #1 is a DATE, you can't pass the letter A to it (because of the wrong type). Note that DATEs are kind of "special": something that looks like a date to us humans (such as 20.01.2018, which is today) must really be a date when it is passed to an Oracle procedure's DATE parameter. '20.01.2018' is a string, so either pass a date literal, such as DATE '2018-01-20', or use the appropriate function with a format mask: TO_DATE('20.01.2018', 'dd.mm.yyyy').
Therefore, have a look at the procedure first, pay attention to what it expects. Then check what you pass to it.

Python 2.7 replace all instances of NULL / NONE in complex JSON object

I have the following code..
.... rest api call >> response
rsp = response.json()
print json2html.convert(rsp)
which results in the following
error: Can't convert NULL!
I therefore started looking into ways to replace all None / null values in my JSON response, but I'm having trouble because the JSON returned from the API is complex and nested many levels deep, and I don't know where the nulls will actually appear.
From what I can tell I need to iterate over the dictionary objects recursively, check for any values that are None, and rebuild the object with those values replaced, but I don't really know where to start since dictionary objects are immutable..
If you look at json2html's source it seems like you have a different problem - and the error message is not helping.
Try to use it like this:
print json2html.convert(json=rsp)
By the way, because I've already contributed to that project a bit, I've opened the following PR in response to this question: https://github.com/softvar/json2html/pull/20
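Separately from the json2html fix, the recursive replacement described in the question can be sketched as follows. Python dictionaries are in fact mutable, but building a new structure is simple and leaves the original response untouched. The function name replace_none and the default placeholder "" are illustrative choices, not part of json2html.

```python
def replace_none(value, placeholder=""):
    """Return a copy of a decoded JSON value with every None replaced."""
    if value is None:
        return placeholder
    if isinstance(value, dict):
        # Rebuild the dictionary, recursing into each value.
        return {key: replace_none(item, placeholder) for key, item in value.items()}
    if isinstance(value, list):
        # Rebuild the list, recursing into each element.
        return [replace_none(item, placeholder) for item in value]
    # Strings, numbers and booleans are returned unchanged.
    return value


sample = {"name": None, "tags": [1, None, {"inner": None}]}
cleaned = replace_none(sample)
print(cleaned)
```

This works on Python 2.7 as well as Python 3, since it only relies on dict and list comprehensions.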

Issue with "ValueError: Single '}' encountered in format string" in an API Call

I'm using Python 3 for this.
Basically, I'm making an API call using urllib and getting the error:
"ValueError: Single '}' encountered in format string"
I have looked at a variety of other solutions but they don't seem to work.
Basically what I am doing is:
import urllib.request
import urllib.parse
def query_person(first, last):
    person_request = urllib.request.urlopen('http://api.querysite.com/content/search/index:AUTHOR?query=authlast%28%27{last}}%27%29%20AND%20authfirst%28%27{first}}%27%29&'.format(first=first,last=last))
    return(person_request)

print(query_person("John", "Doe"))
The actual API call will not be reproducible, since it requires an API key (omitted, obviously) and you need to be on a verified network.
I think the issue has to do with "{last}}%27%29%20AND%20authfirst%28%27{first}}" having an extra bracket. For example, if I wanted to just query it in my url bar without python or .format(), it would look like:
http://api.querysite.com/content/search/index:AUTHOR?query=authlast%28%27Doe}%27%29%20AND%20authfirst%28%27John}%27%29&
or more specifically: Doe}%27%29%20AND%20authfirst%28%27John}%27%29&
If I use the latter method in python, I have no issues, but it does not allow me to input names to query of course.
You need to double up on your single brace if you want it to remain in the string:
For example:
'{first}}}'.format(first='John') == 'John}'
In your case:
person_request = urllib.request.urlopen('http://api.querysite.com/content/search/index:AUTHOR?query=authlast%28%27{last}}}%27%29%20AND%20authfirst%28%27{first}}}%27%29&'.format(first=first,last=last))
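To check the escaping without touching the network, you can build the URL string on its own and inspect it. This is a small sketch: build_query_url is a hypothetical helper name, and the URL is the placeholder one from the question.

```python
def build_query_url(first, last):
    """Build the query URL; '}}' in the template yields one literal '}'."""
    template = (
        "http://api.querysite.com/content/search/index:AUTHOR"
        "?query=authlast%28%27{last}}}%27%29%20AND%20authfirst%28%27{first}}}%27%29&"
    )
    return template.format(first=first, last=last)


url = build_query_url("John", "Doe")
print(url)

# The original template, with a single unescaped '}', fails in str.format:
try:
    "{last}}%27".format(last="Doe")
except ValueError as error:
    message = str(error)
    print(message)
```

Once the string looks right, it can be passed to urllib.request.urlopen as before.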

Finding documentation on names returned by dir()

If you run the following code:
from flask import Flask
import unittest
dir(Flask(__name__).test_client())
The following is output to terminal:
There are a number of names returned that I cannot find documentation on (all of the names that are not surrounded by double underscores).
I have found indirect reference to post here (if you search for 'self.app.post' you'll see it referenced). Note: this link describes using .post with the following keywords: data and follow_redirects. It does not mention that you can also use the keywords content_type and headers. Perhaps the only reason that these keyword options are not intuitively obvious to me is because I'm new to this...
Does anyone know where documentation on these names resides? (I can't find it in flask/python/unittest documentation anywhere - perhaps I am looking in the wrong place?)
edit: with the help of the answers, I found this documentation.
For any Python module, class, or method (all of these are objects in Python), you can view the docstring with:
>>> a_module.__doc__
>>> a_class.__doc__
>>> a_method.__doc__
To see more detailed documentation, you can use the help() function:
>>> help(a_method)
You can always check the docstring of the method - comments that developers left when they wrote the code. You can check any object or method you need. For example:
Flask.__doc__
unittest.__doc__
dir.__doc__
dir.__doc__.__doc__
You can also query
Flask(__name__).test_client().post.__doc__
Flask(__name__).test_client().preserve_context.__doc__
But you'll notice that not every method is documented. For example:
Flask(__name__).test_client().open.__doc__
For more about this you can also see http://legacy.python.org/dev/peps/pep-0257/
Using help() gives you the same information but formatted, e.g.:
help(Flask)
help(unittest)
help(dir)
help(dir.__doc__)
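To get an overview of every public name that dir() returns, rather than querying them one by one, the two ideas above can be combined in a loop. This is a sketch: summarize_public_names is a hypothetical helper, and unittest.TestCase is used as a stand-in so the example runs without Flask; the same call should work on Flask(__name__).test_client().

```python
import inspect
import unittest


def summarize_public_names(obj):
    """Map each public name from dir(obj) to the first line of its docstring."""
    summary = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and double-underscore names
        doc = inspect.getdoc(getattr(obj, name))
        # Undocumented attributes are recorded as None.
        summary[name] = doc.splitlines()[0] if doc else None
    return summary


summary = summarize_public_names(unittest.TestCase)
for name, first_line in sorted(summary.items()):
    print("{}: {}".format(name, first_line))
```

Names that map to None are the ones without a docstring; for those, reading the source is usually the next step.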

BioPython Pubmed Eutils url?

I'm trying to run some queries against Pubmed's Eutils service. If I run them on the website I get a certain number of records returned, in this case 13126 (link to pubmed).
A while ago I bodged together a python script to build a query to do much the same thing, and the resultant url returns the same number of hits (link to Eutils result).
Of course, not having any formal programming background, it was all a bit kludgy, so I'm trying to do the same thing using Biopython. I think the following code should do the same thing, but it returns a greater number of hits, 23303.
from Bio import Entrez
Entrez.email = "A.N.Other@example.com"
handle = Entrez.esearch(db="pubmed", term="stem+cell[All Fields]",datetype="pdat", mindate="2012", maxdate="2012")
record = Entrez.read(handle)
print(record["Count"])
I'm fairly sure it's just down to some subtlety in how the url is being generated, but I can't work out how to see what url is being generated by Biopython. Can anyone give me some pointers?
Thanks!
EDIT:
It's something to do with how the url is being generated, as I can get back the original number of hits by modifying the code to include double quotes around the search term, thus:
handle = Entrez.esearch(db='pubmed', term='"stem+cell"[ALL]', datetype='pdat', mindate='2012', maxdate='2012')
I'm still interested in knowing what URL is being generated by Biopython, as it'll help me work out how I have to structure the search term when I want to do more complicated searches.
handle = Entrez.esearch(db="pubmed", term="stem+cell[All Fields]",datetype="pdat", mindate="2012", maxdate="2012")
print(handle.url)
You've solved this already (Entrez likes explicit double quoting round combined search terms), but currently the URL generated is not exposed via the API. The simplest trick would be to edit the Bio/Entrez/__init__.py file to add a print statement inside the _open function.
Update: Recent versions of Biopython now save the URL as an attribute of the returned handle, i.e. in this example try doing print(handle.url)
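If you want a rough idea of how the search term is likely to be encoded before making a request, you can URL-encode the same parameters yourself. This is only a sketch and not Biopython's own code: it assumes the parameters go through standard URL encoding, and preview_esearch_url is a hypothetical helper, so handle.url remains the authoritative answer.

```python
from urllib.parse import urlencode

# Public E-utilities esearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def preview_esearch_url(**params):
    """Return an esearch URL with the parameters percent-encoded."""
    return ESEARCH_URL + "?" + urlencode(params)


url = preview_esearch_url(
    db="pubmed",
    term="stem+cell[All Fields]",
    datetype="pdat",
    mindate="2012",
    maxdate="2012",
)
print(url)
```

Note that under this encoding a literal + in the term becomes %2B, while a space becomes +, so "stem+cell" and "stem cell" produce different URLs. That may be worth keeping in mind when comparing against a hand-built URL.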
