rdflib's parseQuery decodes the query string, which causes an invalid URI - python

I have the following ttl file:
@prefix : <https://www.example.co/reserved/language#> .
<https://www.example.co/reserved/root> :_id "01G39WKRH76BGY5D3SKDHJP2SX" ;
:transcript%20data [ :_id "01G39WKRH7JYRX78X7FG4RCNYF" ;
:_key "transcript%20data" ;
:value "value" ;
:value_id "01G39WKRH7PVK1DXQHWT08DZA8" ] .
And I have the following query:
q = """
PREFIX : <https://www.example.co/reserved/language#>
SELECT ?o
WHERE { ?s :transcript%20data/:value ?o . }
"""
When I query the graph parsed from the ttl file, I get the following error:
https://www.example.co/reserved/language#transcript data does not look like a valid URI, trying to serialize this will break.
As you can see, parseQuery has decoded the "%20" to a space " ", which produces an invalid URI, so the _is_valid_uri function returns False when the URI is passed to it.
I've tested the query on different SPARQL engines and it is valid and works as expected.
So, what do you advise to make the query valid and get the required results?
I am using rdflib Version: 6.1.1 on macOS Monterey 12.4

This was a bug in rdflib's SPARQL parser, and it is fixed in this PR.
It seems that the internal SPARQL parser function _hexExpand inappropriately expands percent-encoded reserved characters. The PR adds an exclusionary regexp to disable this behaviour and a parameterized test that checks how the SPARQL parser processes the set of percent-encoded reserved chars.
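To make the idea concrete, here is a minimal standalone sketch of selective percent-decoding. It is not rdflib's actual code: the names expand_naive, expand_safe and KEEP_ENCODED are hypothetical, and the set of characters kept encoded is an assumption chosen for illustration only.

```python
import re

# Hypothetical set of characters that must stay percent-encoded. This is an
# assumption for illustration, not the set used by rdflib.
KEEP_ENCODED = set(' <>"{}|\\^`%/?#[]@!$&\'()*+,;=:')

_PERCENT = re.compile(r"%([0-9A-Fa-f]{2})")


def expand_naive(local_name: str) -> str:
    """Decode every %XX escape, mimicking the behaviour described in the question."""
    return _PERCENT.sub(lambda m: chr(int(m.group(1), 16)), local_name)


def expand_safe(local_name: str) -> str:
    """Decode %XX escapes except those that decode to a character in KEEP_ENCODED."""
    def _replace(match):
        char = chr(int(match.group(1), 16))
        # Leave the original %XX text in place for characters we must not expand.
        return match.group(0) if char in KEEP_ENCODED else char

    return _PERCENT.sub(_replace, local_name)
```

With this sketch, expand_naive("transcript%20data") yields a name containing a raw space (the invalid URI from the error message), while expand_safe("transcript%20data") returns the name unchanged.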

Related

"KeyError: rdflib.term.BNode" Error appeared when executing SPARQL query

I'm trying to retrieve all intersection members for a specific class from an .owl ontology using SPARQL. I executed the following SPARQL query:
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
select ?class ?i ?uri ?label where {
?class owl:equivalentClass/
owl:intersectionOf/
rdf:rest*/rdf:first ?i.
?uri rdfs:label ?label.
FILTER (?uri IN (<http://purl.obolibrary.org/obo/AGRO_00000002>) )
}
When I executed it, I got an error:
"KeyError: rdflib.term.BNode('510')"
I'm using Python with the PyCharm IDE. To execute the SPARQL query, I used both rdflib and owlready2. Can you please help me solve the error mentioned above?

Selecting literal values from Wikidata federated query service using RDFLib

I'm trying to get external identifiers for an entity in Wikidata. Using the following query, I can get the literal values (_value) and optionally formatted URLs (value) for Q2409 on the Wikidata Query Service site.
SELECT ?property ?_value ?value
WHERE {
?property wikibase:propertyType wikibase:ExternalId .
?property wikibase:directClaim ?propertyclaim .
OPTIONAL { ?property wdt:P1630 ?formatterURL . }
wd:Q2409 ?propertyclaim ?_value .
BIND(IF(BOUND(?formatterURL), IRI(REPLACE(?formatterURL, "\\$", ?_value)) , ?_value) AS ?value)
}
Using RDFLib, I'm writing the same query, but with a federated service.
from rdflib import Graph
from rdflib.plugins.sparql import prepareQuery
g = Graph()
q = prepareQuery(r"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT ?property ?_value ?value
WHERE {
SERVICE <https://query.wikidata.org/sparql> {
?property wikibase:propertyType wikibase:ExternalId .
?property wikibase:directClaim ?propertyclaim .
OPTIONAL { ?property wdt:P1630 ?formatterURL . }
wd:Q2409 ?propertyclaim ?_value .
BIND(IF(BOUND(?formatterURL), IRI(REPLACE(?formatterURL, "\\$", ?_value)) , ?_value) AS ?value)
}
}
""")
for row in g.query(q, DEBUG=True):
    print(row)
With this, I'm getting the URLs as URIRef objects. But, instead of Literal for the literal values, I'm getting None.
First 6 lines of output:
(rdflib.term.URIRef('http://www.wikidata.org/entity/P232'), None, None)
(rdflib.term.URIRef('http://www.wikidata.org/entity/P657'), None, None)
(rdflib.term.URIRef('http://www.wikidata.org/entity/P6366'), None, None)
(rdflib.term.URIRef('http://www.wikidata.org/entity/P1296'), None, rdflib.term.URIRef('https://www.enciclopedia.cat/EC-GEC-01407541.xml'))
(rdflib.term.URIRef('http://www.wikidata.org/entity/P486'), None, rdflib.term.URIRef('https://id.nlm.nih.gov/mesh/D0068511.html'))
(rdflib.term.URIRef('http://www.wikidata.org/entity/P7033'), None, rdflib.term.URIRef('http://vocabulary.curriculum.edu.au/scot/5001.html'))
What am I missing for the literal values? I'm having trouble figuring out why I'm getting None instead of the values.
I'm not sure if all of the features of SERVICE calls are fully implemented in RDFLib.
I would first get this working with a 'normal' call to the Wikidata SPARQL endpoint, using either RDFLib's SPARQLWrapper library or one of the general-purpose Python web request libraries, requests or httpx. If that all works, you could then try again with the SERVICE request, but you likely won't need it.
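As a rough sketch of such a 'normal' call, the snippet below uses the standard library's urllib in place of requests or httpx so that it has no dependencies. The function names and the User-Agent string are placeholders I made up; the endpoint URL is the one from the question, and the result handling assumes the standard SPARQL JSON results layout.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request that asks the endpoint for SPARQL JSON results."""
    url = WIKIDATA_ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        # Placeholder User-Agent (an assumption); replace it with your own details.
        "User-Agent": "example-script/0.1 (you@example.org)",
    }
    return urllib.request.Request(url, headers=headers)


def bindings_to_rows(results: dict) -> list:
    """Flatten SPARQL JSON results into dicts of variable name -> plain value."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(query: str) -> list:
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(query)) as response:
        return bindings_to_rows(json.load(response))
```

Calling run_query with the inner SELECT query (without the SERVICE wrapper) should show whether ?_value comes back from the endpoint itself, which helps separate a Wikidata issue from an RDFLib SERVICE issue.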

Is there any way I can run a Cypher command starting with :PARAM using py2neo 2021.0.0?

In Neo4j, I am trying to store a SPARQL query in a parameter named sparql so that I can use it in a later query, but graph.run does not allow it and reports a syntax error:
graph.run(":PARAM sparql: 'PREFIX sch: <http://schema.org/> CONSTRUCT{?item a sch:item; sch:legalIdentity ?legalIdentity} WHERE { {?item p:P31/ps:P31 wd:Q783794 optional { ?item wdt:P1278 ?legalIdentity} } UNION {?item p:P31/ps:P31 wd:Q4830453 optional { ?item wdt:P1278 ?legalIdentity}} UNION {?item p:P31/ps:P31 wd:Q43229 optional { ?item wdt:P1278 ?legalIdentity}} UNION {?item p:P31/ps:P31 wd:Q6881511 optional { ?item wdt:P1278 ?legalIdentity}}}'")
The following line is the Cypher query that uses the sparql parameter:
graph.run('CALL n10s.rdf.import.fetch("https://query.wikidata.org/sparql?query=" + apoc.text.urlencode($sparql), "RDF/XML", { headerParams: { Accept: "application/rdf+xml"} });')
The :PARAM command is a client-side browser/shell built-in. It does not exist in Cypher itself. As mentioned by @fbiville, you will need to pass a dict of parameters instead.
You can pass a dictionary of parameters to the run method, as documented here.
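A minimal sketch of passing the parameter dict, assuming graph is a py2neo Graph that is already connected. The helper name import_from_wikidata and the constant IMPORT_CYPHER are hypothetical; the Cypher text is the CALL statement from the question.

```python
# Cypher from the question; $sparql is filled in from the parameter dict.
IMPORT_CYPHER = """
CALL n10s.rdf.import.fetch(
  "https://query.wikidata.org/sparql?query=" + apoc.text.urlencode($sparql),
  "RDF/XML",
  { headerParams: { Accept: "application/rdf+xml" } }
)
"""


def import_from_wikidata(graph, sparql):
    """Run the import, supplying the SPARQL text as the $sparql parameter.

    `graph` is assumed to be a py2neo Graph (anything with a compatible
    run(cypher, parameters=...) method works for this sketch).
    """
    return graph.run(IMPORT_CYPHER, parameters={"sparql": sparql})
```

This replaces the two separate graph.run calls: instead of defining the value with :PARAM first, the SPARQL string is handed to the same run call that uses it.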

SPARQLWrapper : problem in querying an ontology in a local file

I'm working with SPARQLWrapper and I'm following the documentation. Here is my code:
queryString = "SELECT * WHERE { ?s ?p ?o. }"
sparql = SPARQLWrapper("http://example.org/sparql")# I replaced this line with
sparql = SPARQLWrapper("file:///thelocation of my file in my computer")
sparql.setQuery(queryString)
try:
    ret = sparql.query()
    # ret is a stream with the results in XML, see <http://www.w3.org/TR/rdf-sparql-XMLres/>
except:
    deal_with_the_exception()
I'm getting these 2 errors:
1- The system cannot find the path specified
2- NameError: name 'deal_with_the_exception' is not defined
You need a SPARQL endpoint to make it work; SPARQLWrapper cannot query a local file directly. Consider setting up Apache Fuseki on your local computer. See https://jena.apache.org/documentation/fuseki2/jena

SPARQL Join ttl to dbpedia in Python

I know that to run SPARQL statements against a local ttl file I use rdflib, and to run SPARQL statements against DBpedia I use SPARQLWrapper. But how do I do both? That is, suppose I have a local ttl file and I want to leverage some of the online resources available.
So ... suppose I have the following local ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://www.learningsparql.com/ns/demo#i93234>
foaf:nick "Dick" ;
foaf:givenname "Richard" ;
foaf:mbox "richard49@hotmail.com" ;
foaf:surname "Mutt" ;
foaf:workplaceHomepage <http://www.philamuseum.org/> ;
foaf:aimChatID "bridesbachelor" .
Then I create the following python program to execute a SPARQL query and print out more human readable versions of the properties
filename = "C:/DataStuff/SemanticOntology/LearningSPARQLExamples/ex050.ttl"
import rdflib
g = rdflib.Graph()
result = g.parse(filename, format='ttl')
print(result)
query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?propertyLabel ?value
WHERE
{
?s ?property ?value .
?property rdfs:label ?propertyLabel .
}
"""
results = g.query(query)
print('Results!')
for row in results:
    print(row)
Which will return nothing because it isn't accessing dbpedia, and therefore doesn't know what rdfs:label is. I get that. But how do I tell it to?
