How to make a sparql query with unicode letters? - python

I am querying the French DBpedia (http://fr.dbpedia.org/) with SPARQL.
I am using Python and SPARQLWrapper, if that makes any difference.
This first query works fine:
PREFIX dbpp:<http://dbpedia.org/property/>
PREFIX dbpo:<http://dbpedia.org/ontology/>
PREFIX dbpr:<http://dbpedia.org/resource/>
SELECT ?wt ?summary ?source_url
WHERE {
?wt rdfs:label "Concerto"@fr .
OPTIONAL { ?wt dbpedia-owl:abstract ?summary . }
OPTIONAL { ?wt foaf:isPrimaryTopicOf ?source_url . }
filter (lang(?summary) = "fr" )
}
This second query doesn't work:
PREFIX dbpp:<http://dbpedia.org/property/>
PREFIX dbpo:<http://dbpedia.org/ontology/>
PREFIX dbpr:<http://dbpedia.org/resource/>
SELECT ?wt ?summary ?source_url
WHERE {
?wt rdfs:label "Opéra"@fr .
OPTIONAL { ?wt dbpedia-owl:abstract ?summary . }
OPTIONAL { ?wt foaf:isPrimaryTopicOf ?source_url . }
filter (lang(?summary) = "fr" )
}
The only difference is the value of the label. The page http://fr.dbpedia.org/page/Opéra exists in DBpedia and its rdfs:label is "Opéra".
I think the query fails because it contains the French letter é. I've tried several escapes (Op%C3%A9re, Op\u0233ra, Op\xe9ra) without any success.
Any idea?

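The problem is not the é. The FILTER sits outside the OPTIONAL block, so it applies to every solution. <http://fr.dbpedia.org/resource/Opéra> has no dbpedia-owl:abstract, so ?summary is unbound for it, lang(?summary) cannot be evaluated, and the filter discards that row. Moving the FILTER inside the OPTIONAL: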
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX dbpo: <http://dbpedia.org/ontology/>
PREFIX dbpr: <http://dbpedia.org/resource/>
SELECT ?wt ?summary ?source_url
WHERE {
?wt rdfs:label "Opéra"@fr .
OPTIONAL { ?wt dbpedia-owl:abstract ?summary .
filter (lang(?summary) = "fr" )
}
OPTIONAL { ?wt foaf:isPrimaryTopicOf ?source_url . }
}
... works (and returns <http://fr.dbpedia.org/resource/Catégorie:Opéra> as well).

Related

Inserting python variable in SPARQL

I have a string variable I want to pass in my SPARQL query and I can't get it to work.
title = 'Good Will Hunting'
[str(s) for s, in graph.query('''
PREFIX ddis: <http://ddis.ch/atai/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
SELECT ?lbl WHERE {
?movie rdfs:label $title@en .
?movie wdt:P57 ?director .
?director rdfs:label ?lbl .
}
''')]
It doesn't work and I get an error. The query itself is correct: it works if I manually type the name in place of $title.
String interpolation in Python can be done with the %s placeholder (for string variables):
title = 'Good Will Hunting'
[str(s) for s, in graph.query('''
PREFIX ddis: <http://ddis.ch/atai/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
SELECT ?lbl WHERE {
?movie rdfs:label "%s"@en .
?movie wdt:P57 ?director .
?director rdfs:label ?lbl .
}
''' % title)]
Note that I also added quotes ("%s"), which are necessary for specifying a string in SPARQL.
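Plain %s interpolation breaks as soon as the value itself contains a double quote or a backslash. The following is a minimal sketch of one way to guard against that, not part of the original answer: sparql_string_literal is a hypothetical helper name, and the escaping covers only the characters that the SPARQL grammar forbids inside a double-quoted literal.

```python
def sparql_string_literal(text, lang=None):
    """Return text as a double-quoted SPARQL literal, optionally language-tagged.

    Hypothetical helper: escapes backslash, double quote, newline and
    carriage return, which may not appear raw inside "..." in SPARQL.
    """
    escaped = (
        text.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    literal = '"' + escaped + '"'
    if lang:
        literal += "@" + lang
    return literal


QUERY_TEMPLATE = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?lbl WHERE {
    ?movie rdfs:label %s .
    ?movie wdt:P57 ?director .
    ?director rdfs:label ?lbl .
}
"""

title = 'Good Will Hunting'
# The helper supplies the quotes and the language tag itself.
query = QUERY_TEMPLATE % sparql_string_literal(title, lang="en")
print(query)
```

With rdflib specifically, Graph.query also accepts an initBindings argument that binds variables without any string building, which may be the simpler route when the query runs against a local graph.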

Get entity name/label from wikidata in python

I have some SPARQL queries to run on wikidata in python and I need to get the name/label of the entity returned instead of URI. For example, given the python snippet below:
from qwikidata.sparql import return_sparql_query_results
query_string = """
select ?ent where { ?ent wdt:P31 wd:Q2637056 . ?ent wdt:P2244 ?obj } ORDER BY DESC(?obj)LIMIT 5
"""
res = return_sparql_query_results(query_string)
for row in res["results"]["bindings"]:
    print(row["ent"]["value"])
The queries in the original form return URIs, but I need to get the entity label/name. How can I do that in python?
The current output of the query:
http://www.wikidata.org/entity/Q841796
http://www.wikidata.org/entity/Q780047
NOTE: I don't have real access to the queries, therefore I can't rewrite the queries.
My comment was too long, so I am posting an answer.
You'll need to rewrite the queries. Please find below an example how to get labels without using the label service.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?country ?countryLabel
WHERE
{
# instance of country
?country wdt:P31 wd:Q3624078.
OPTIONAL {
?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
}
}
ORDER BY ?countryLabel
Adapted for your Soyuz-T example:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?ent ?entLabel
WHERE
{
# instance of Soyuz-T https://www.wikidata.org/wiki/Q2637056
?ent wdt:P31 wd:Q2637056 .
# https://www.wikidata.org/wiki/Property:P2244 periapsis
?ent wdt:P2244 ?obj
OPTIONAL {
?ent rdfs:label ?entLabel filter (lang(?entLabel) = "en").
}
} ORDER BY DESC(?obj)LIMIT 5
Result:
ent entLabel
wd:Q841796 Soyuz T-15
wd:Q780047 Soyuz T-8
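On the Python side, the label then arrives as one more binding per row. Here is a small sketch of reading it, assuming the standard SPARQL JSON results layout that the question's loop already relies on. The sample dictionary is hand-built for illustration (the third entity is a made-up placeholder with no label), and rows_with_labels is a hypothetical helper name.

```python
def rows_with_labels(result, uri_var="ent", label_var="entLabel"):
    """Collect (uri, label) pairs from a SPARQL JSON result dictionary.

    The label comes from an OPTIONAL pattern, so it can be missing from a
    row; in that case the label is returned as None.
    """
    rows = []
    for binding in result["results"]["bindings"]:
        uri = binding[uri_var]["value"]
        label = binding.get(label_var, {}).get("value")
        rows.append((uri, label))
    return rows


# Hand-built sample in the SPARQL JSON results format (not fetched live).
sample = {
    "head": {"vars": ["ent", "entLabel"]},
    "results": {
        "bindings": [
            {
                "ent": {"type": "uri", "value": "http://www.wikidata.org/entity/Q841796"},
                "entLabel": {"type": "literal", "xml:lang": "en", "value": "Soyuz T-15"},
            },
            {
                "ent": {"type": "uri", "value": "http://www.wikidata.org/entity/Q780047"},
                "entLabel": {"type": "literal", "xml:lang": "en", "value": "Soyuz T-8"},
            },
            {
                # Placeholder entity without an English label.
                "ent": {"type": "uri", "value": "http://example.org/entity/unlabelled"},
            },
        ]
    },
}

for uri, label in rows_with_labels(sample):
    print(uri, label if label is not None else "(no English label)")
```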

Find information for many people in wikidata

I have a list of names (hundreds of them) that are already transformed to Q-numbers in wikidata using python. For each Q-number (person) I want to get some basic information such as place_of_birth, nationality, etc.
SELECT DISTINCT ?name ?nameLabel ?genderLabel ?placeofbirth ?nationality (year(?birthdate) as ?birthyear) (year(?deathdate) as ?deathyear)
WHERE
{
?name wdt:P106/wdt:P279* wd:Q1028181 # painter
FILTER (?name IN (wd:Q2674488)) # James Seymour
OPTIONAL { ?name wdt:P569 ?birthdate. }
OPTIONAL { ?name wdt:P27 ?nationality. }
OPTIONAL { ?name wdt:P21 ?gender. }
OPTIONAL { ?name wdt:P19 ?placeofbirth. }
OPTIONAL { ?name wdt:P570 ?deathdate. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Using SPARQL, I can search two or three people at a time by adding Q-numbers into "FILTER", but how can I loop through all Q-numbers in a python list? Thanks a lot!
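One possible approach, sketched here under assumptions rather than taken from an answer: build a VALUES clause from the Python list and send the IDs in batches. The helper names (chunked, build_people_query), the batch size of 50, and every ID other than Q2674488 are illustrative choices. The wd:, wdt:, wikibase: and bd: prefixes are assumed to be predefined by the endpoint, as they are in the query above.

```python
import re


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_people_query(qids):
    """Return a query that looks up every Q-number in qids via VALUES."""
    for qid in qids:
        # Reject anything that is not a plain Q-number before it reaches the query.
        if not re.fullmatch(r"Q[1-9][0-9]*", qid):
            raise ValueError("not a Wikidata Q-number: %r" % (qid,))
    values = " ".join("wd:" + qid for qid in qids)
    return """
SELECT DISTINCT ?name ?nameLabel ?genderLabel ?placeofbirth ?nationality
       (year(?birthdate) AS ?birthyear) (year(?deathdate) AS ?deathyear)
WHERE {
  VALUES ?name { %s }
  OPTIONAL { ?name wdt:P569 ?birthdate. }
  OPTIONAL { ?name wdt:P27 ?nationality. }
  OPTIONAL { ?name wdt:P21 ?gender. }
  OPTIONAL { ?name wdt:P19 ?placeofbirth. }
  OPTIONAL { ?name wdt:P570 ?deathdate. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
""" % values


# Q2674488 is from the question; the other IDs are arbitrary placeholders.
all_qids = ["Q2674488", "Q5582", "Q5592"]
BATCH_SIZE = 50  # assumed value, small enough to keep each query short

queries = [build_people_query(batch) for batch in chunked(all_qids, BATCH_SIZE)]
print(queries[0])
```

Each string in queries can then be sent with whatever client is already in use, and the result rows concatenated.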

SPARQLWrapper QueryBadFormed error for long SELECT query

Is there a limit on the size of a SELECT query? I have a long SELECT query (posted below) that keeps throwing a QueryBadFormed error. I have validated the query on sparql.org and I have run it directly on the triple store (GraphDB 8.6 SE), where it runs fine.
Code:
from SPARQLWrapper import SPARQLWrapper, SPARQLWrapper2, JSON, CSV, TSV
# set endpoint and query
endpoint = r"http://localhost:7200/repositories/EDR"
query = get_dental_procedures_query() # return query below
# get results from endpoint
sparql = SPARQLWrapper(endpoint)
sparql.setReturnFormat(JSON) # I've also tried CSV and TSV
sparql.setQuery(query)
results = sparql.query().convert()
Error returned:
SPARQLWrapper.SPARQLExceptions.QueryBadFormed: QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed.
Here is the query:
BASE <http://purl.regenstrief.org/NDPBRN/dental-practice/>
PREFIX mesial_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Mesial_surface_enamel_of_tooth>
PREFIX exception: <http://purl.obolibrary.org/obo/OHD_0000404>
PREFIX ada_num: <http://purl.obolibrary.org/obo/OHD_0000065>
PREFIX occurrence_date: <http://purl.obolibrary.org/obo/OHD_0000015>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX resin_filling_proc: <http://purl.obolibrary.org/obo/OHD_0000042>
PREFIX birth_date: <http://purl.obolibrary.org/obo/OHD_0000050>
PREFIX restored_buccal: <http://purl.obolibrary.org/obo/OHD_0000222>
PREFIX caries_finding: <http://purl.obolibrary.org/obo/OHD_0000024>
PREFIX dental_finding: <http://purl.obolibrary.org/obo/OHD_0000010>
PREFIX molar: <http://purl.obolibrary.org/obo/FMA_55638>
PREFIX male_gender_role: <http://purl.obolibrary.org/obo/OMRSE_00000007>
PREFIX endodontically_restored_tooth: <http://purl.obolibrary.org/obo/0000236>
PREFIX root_canal_treatment: <http://purl.obolibrary.org/obo/OHD_0000230>
PREFIX has_part: <http://purl.obolibrary.org/obo/BFO_0000051>
PREFIX gender_role: <http://purl.obolibrary.org/obo/OMRSE_00000007>
PREFIX part_of: <http://purl.obolibrary.org/obo/BFO_0000050>
PREFIX inheres_in: <http://purl.obolibrary.org/obo/BFO_0000052>
PREFIX missing_tooth_finding: <http://purl.obolibrary.org/obo/OHD_0000026>
PREFIX pbrn_id: <http://purl.obolibrary.org/obo/OHD_0000273>
PREFIX distal_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Distal_surface_enamel_of_tooth>
PREFIX has_output: <http://purl.obolibrary.org/obo/OBI_0000299>
PREFIX occlusal_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Occlusal_surface_enamel_of_tooth>
PREFIX incisor: <http://purl.obolibrary.org/obo/FMA_12823>
PREFIX graph: <http://purl.regenstrief.org/NDPBRN/dental-practice#>
PREFIX patient_role: <http://purl.obolibrary.org/obo/OHD_0000190>
PREFIX anterior_tooth: <http://purl.obolibrary.org/obo/OHD_0000307>
PREFIX resin: <http://purl.obolibrary.org/obo/OHD_0000036>
PREFIX restored_lingual: <http://purl.obolibrary.org/obo/OHD_0000226>
PREFIX dental_proc: <http://purl.obolibrary.org/obo/OHD_0000002>
PREFIX restored_surface: <http://purl.obolibrary.org/obo/OHD_0000208>
PREFIX extracoronally_restored_tooth: <http://purl.obolibrary.org/obo/0000238>
PREFIX lingual_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Lingual_surface_enamel_of_tooth>
PREFIX dentition: <http://purl.obolibrary.org/obo/FMA_75152>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX lesion: <http://purl.obolibrary.org/obo/OHD_0000021>
PREFIX labial_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Labial_surface_enamel_of_tooth>
PREFIX has_input: <http://purl.obolibrary.org/obo/OBI_0000293>
PREFIX posterior_tooth: <http://purl.obolibrary.org/obo/OHD_0000308>
PREFIX extraction_proc: <http://purl.obolibrary.org/obo/OHD_0000057>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX restored_occlusal: <http://purl.obolibrary.org/obo/OHD_0000228>
PREFIX is_about: <http://purl.obolibrary.org/obo/IAO_0000136>
PREFIX restored_labial: <http://purl.obolibrary.org/obo/OHD_0000225>
PREFIX coronally_restored_tooth: <http://purl.obolibrary.org/obo/0000237>
PREFIX patient: <http://purl.obolibrary.org/obo/OHD_0000012>
PREFIX prop: <http://purl.regenstrief.org/NDPBRN/property/>
PREFIX restoration_proc: <http://purl.obolibrary.org/obo/OHD_0000004>
PREFIX last_visit_date: <http://purl.obolibrary.org/obo/OHD_0000219>
PREFIX tooth: <http://purl.obolibrary.org/obo/FMA_12516>
PREFIX intracoronally_restored_tooth: <http://purl.obolibrary.org/obo/0000239>
PREFIX bearer_of: <http://purl.obolibrary.org/obo/BFO_0000053>
PREFIX first_visit_date: <http://purl.obolibrary.org/obo/OHD_0000218>
PREFIX surgically_modified_tooth: <http://purl.obolibrary.org/obo/0000231>
PREFIX canine: <http://purl.obolibrary.org/obo/FMA_55636>
PREFIX facial_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Facial_surface_enamel_of_tooth>
PREFIX restored_distal: <http://purl.obolibrary.org/obo/OHD_0000223>
PREFIX premolar: <http://purl.obolibrary.org/obo/FMA_55637>
PREFIX restored_tooth: <http://purl.obolibrary.org/obo/OHD_0000189>
PREFIX restored_facial: <http://purl.obolibrary.org/obo/OHD_0000235>
PREFIX material: <http://purl.obolibrary.org/obo/OHD_0000000>
PREFIX missing_tooth_num: <http://purl.obolibrary.org/obo/OHD_0000234>
PREFIX buccal_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Buccal_surface_enamel_of_tooth>
PREFIX realizes: <http://purl.obolibrary.org/obo/BFO_0000055>
PREFIX female_gender_role: <http://purl.obolibrary.org/obo/OMRSE_00000008>
PREFIX restored_mesial: <http://purl.obolibrary.org/obo/OHD_0000227>
PREFIX restored_incisal: <http://purl.obolibrary.org/obo/OHD_0000224>
PREFIX visit: <http://purl.obolibrary.org/obo/OHD_0000009>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX incisal_surface: <http://purl.obolibrary.org/obo/FMA_no_fmaid_Incisal_surface_enamel_of_tooth>
SELECT DISTINCT ?practice ?patient_id ?gender ?dob ?first_visit ?last_visit ?tooth_id ?tooth_num ?first_PCR ?first_RCT ?event_name ?ada_code ?event_date ?extract_date ?missing_date (if(bound(?surface_m), 1, 0) AS ?m) (if(bound(?surface_o), 1, 0) AS ?o) (if(bound(?surface_d), 1, 0) AS ?d) (if(bound(?surface_b), 1, 0) AS ?b) (if(bound(?surface_l), 1, 0) AS ?l) (if(bound(?surface_f), 1, 0) AS ?f) (if(bound(?surface_incisal), 1, 0) AS ?i)
WHERE
{ ?patient_i a patient: ;
birth_date: ?dob ;
pbrn_id: ?pbrn_id
OPTIONAL
{ ?gender_t rdfs:subClassOf male_gender_role: ;
rdfs:label ?gender_name .
?gender_i sesame:directType ?gender_t ;
inheres_in: ?patient_i
}
?patient_i first_visit_date: ?first_visit ;
last_visit_date: ?last_visit .
?tooth_t rdfs:subClassOf tooth: ;
ada_num: ?ada_num .
?tooth_i sesame:directType ?tooth_t
OPTIONAL
{ ?tooth_i prop:first_PCR_date ?first_PCR }
OPTIONAL
{ ?tooth_i prop:first_RCT_date ?first_RCT }
OPTIONAL
{ ?tooth_i prop:extraction_date ?extract_date }
OPTIONAL
{ ?tooth_i prop:missing_tooth_finding_date ?missing_date }
?event_t rdfs:subClassOf dental_proc: ;
rdfs:label ?event_name .
?event_i sesame:directType ?event_t ;
has_input: ?patient_i ;
has_output: ?tooth_i ;
occurrence_date: ?event_date ;
prop:ada_code ?ada_code
OPTIONAL
{ ?event_i has_output: ?surface_m .
?surface_m sesame:directType restored_mesial: ;
part_of: ?tooth_i
}
OPTIONAL
{ ?event_i has_output: ?surface_o .
?surface_o sesame:directType restored_occlusal: ;
part_of: ?tooth_i
}
OPTIONAL
{ ?event_i has_output: ?surface_d .
?surface_d sesame:directType restored_distal: ;
part_of: ?tooth_i
}
OPTIONAL
{ ?event_i has_output: ?surface_b .
?surface_b sesame:directType restored_buccal: ;
part_of: ?tooth_i
}
OPTIONAL
{ ?event_i has_output: ?surface_l .
?surface_l sesame:directType restored_lingual: ;
part_of: ?tooth_i
}
OPTIONAL
{ ?event_i has_output: ?surface_f .
?surface_f sesame:directType restored_facial: ;
part_of: ?tooth_i
}
OPTIONAL
{ ?event_i has_output: ?surface_incisal .
?surface_incisal
sesame:directType restored_incisal: ;
part_of: ?tooth_i
}
BIND(strafter(str(?tooth_i), "tooth/") AS ?tooth_id)
BIND(strafter(str(?patient_i), "patient/") AS ?patient_id)
BIND(strbefore(str(?gender_name), " ") AS ?gender)
BIND(strafter(str(?ada_num), "Tooth ") AS ?tooth_num)
BIND(strafter(str(?pbrn_id), "NDPBRN practice ") AS ?practice)
}
limit 5
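By default SPARQLWrapper sends the query with HTTP GET, so the whole query text ends up in the URL, and a query this long can exceed the length that the client or server accepts. Use the POST HTTP method for long queries:
sparql.setMethod('POST')
More info: https://www.w3.org/TR/sparql11-protocol/#query-operation
In your case, the limitation appears to be urllib2-related. It also seems that the above approach shouldn't work with SPARQLWrapper2().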
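To make concrete what a POST query looks like on the wire, here is a standard-library sketch of the "query via POST with URL-encoded parameters" form from the SPARQL 1.1 Protocol page linked above. It is not what SPARQLWrapper does internally; build_post_request is a hypothetical helper, and the short query is a stand-in for the long one.

```python
import urllib.parse
import urllib.request


def build_post_request(endpoint, query):
    """Build (but do not send) a SPARQL query request that uses HTTP POST.

    The query travels in the request body, so its length does not affect
    the URL.
    """
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


endpoint = "http://localhost:7200/repositories/EDR"
query = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"  # stand-in for the long query
request = build_post_request(endpoint, query)

print(request.get_method(), request.full_url)
# Sending it requires a running endpoint, for example:
#   import json
#   with urllib.request.urlopen(request) as response:
#       results = json.load(response)
```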

Same sparql not returning same results

I'm running the same SPARQL statement in two different clients, but they do not return the same results. The OWL file is in RDF syntax and can be accessed here.
This is the sparql statement:
PREFIX wo:<http://purl.org/ontology/wo/> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> select ?individual where { ?individual rdf:type wo:Class }
I'm running it in TopBraid and in the following Python program:
>>> import rdflib
>>> import rdfextras
>>> rdfextras.registerplugins()
>>> g=rdflib.Graph()
>>> g.parse("index.owl")
<Graph identifier=N39ccd52985014f15b2fea90c3ffaedca (<class 'rdflib.graph.Graph'>)>
>>> PREFIX = "PREFIX wo:<http://purl.org/ontology/wo/> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> "
>>> query = "select ?individual where { ?individual rdf:type wo:Class }"
>>> query = PREFIX + query
>>> result_set = g.query(query)
>>> len(result_set)
0
The Python program returns 0 results.
This query constructs a graph containing all the triples in which wo:Class is used as a subject, predicate, or object:
PREFIX wo: <http://purl.org/ontology/wo/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
construct { ?s ?p ?o }
where {
{ ?s ?p wo:Class . bind( wo:Class as ?o ) } union
{ ?s wo:Class ?o . bind( wo:Class as ?p ) } union
{ wo:Class ?p ?o . bind( wo:Class as ?s ) }
}
I made a local copy of your data and the results I get are (in Turtle):
@prefix vs: <http://www.w3.org/2003/06/sw-vocab-status/ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wo: <http://purl.org/ontology/wo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
wo:Class a owl:Class ;
rdfs:comment "A class is a scientific way to group related organisms together, some examples of classes being jellyfish, reptiles and sea urchins. Classes are big groups and contain within them smaller groupings called orders, families, genera and species."@en ;
rdfs:label "Class"@en ;
rdfs:seeAlso <http://www.bbc.co.uk/nature/class> , <http://en.wikipedia.org/wiki/Class_%28biology%29> ;
rdfs:subClassOf wo:TaxonRank ;
vs:term_status "testing" .
wo:class rdfs:range wo:Class .
There are no individuals of type wo:Class in your data. The result set ought to be empty.
