How to get DBpedia resource details using Jena?

After I query DBpedia over its SPARQL endpoint, I get the results as Jena ResourceImpl objects. How can I then get the details of such a resource? For example, if the resource is a person, how can I get his or her birthDate?
I tried the following, but it always returns null:
QuerySolution querySolution = resultSet.next();
RDFNode x = querySolution.get("x");
ResourceImpl resource = (ResourceImpl) x;
Property property = new PropertyImpl("http://dbpedia.org/property/birthDate");
Resource propertyResourceValue = resource.getPropertyResourceValue(property); // NULL

Likely you will need to make a subsequent SPARQL query if you want to get further details about the resource. E.g.,
String nextQuery = "DESCRIBE " + FmtUtils.stringForNode(resource.asNode(), (SerializationContext)null);
Query describeQuery = QueryFactory.create(nextQuery);
QueryExecution exec = QueryExecutionFactory.sparqlService("http://endpoint", describeQuery);
Model m = exec.execDescribe();
You should then be able to use the resource API over the resulting model to get the information you want.

Assuming your SPARQL query is a SELECT ..., the ResultSet is a table and each QuerySolution is a row in that table. When you get the Resource from such a row, you only have a Resource; the properties are not automatically attached. Hence, getting a property value on the Resource returns null.
It looks like RDFOutput does what you expect: it transforms a SPARQL query ResultSet into an RDF Model.
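Both answers come down to asking the endpoint for the data explicitly. As an alternative to DESCRIBE, a follow-up SELECT for just the one property could be built along these lines. This is only a minimal sketch: the class and method names are hypothetical, the resource URI is an arbitrary example, and the property URI is the one from the question. The resulting string would be passed to QueryFactory.create in the same way as the DESCRIBE query above.

```java
final class FollowUpQuery {
    // Hypothetical helper: builds a SELECT that asks for one property of one resource.
    static String selectProperty(String resourceUri, String propertyUri) {
        return "SELECT ?value WHERE { " + iri(resourceUri) + " " + iri(propertyUri) + " ?value }";
    }

    // Wraps a URI in angle brackets and rejects characters that are not allowed
    // inside a SPARQL IRI reference (so a value cannot break out of the brackets).
    private static String iri(String uri) {
        if (uri.isEmpty()) {
            throw new IllegalArgumentException("Empty IRI");
        }
        for (char c : uri.toCharArray()) {
            if (c <= ' ' || "<>\"{}|^`\\".indexOf(c) >= 0) {
                throw new IllegalArgumentException("Not a safe IRI: " + uri);
            }
        }
        return "<" + uri + ">";
    }

    public static void main(String[] args) {
        System.out.println(selectProperty(
            "http://dbpedia.org/resource/Albert_Einstein",  // example resource (assumption)
            "http://dbpedia.org/property/birthDate"));       // property from the question
    }
}
```

Each row of the result would then carry the value in the ?value column, so nothing has to be looked up on the Resource object itself.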

BigQuery cacheHit property

I'm using the BigQuery API to run a query with the following code:
query = (
    'SELECT ...'
)
# API request - starts the query
query_job = client.query(
    query,
    location='US'
)
results = query_job.result()
The query works and returns the expected results.
However, I am not able to verify that the cache was used.
The docs say:
"If you are using the BigQuery API, the cacheHit property in the query result is set to true."
I am trying to access results.cacheHit, but it does not work:
AttributeError: 'RowIterator' object has no attribute 'cacheHit'
What am I doing wrong? How can I see whether my query used the cache?
The quote you are using from the docs refers to the REST API (cacheHit is in the response of the getQueryResults method).
With the Python client, what you need instead is query_job.cache_hit.

SPARUL query to drop most graphs, using Jena API

I am trying to clear most of the graphs in my local Virtuoso triple store using Apache Jena, as part of the cleanup before and after my unit tests. I think something like the following should work: first I retrieve the URIs of the graphs to delete, then I execute a SPARUL DROP operation for each of them.
String sparqlEndpointUsername = ...;
String sparqlEndpointPassword = ...;
String sparqlQueryString = ...; // Returns the URIs of the graphs to be deleted
HttpAuthenticator authenticator = new SimpleAuthenticator(sparqlEndpointUsername,
        sparqlEndpointPassword.toCharArray());
ResultSet resultSetToReturn = null;
try (QueryEngineHTTP queryEngine = new QueryEngineHTTP(sparqlEndpoint, sparqlQueryString, authenticator)) {
    resultSetToReturn = queryEngine.execSelect();
    resultSetToReturn = ResultSetFactory.copyResults(resultSetToReturn);
    while (resultSetToReturn.hasNext()) {
        String graphURI = resultSetToReturn.next().getResource("?g").getURI();
        UpdateRequest request = UpdateFactory.create();
        request.add("DROP GRAPH <" + graphURI + ">");
        Dataset dataset = ...; // how can I create a default dataset pointing to my local virtuoso installation?
        // And perform the operations.
        UpdateAction.execute(request, dataset);
    }
}
Questions:
As shown in this example, ARQ needs a dataset to operate on. How would I create this dataset pointing to my local Virtuoso installation for an update operation?
Is there perhaps an alternative to my approach? Would using another approach (apart from Jena) be a better idea?
Please note that I am not trying to delete all graphs. I am deleting only the graphs whose names are returned by the SPARQL query defined at the beginning (3rd line).
Your question appears to be specific to Virtuoso, and meant to remove all RDF data, so you could use Virtuoso's built-in RDF_GLOBAL_RESET() function.
This is not a SPARQL/SPARUL query; it is usually issued through an SQL connection -- which could be JDBC, ODBC, ADO.NET, OLE DB, iSQL, etc.
That said, as you are connecting through a SPARUL-privileged connection, you should be able to use Virtuoso's (limited) SQL-in-SPARQL support, a la --
SELECT
( bif:RDF_GLOBAL_RESET() AS reset )
WHERE
{ ?s ?p ?o }
LIMIT 1
(Executing this through an unprivileged connection like the default SPARQL endpoint will result in an error like Virtuoso 37000 Error SP031: SPARQL compiler: Function bif:RDF_GLOBAL_RESET() can not be used in text of SPARQL query due to security restrictions.)
(ObDisclaimer: OpenLink Software produces Virtuoso, and employs me.)
You can build a single SPARQL Update request:
DROP GRAPH <g1> ;
DROP GRAPH <g2> ;
DROP GRAPH <g3> ;
... ;
because in SPARQL Update a single HTTP request can contain several update operations, separated by ;.
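Building that single request from the graph URIs returned by the SELECT could look roughly like this. It is a sketch only: the class and method names are made up for illustration, the example URIs are placeholders, and the URI check is deliberately minimal.

```java
import java.util.List;
import java.util.stream.Collectors;

final class DropGraphRequest {
    // Hypothetical helper: one DROP GRAPH operation per URI, joined with ";"
    // so that the whole batch can be sent as a single SPARQL Update request.
    static String build(List<String> graphUris) {
        return graphUris.stream()
            .map(DropGraphRequest::dropOperation)
            .collect(Collectors.joining(" ;\n"));
    }

    private static String dropOperation(String graphUri) {
        // Minimal guard (assumption): refuse URIs that would break out of <...>.
        if (graphUri.isEmpty() || graphUri.chars().anyMatch(c -> c <= ' ' || c == '<' || c == '>')) {
            throw new IllegalArgumentException("Not a usable graph URI: " + graphUri);
        }
        return "DROP GRAPH <" + graphUri + ">";
    }

    public static void main(String[] args) {
        // Placeholder graph names; in the question these come from the SELECT results.
        System.out.println(build(List.of("http://example.org/g1", "http://example.org/g2")));
    }
}
```

The returned string can then be handed to UpdateFactory.create(...) and sent once, instead of issuing one request per graph.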

How to assign parameter values in a SparqlParameterizedString

I'm playing around with dotNetRDF's SPARQL engine and I'm trying to create parameterized queries, with no success so far.
Say I'm working on a graph g with a blank node identified as _:1690, using the code
Dim queryString As SparqlParameterizedString = New SparqlParameterizedString()
queryString.Namespaces.AddNamespace("rdfs", UriFactory.Create("http://www.w3.org/2000/01/rdf-schema#"))
queryString.CommandText = "SELECT ?label { @context rdfs:label ?label } "
queryString.SetParameter("context", g.GetBlankNode("1690"))
Dim result As VDS.RDF.Query.SparqlResultSet = g.ExecuteQuery(New SparqlQueryParser().ParseFromString(queryString))
Whenever I run this, I get all nodes that have an rdfs:label property, instead of a result filtered on my blank node only.
How do I set the parameter's value properly so that I get only one item in the result?
Thanks in advance,
Max.
Blank Nodes in a SPARQL query differ from Blank Nodes in an RDF Graph
In a SPARQL query, a blank node is treated as a temporary variable with limited scope. It does not match a specific blank node, so you cannot write a SPARQL query that selects by blank node identifier.
So the query your code creates gives the same result as if you had replaced @context with a variable, e.g. ?s.
If you need to find the value associated with a specific blank node, you need to formulate a query that uniquely selects that blank node based on the triples it participates in. If you can't do that, you should rethink your data modelling, since in that case you should likely be using URIs instead of blank nodes.
As a workaround, since you are using dotNetRDF and have the original graph you are querying, you can use the IGraph API instead, e.g.
INode label = g.GetTriplesWithSubjectPredicate(g.GetBlankNode("1690"), g.CreateUriNode("rdfs:label")).Select(t => t.Object).FirstOrDefault();
Just remember that label will be null if the triple you are looking for doesn't exist.

I am executing a query using Lucene's WildcardQuery, but it doesn't work

I am executing a query using Lucene's WildcardQuery, but I don't know why no results are found.
Below are the details.
Here is the code that creates the WildcardQuery. A record with field name 'Full Name' and value 'ABC123DD456CC' exists in the indexed document.
BooleanQuery booleanQuery = new BooleanQuery();
for (IndexQueryField field : quickSearchFields)
{
    Query query = new WildcardQuery(new Term(field.getFieldName(), "ABC*DD*CC"));
    booleanQuery.add(query, BooleanClause.Occur.SHOULD);
}
The code that executes the query:
Session hibernateSession = (Session) em.getDelegate();
FullTextSession session = SwitchSession.getFullTextSession(hibernateSession, specifyIndexName);
// Set Hibernate flushMode
session.setFlushMode(FlushMode.MANUAL);
// Ignore Hibernate Cache
session.setCacheMode(CacheMode.IGNORE);
FullTextQuery query = session.createFullTextQuery(booleanQuery,XXX.class);
List list = query.setFirstResult(1).setMaxResults(100).list();
The list is empty, although I am sure that 'ABC123DD456CC' exists in the Lucene document.
I just want to do this with WildcardQuery. Any help would be appreciated!
I believe that last line should be:
List list = query.setFirstResult(0).setMaxResults(100).list();
Results are numbered from 0. If there is only one document matching that search, which seems likely enough, that probably explains why you're getting nothing: you have skipped the first and only result, at index 0.
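To make the off-by-one concrete, here is a plain-Java stand-in for zero-based paging. It is not the Hibernate Search API; the class and method names are hypothetical, and it only assumes that the first-result offset means "skip this many hits".

```java
import java.util.List;

final class PagingSketch {
    // Mimics zero-based paging: skip `firstResult` hits, then return at most `maxResults`.
    // Both arguments are assumed to be non-negative.
    static <T> List<T> page(List<T> hits, int firstResult, int maxResults) {
        int from = Math.min(firstResult, hits.size());
        int to = from + Math.min(maxResults, hits.size() - from);
        return hits.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> hits = List.of("ABC123DD456CC");  // the single matching document
        System.out.println(page(hits, 1, 100));         // prints []
        System.out.println(page(hits, 0, 100));         // prints [ABC123DD456CC]
    }
}
```

With a single hit, an offset of 1 leaves nothing to return, which matches the empty list in the question.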

Advanced search with distances using NHibernate and SQL Server Geography

I've got an existing advanced search method in a repository that checks a FormCollection for the existence of search criteria, and if present, adds a criterion to the search e.g.
public IList<Residence> GetForAdvancedSearch(FormCollection collection)
{
    var criteria = Session.CreateCriteria(typeof(Residence))
        .SetResultTransformer(new DistinctRootEntityResultTransformer());
    if (collection["MinBedrooms"] != null)
    {
        criteria
            .Add(Restrictions.Ge("Bedrooms", int.Parse(collection["MinBedrooms"])));
    }
    // ... many criteria omitted for brevity
    return criteria.List<Residence>();
}
I've also got a basic distance search to find how far each residence is from the search location. The HBM for the query is
<sql-query name="Residence.Nearest">
  <return alias="residence" class="Residences.Domain.Residence, Residences"/>
  <return-scalar column="Distance" type="float"/>
  SELECT R.*, dbo.GetDistance(:point, R.Coordinate) AS Distance
  FROM Residence R
  WHERE Distance < 10
  ORDER BY Distance
</sql-query>
I had to define a function to calculate the distance, as there was no way to get NHibernate to escape the colons in the geography function:
CREATE FUNCTION dbo.GetDistance
(
    @firstPoint nvarchar(100),
    @secondPoint GEOMETRY
)
RETURNS float
AS
BEGIN
    RETURN GEOGRAPHY::STGeomFromText(
        @firstPoint, 4326).STDistance(@secondPoint.STAsText()) / 1609.344
END
And the repository calls the named query thus:
return Session
    .GetNamedQuery("Residence.Nearest")
    .SetString("point", String.Format("POINT({0} {1})", latitude, longitude))
    .List();
So my question is: how do I combine the two (or start from scratch) so that I can filter the advanced search results to include only residences within 10 miles of the search location?
UPDATE: I have tried using NHibernate.Spatial with the following code:
criteria.Add(SpatialExpression.IsWithinDistance(
"Coordinate", new Coordinate(latitude, longitude), 10));
but SpatialExpression.IsWithinDistance threw a System.NotImplementedException.
Have you seen the NHibernate.Spatial project? This may provide an easy solution to your problem.
The alternative is to create your own implementation of ICriterion - this is not too tricky if you derive from AbstractCriterion and you target your particular database platform. This would then allow you to combine your distance function with other criteria.
Create a projection that, in effect, adds a new distance column to the results, calculated by calling the UDF, and then add a restriction on it:
var query = String.Format(
    "dbo.GetDistance('POINT({0} {1})', Coordinate) AS Distance",
    latitude, longitude);
criteria
    .Add(Restrictions.Le(Projections.SqlProjection(
        query,
        new [] {"Distance"},
        new [] {NHibernateUtil.Double}), 10));
UPDATE
n.b. Although this must have worked when I posted it, it doesn't work any more. NHibernate doesn't like the '.' after dbo, and says
"could not resolve property: dbo of: Residences.Domain.Residence".
If I remove the 'dbo.' I get
"'GetDistance' is not a recognized built-in function name."