I'm trying to map tabular data into RDF using RML mapping.
I've figured out how to define and use prefixes for rr:constant definitions as indicated with the <<--- arrows in the code below. I've also figured out how to map column values using rr:template, as indicated with the <<<<<< arrow in the same code.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix ex: <http://example.org/>.
[...]
:map_001 rr:predicateObjectMap [
rr:predicate rdf:type; # <<---
rr:objectMap [
rr:constant ex:MyClass; # <<---
rr:termType rr:IRI
]
];
rr:predicateObjectMap [
rr:predicate ex:myPredicate; # <<---
rr:objectMap [
rr:template "http://example.org/{some_column}" # <<<<<<
]
].
My question is: is it possible to somehow use prefixes in rr:template definitions, in order to not have to explicitly write the complete base URI? For example, I'd like to do something like what is shown below, although this clearly doesn't work:
:map_001 rr:predicateObjectMap [
rr:predicate ex:myPredicate;
rr:objectMap [
rr:template ex:"{some_column}" # <<--- DOESN'T WORK!
]
].
Is there any syntax for this, or is it simply not possible?
I'm playing around with R2RML and I was wondering if I can create a property depending on the content of a RDB table cell.
The D2RQ mapping language has d2rq:condition that can handle that.
e.g.:
if value in column/table cell 'name' is 'abc' create property 'abc'
rr:predicateObjectMap [
rr:predicate ex:abc;
rr:objectMap [
rr:column "name";
rr:datatype xsd:string;
# equivalent for d2rq:condition "name='abc'"
];
]
if value in column/table cell 'name' is 'xyz' create property 'xyz'
rr:predicateObjectMap [
rr:predicate ex:xyz;
rr:objectMap [
rr:column "name";
rr:datatype xsd:decimal;
# equivalent for d2rq:condition "name='xyz'"
];
];
I couldn't find any suggestion in W3C's R2RML Recommendation.
Any ideas? :-)
Update:
I had the idea of using rr:sqlQuery
e.g.
rr:sqlQuery """
select (case TABLENAME.COLUMNNAME
when 'this' then 'propertyOne'
when 'that' then 'propertyTwo'
end) as VARIABLE_PREDICATE
from TABLENAME """;
and apply it to a rr:predicate or rr:predicateMap with
rr:predicateObjectMap [
rr:predicateMap [ rr:template "ex:{VARIABLE_PREDICATE}" ];
rr:objectMap [ rr:column "COLUMNNAME"; ];
];
But that didn't work. I guess predicateMaps can be rr:constants only and not rr:templates :( . At least the W3C Recommendation just shows constants within predicateMap.
Still searching for a solution... :/
P.S. I'm disappointed that a proprietary language like d2rq seems to be more powerful (at this point).
R2RML doesn't have conditional properties (like D2RQ does). That was a deliberate design choice, made to keep the language simple. Any kind of "complex" mapping has to be expressed in SQL.
A solution is the following:
@prefix rr: <http://www.w3.org/ns/r2rml#>.
<#Mapping> a rr:TriplesMap;
rr:logicalTable [ rr:sqlQuery """
select id, COLUMNNAME, (case TABLENAME.COLUMNNAME
when 'this' then 'http://ex.com/propertyOne'
when 'that' then 'http://ex.com/propertyTwo'
end) as VARIABLE_PREDICATE
from TABLENAME """; ];
rr:subjectMap [
rr:template "http://ex.com/foo/{id}";
];
rr:predicateObjectMap [
rr:predicateMap [ rr:column "VARIABLE_PREDICATE" ];
rr:objectMap [ rr:column "COLUMNNAME" ];
].
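To see what the CASE expression hands to the predicate map, here is a minimal sketch that runs the same query against an in-memory SQLite database and emulates the triples the mapping would produce. The table contents are invented for illustration, and the triple generation is a hand-written stand-in, not an R2RML processor. It assumes that a NULL in VARIABLE_PREDICATE yields no triple for that row, which is how R2RML treats NULL column values.

```python
import sqlite3

# Hypothetical stand-in for TABLENAME; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLENAME (id INTEGER PRIMARY KEY, COLUMNNAME TEXT)")
conn.executemany(
    "INSERT INTO TABLENAME (id, COLUMNNAME) VALUES (?, ?)",
    [(1, "this"), (2, "that"), (3, "other")],
)

# Same shape as the logical table query in the mapping above.
QUERY = """
SELECT id, COLUMNNAME,
       (CASE TABLENAME.COLUMNNAME
            WHEN 'this' THEN 'http://ex.com/propertyOne'
            WHEN 'that' THEN 'http://ex.com/propertyTwo'
        END) AS VARIABLE_PREDICATE
FROM TABLENAME
ORDER BY id
"""


def generate_triples(connection):
    """Emulate the TriplesMap: one triple per row whose predicate column is not NULL."""
    triples = []
    for row_id, value, predicate in connection.execute(QUERY):
        if predicate is None:
            # No CASE branch matched, so there is no predicate and no triple.
            continue
        subject = f"http://ex.com/foo/{row_id}"
        triples.append((subject, predicate, value))
    return triples


triples = generate_triples(conn)
for s, p, o in triples:
    print(f'<{s}> <{p}> "{o}" .')
```

The row with the value 'other' matches no CASE branch, so it contributes nothing. Adding an ELSE branch to the CASE expression would be one way to give such rows a fallback property.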
We routinely do this when mapping the Getty vocabularies, for properties that depend on key values (flags). E.g.:
<#ContribTermRelPreferred>
a rr:TriplesMap;
rr:logicalTable [ rr:sqlQuery """
SELECT ...
UDF_LOD_LOOKUP_PROPERTY('contrib_rels_term','preferred',CRT.PREFERRED) CONTRIBPREF
""" ];
rr:predicateObjectMap [
rr:predicateMap [ rr:column "CONTRIBPREF" ];
rr:objectMap [ rr:template "http://vocab.getty.edu/aat/contrib/{CONTRIB_ID}" ];
].
it sure seems that there's no easy way to do this ... how can i ensure that certain fields in my multi match query are actually going to be boosted correctly, so that exact matches show up at the top?
i honestly seem to have tried this a multitude of ways, but maybe someone knows the answer ...
in my movie and music database, i'm trying to search multiple fields at once, but ensure that exact matches make it to the top and that certain fields such as title and artist name have more boost.
here's the main portion of my query...
"query": {
"bool": {
"should": [
{
"multi_match": {
"type": "phrase_prefix",
"query": "brave",
"max_expansions": 10,
"fields": [
"title^3",
"artists.name^2",
"starring.name^2",
"credits.name",
"tracks^0.1"
]
}
}
],
"minimum_number_should_match": 1
}
}
as you see, the query is 'brave'. it just so happens there's a movie called brave. perfect, i want it at the top - since not only is it an exact match, but the match is in the title. however, there's a popular song called 'brave' from sara bareilles which ends up on top. why?
i've tried every analyzer known to man, custom and otherwise, and i've tried changing the 'type' parameter to every other permutation (phrase, best_fields, cross_fields, most_fields), and it just doesn't seem to honor the fact that i'm effectively trying to promote 'title', 'artists.name' and 'starring.name' and DEMOTE 'tracks'.
is there any way i can ensure all exact matches show up at the top (especially in title, etc) followed by expansions, etc?
any suggestions would be helpful.
EDIT
the analyzer i'm currently using which seems to work better than others is a custom one i call 'nameAnalyzer' which is made up of a 'lowercase' filter and 'keyword' tokenizer only.
here's some example documents in the order in which they're appearing in the results:
"fields": {
"title": [
"Brave"
],
"credits.name": [
"Kelly MacDonald",
"Emma Thompson",
"Billy Connolly",
"Julie Walters",
"Kevin McKidd",
"Craig Ferguson",
"Robbie Coltrane"
],
"starring.name": [
"Emma Thompson",
"Julie Walters",
"Billy Connolly",
"Kevin Mckidd",
"Kelly Macdonald"
]
},
"fields": {
"credits.name": [
"Hilary Weeks",
"Scott Wiley",
"Sarah Sample",
"Debra Fotheringham",
"Dustin Christensen",
"Russ Dixon"
],
"title": [
"Say Love"
],
"artists.name": [
"Hilary Weeks"
],
"tracks": [
"Say Love",
"Another Second Chance",
"It's A Good Day",
"Brave",
"I Found Me",
"Hero",
"Tell Me",
"Where I Am",
"Better Promises",
"Even When"
]
},
"fields": {
"title": [
"Brave Little Toaster"
],
"credits.name": [
"Randy Bennett",
"Jim Jackman",
"Randy Cook",
"Judy Toll",
"Jon Lovitz",
"Tim Stack",
"Timothy E. Day",
"Thurl Ravenscroft",
"Deanna Oliver",
"Phil Hartman",
"Jonathon Benair",
"Joe Ranft"
],
"starring.name": [
"Jon Lovitz",
"Thurl Ravenscroft",
"Tim Stack",
"Timothy E. Day",
"Deanna Oliver"
]
},
"fields": {
"title": [
"Braveheart"
],
"credits.name": [
"Bernard Horsfall",
"Martin Dempsey",
"James Robinson",
"Robert Paterson",
"Alan Tall",
"Rupert Vansittart",
"Donal Gibson",
"Malcolm Tierney",
"Sandy Nelson",
"Sean Lawlor"
],
"starring.name": [
"Brendan Gleeson",
"Sophie Marceau",
"Mel Gibson",
"Patrick Mcgoohan",
"Catherine Mccormack"
]
}
maybe someone knows why the second title ... (in this case not sara bareilles as i said before, but) Hilary Weeks - who has a track called 'brave' ... why is it before title 'braveheart' and 'brave little toaster'?
EDIT AGAIN
to further complicate the situation, what if i had a 'rank' field that was a part of my document? i'm finding it very difficult to add that to my _score field using a script score function...
"functions": [
{
"script_score": {
"script": "_score * 1/ doc['rank'].value"
}
}
]
Suppose I have an RDF graph that looks like the following:
entity1 [
title [ obect1.svg]
description [
"This is sentence 1. This is sentence 2." ]
] .
entity2 [
title [ obect2.svg]
description [
"This is sentence 3. This is sentence 4." ]
].
entity3 [
title [ obect3.svg]
description [
"This is sentence 1. This is sentence 4." ]
] .
How would I write a query to find the entities whose description contains "This is sentence 2"?
See http://www.w3.org/TR/sparql11-query/#func-strings
Use the string functions REGEX or CONTAINS inside a FILTER.
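As a rough sketch of how that could look: the query text below filters with CONTAINS, and the two helper functions mimic the CONTAINS and REGEX checks in plain Python over the question's three descriptions. The predicate name ex:description, the ex: namespace and the dictionary are assumptions made for illustration; the question's pseudo-graph does not give real IRIs. Python's re module is only an approximation of the regular expression flavor SPARQL uses, which is fine for simple patterns like these.

```python
import re

# Hypothetical data mirroring the question's graph: entity -> description literal.
descriptions = {
    "entity1": "This is sentence 1. This is sentence 2.",
    "entity2": "This is sentence 3. This is sentence 4.",
    "entity3": "This is sentence 1. This is sentence 4.",
}

# What the SPARQL query might look like; ex:description is an assumed predicate.
SPARQL_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?entity ?description
WHERE {
  ?entity ex:description ?description .
  FILTER(CONTAINS(STR(?description), "This is sentence 2"))
}
"""


def contains_matches(data, needle):
    """Plain substring test, the same kind of check CONTAINS performs."""
    return [entity for entity, text in data.items() if needle in text]


def regex_matches(data, pattern):
    """Pattern search anywhere in the string, like REGEX without flags."""
    return [entity for entity, text in data.items() if re.search(pattern, text)]


print(contains_matches(descriptions, "This is sentence 2"))
print(regex_matches(descriptions, r"sentence 4\."))
```

CONTAINS is usually the simpler choice for a fixed phrase. REGEX is only needed when the match has to be a pattern, and then characters such as the full stop must be escaped.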
I'm trying to document some SQL and want to get the terminology right. If you write SQL like so:
select child.ID, parent.ID
from hierarchy child
inner join hierarchy parent
on child.parentId = parent.ID
Then you have one actual table ('hierarchy') to which you are giving two names ('parent' and 'child'). My question is about how you refer to the logical entity of a table with a name.
What would you write in the blank here for the name?
"This query uses one table (hierarchy) but two _ (child and parent)"
[edit] left a previous draft in the question. now corrected.
I believe this is called a SELF JOIN. A and B (or "child" and "parent", I think you have a typo in your question) are called ALIASes or TABLE ALIASes.
The concept is a self join. However, the a is a syntax error. The table is hierarchy, the alias is child.
I would call each part of a self join an instance.
In the SQL Server docs, the term is table_source:
Specifies a table, view, or derived table source, with or without an alias, to use in the Transact-SQL statement
In the BNF grammar, it's:
<table_source> ::=
{
table_or_view_name [ [ AS ] table_alias ] [ <tablesample_clause> ]
[ WITH ( < table_hint > [ [ , ]...n ] ) ]
| rowset_function [ [ AS ] table_alias ]
[ ( bulk_column_alias [ ,...n ] ) ]
| user_defined_function [ [ AS ] table_alias ] [ (column_alias [ ,...n ] ) ]
| OPENXML <openxml_clause>
| derived_table [ AS ] table_alias [ ( column_alias [ ,...n ] ) ]
| <joined_table>
| <pivoted_table>
| <unpivoted_table>
| @variable [ [ AS ] table_alias ]
| @variable.function_call ( expression [ ,...n ] ) [ [ AS ] table_alias ] [ (column_alias [ ,...n ] ) ]
}
'child', 'parent'
The term used in the SQL-92 Standard spec is "correlation name", being a type of "identifier".
'hierarchy'
The term used in the SQL-92 Standard spec is "table".
Hence the answer to your (edited) question is:
"This query uses one table (hierarchy) but two correlation names (child and parent)."
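To make the distinction concrete, here is a small sketch that runs the question's self join against an in-memory SQLite database: one table (hierarchy), referenced twice under the correlation names child and parent. The column types and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One physical table; the rows are invented sample data.
conn.execute("CREATE TABLE hierarchy (ID INTEGER PRIMARY KEY, parentId INTEGER)")
conn.executemany(
    "INSERT INTO hierarchy (ID, parentId) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2)],
)

# The same table appears twice, each time under its own correlation name.
rows = conn.execute(
    """
    SELECT child.ID, parent.ID
    FROM hierarchy AS child
    INNER JOIN hierarchy AS parent
        ON child.parentId = parent.ID
    ORDER BY child.ID
    """
).fetchall()

print(rows)
```

The root row (ID 1) has no parent, so the inner join leaves it out. Without the two correlation names the query could not tell the two references to hierarchy apart.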