Is there a working example that maps lat/long properties from GraphDB to geo_point objects in Elasticsearch?
{
"fieldName": "location",
"propertyChain": [
"http://example.com/coordinates"
],
"objectFields": [
{
"fieldName": "lat",
"propertyChain": [
"http://www.w3.org/2003/01/geo/wgs84_pos#lat"
]
},
{
"fieldName": "lon",
"propertyChain": [
"http://www.w3.org/2003/01/geo/wgs84_pos#long"
]
}
]
}
thanks
The only way to index data as geo_point with the current version of GraphDB and the Elasticsearch connector is to have the latitude and the longitude in a single literal, e.g. with the property http://www.w3.org/2003/01/geo/wgs84_pos#lat_long. The connector would look like this:
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
INSERT DATA {
inst:geopoint :createConnector '''
{
"elasticsearchNode": "localhost:9300",
"types": ["http://geopoint.ontotext.com/Point"],
"fields": [
{
"fieldName": "location",
"propertyChain": [
"http://www.w3.org/2003/01/geo/wgs84_pos#lat_long"
],
"datatype": "native:geo_point"
}
]
}
''' .
}
Note that datatype: "native:geo_point" is important as it tells Elasticsearch what type of data this is.
We are currently looking into possible ways to introduce support for latitude and longitude coming from separate literals.
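Until separate literals are supported, one possible workaround is to materialize the combined literal yourself before indexing. The Python sketch below only builds a SPARQL 1.1 update string and is not taken from the GraphDB documentation. It assumes that each resource carries wgs84_pos#lat and wgs84_pos#long, and that a "lat,long" string is an acceptable value for wgs84_pos#lat_long (Elasticsearch accepts geo_point strings in "lat,lon" form). The function name build_lat_long_update is hypothetical.

```python
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def build_lat_long_update(point_type: str) -> str:
    """Return a SPARQL 1.1 update that writes a "lat,long" literal per resource.

    Assumption: every resource of `point_type` has both wgs:lat and wgs:long.
    """
    return f"""PREFIX wgs: <{WGS84}>
INSERT {{
  ?s wgs:lat_long ?latLong .
}}
WHERE {{
  ?s a <{point_type}> ;
     wgs:lat ?lat ;
     wgs:long ?long .
  BIND(CONCAT(STR(?lat), ",", STR(?long)) AS ?latLong)
}}"""


if __name__ == "__main__":
    # Type IRI taken from the connector definition above.
    print(build_lat_long_update("http://geopoint.ontotext.com/Point"))
```

The resulting update would be run against the repository once (and again whenever coordinates change), after which the connector definition above can index wgs84_pos#lat_long as before.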
I have a scenario where I need to call a secondary feature file that makes an API call whose response is a JSON object. I need to call this scenario multiple times, so I am using karate.repeat. However, the resulting response is malformed JSON that I cannot traverse.
This is what I am doing:
* def fun = function(i){ return karate.call('abc.feature#abc', value)}
* def loop = karate.repeat(2, fun)
* karate.log(loop)
The response I get is:
{
"Total_packages1": {
"package1": {
"tags": [
"kj21",
"j1",
"sj2",
"z1"
],
"expectedResponse": [
{
"firstName": "Name",
"lastName": "lastName",
"purchase": [
{
"title": "title",
"category": [
"a",
"b",
"c"
]
}
]
}
]
}
}
}
{
"Total_packages2": {
"package2": {
"tags": [
"kj212",
"j12",
"sj22",
"z12"
],
"expectedResponse": [
{
"firstName": "Name2",
"lastName": "lastName2",
"purchase": [
{
"title": "title2",
"category": [
"a2",
"b2",
"c2"
]
}
]
}
]
}
}
}
As you can see, the output becomes malformed where Total_packages2 starts. I need to grab the "tags" values from each package, but I cannot simply use Total_packages1.package1.tags as I could with a single JSON response.
If I cannot achieve what I need by karate.repeat, is there another method that is recommended for looping like this? I haven't found anything in the documentation for this particular scenario.
Don't use karate.repeat(); use call with a JSON array instead. Read this part of the docs: https://github.com/karatelabs/karate#data-driven-features
I have a data source A and I'd like to create a new data source B containing just the last element of A. What is the best way to do this in Vega?
This is relatively straightforward to do, although I am slightly confused by your use of "max" in the aggregation, since that isn't necessarily the last value.
Either way, here is my solution for obtaining the last value in a dataset using this series of transforms:
transform: [
{
type: window
ops: [
row_number
]
}
{
type: joinaggregate
fields: [
row_number
]
ops: [
max
]
as: [
max_row_number
]
}
{
type: filter
expr: datum.row_number==datum.max_row_number
}
]
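To make the logic of the three transforms easier to follow, here is a minimal plain-Python sketch of the same idea. It is not Vega code; the function name last_row is made up for illustration, and it assumes the data is a list of dicts in its existing order.

```python
def last_row(rows):
    """Mimic window(row_number) -> joinaggregate(max) -> filter on a list of dicts."""
    # window / row_number: number each row from 1, keeping the existing order
    numbered = [dict(row, row_number=i) for i, row in enumerate(rows, start=1)]

    # joinaggregate / max: compute the overall maximum and attach it to every row
    max_row_number = max((r["row_number"] for r in numbered), default=0)
    for r in numbered:
        r["max_row_number"] = max_row_number

    # filter: keep only the row whose number equals the maximum, i.e. the last one
    return [r for r in numbered if r["row_number"] == r["max_row_number"]]


if __name__ == "__main__":
    print(last_row([{"x": 1}, {"x": 5}, {"x": 3}]))
```

Note that the row with the largest value of x (5) is not the one returned. The row number, not the data value, decides which element counts as last.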
I was able to get this working in the Vega Editor using the following:
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"data": [
{
"name": "source",
"url": "https://raw.githubusercontent.com/vega/vega/master/docs/data/cars.json",
"transform": [
{
"type": "filter",
"expr": "datum['Horsepower'] != null && datum['Miles_per_Gallon'] != null && datum['Acceleration'] != null"
}
]
},
{
"name": "avg",
"source":"source",
"transform":[
{
"type":"aggregate",
"groupby":["Horsepower"],
"ops": ["average"],
"fields":["Miles_per_Gallon"],
"as":["Avg_Miles_per_Gallon"]
}
]
},
{
"name":"last",
"source": "avg",
"transform": [
{
"type": "aggregate",
"ops": ["max"],
"fields": ["Horsepower"],
"as": ["maxHorsepower"]
},
{
"type": "lookup",
"from": "avg",
"key": "Horsepower",
"fields": ["maxHorsepower"],
"values": ["Horsepower","Avg_Miles_per_Gallon"]
}
]
}
]
}
maxHorsepower   Horsepower   Avg_Miles_per_Gallon
230             230          16
I'd be interested to know if there are better ways, but this worked for me.
I have a question about indexes in the GraphDB Lucene connector.
In the context of a multilingual RDF resource, how can I index only the rdfs:label values of a single language (for example, English)?
I tried with this:
PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>
PREFIX : <http://www.ontotext.com/connectors/lucene#>
INSERT DATA {
inst:lexicalEntryIndex :createConnector '''
{
"types": [
"http://www.w3.org/ns/lemon/ontolex#LexicalEntry"
],
"fields": [
{
"fieldName": "type",
"propertyChain": [
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
"http://www.w3.org/2000/01/rdf-schema#label"
],
"languages": [
"en"
]
}
]
}
''' .
}
but all the languages are indexed.
Thanks in advance,
Andrea
The GraphDB Lucene Connector documentation clearly demonstrates how to index a single language.
Here is a sample snippet showing how to do it:
PREFIX luc: <http://www.ontotext.com/connectors/lucene#>
PREFIX luc-index: <http://www.ontotext.com/connectors/lucene/instance#>
INSERT DATA {
luc-index:my_index luc:createConnector '''
{
"types": ["http://www.ontotext.com/example#gadget"],
"fields": [
{
"fieldName": "name",
"propertyChain": [
"http://www.ontotext.com/example#name"
]
},
{
"fieldName": "nameLanguage",
"propertyChain": [
"http://www.ontotext.com/example#name",
"lang()"
]
}
], "entityFilter":"?nameLanguage in (\\"en\\")"
}
''' .
}
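The doubled backslashes in the entityFilter are easy to get wrong by hand, because the JSON sits inside a SPARQL '''...''' literal. As a rough sketch, the configuration could be generated with Python instead. The helper name build_language_filtered_config and the field names "label" and "labelLanguage" are made up for illustration. Only the keys already shown in the snippet above (types, fields, fieldName, propertyChain, "lang()", entityFilter) are used.

```python
import json


def build_language_filtered_config(entity_type, label_chain, language):
    """Build connector JSON that mirrors the snippet above for one language.

    `label_chain` is the property chain leading to the label (an assumption
    about your data); "lang()" is appended to it for the language field.
    """
    config = {
        "types": [entity_type],
        "fields": [
            {"fieldName": "label", "propertyChain": list(label_chain)},
            {"fieldName": "labelLanguage", "propertyChain": list(label_chain) + ["lang()"]},
        ],
        "entityFilter": f'?labelLanguage in ("{language}")',
    }
    text = json.dumps(config, indent=2)
    # Inside the SPARQL literal each backslash must itself be escaped,
    # so the JSON escape \" becomes \\" as in the snippet above.
    return text.replace("\\", "\\\\")


if __name__ == "__main__":
    print(build_language_filtered_config(
        "http://www.w3.org/ns/lemon/ontolex#LexicalEntry",
        ["http://www.w3.org/2000/01/rdf-schema#label"],
        "en",
    ))
```

The printed text would be pasted between the ''' markers of the :createConnector statement. Whether rdfs:label sits directly on the entry or further along a chain depends on your data, so adjust label_chain accordingly.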
Reading through:
https://www.microsoft.com/cognitive-services/en-us/Academic-Knowledge-API/documentation/GraphSearchMethod
The meaning of "path" is a bit obscure.
With "path": "/paper/AuthorIDs/author", I don't see an AuthorIDs object in the returned results.
# post data query
{
"path": "/paper/AuthorIDs/author",
"paper": {
"type": "Paper",
"NormalizedTitle": "graph engine",
"select": [
"OriginalTitle"
]
},
"author": {
"return": {
"type": "Author",
"Name": "bin shao"
}
}
}
#results
{
"Results": [
[
{
"CellID": 2160459668,
"OriginalTitle": "Trinity: a distributed graph engine on a memory cloud"
},
{
"CellID": 2093502026
}
],
[
{
"CellID": 2171539317,
"OriginalTitle": "A distributed graph engine for web scale RDF data"
},
{
"CellID": 2093502026
}
],
[
{
"CellID": 2411554868,
"OriginalTitle": "A distributed graph engine for web scale RDF data"
},
{
"CellID": 2093502026
}
],
[
{
"CellID": 73304046,
"OriginalTitle": "The Trinity graph engine"
},
{
"CellID": 2093502026
}
]
]
}
What is the correct path (or POST data) to query for the citations and co-citations of an article, and to paginate the results?
You will find AuthorIDs on the graph schema from Microsoft Academic Search:
Assuming you know the ID of the source paper (2118322263 in the following example), here is the POST part of the request:
{
"path": "/paper/CitationIDs/citation",
"paper": {
"type": "Paper",
"id": [ 2118322263 ],
"select": [
"OriginalTitle"
]
},
"citation": {
"return": {
"type": "Paper"
},
"select": [
"OriginalTitle"
]
}
}
This returns 634 results in one response, while a query to the paper itself shows a citation count of 732. I have no idea why there is a difference, nor how to do pagination.
I have the following documents in ElasticSearch 0.19.11, using:
{ "title": "dogs species",
"col_names": [ "name", "description", "country_of_origin" ],
"rows": [
{ "row": [ "Boxer", "good dog", "Germany" ] },
{ "row": [ "Irish Setter", "great dog", "Ireland" ] }
]
}
{ "title": "Misc stuff",
"col_names": [ "foo" ],
"rows": [
{ "row": [ "Setter is impotant" ] },
{ "row": [ "Ireland is green" ] }
]
}
The mapping is as follows:
{
"table" : {
"properties" : {
"title" : {"type" : "string"},
"col_names" : {"type" : "string"},
"rows" : {
"properties" : {
"row" : {"type" : "string"}
}
}
}
}
}
Question: I'm now searching for "Ireland Setter" and I need to have a higher score for documents that have search terms in the same row.
Currently the second document gets a score of 0.22, while the first one gets 0.14.
I want the first document to get a higher score in this case, since it has both "Ireland" and "Setter" in the same row. How can it be done?
With great cooperation from the ElasticSearch Google Group members, a solution was found.
Here is the link to the discussion: https://groups.google.com/forum/?fromgroups#!topic/elasticsearch/4O9dff2SNhg