Duplicate output results with SPARQL [duplicate] - sparql

This question already has an answer here:
Aggregating results from SPARQL query
(1 answer)
Closed 9 years ago.
I'm new to the semantic web concept, and I have a diploma work based on the semantic web. In my OWL ontology, I have defined the individuals in he following manner:
<Announce rdf:ID="Ann1" ...>
<...>
<...>
<requiredTechnologies>
<Technology rdfs:resource="tech1"/>
</requiredTechnologies>
<requiredTechnologies>
<Technology rdfs:resource="tech2"/>
</requiredTechnologies>
</Announce>
...
Basically, there are some properties that appear only once for a given individual, but the property "required technologies" can be used multiple times in one record (for one individual). So, when I run a SPARQL query, selecting all the data (I use Jena), I get this output:
=====================================
| "Ann1" | ... | ... | "tech1" |
| "Ann1" | ... | ... | "tech2" |
| "Ann2" | ... | ... | "tech3" |
| "Ann3" | ... | ... | "tech4" |
| "Ann3" | ... | ... | "tech5" |
| "Ann3" | ... | ... | "tech6" |
| ... | ... | ... | ... |
=====================================
The records where multiple "requiredTechnologies" attribute is present appear more then once. My question is how to get all the technologies that belong to a given individual in one line (something like this):
===================================================
| "Ann1" | ... | ... | "tech1, tech2" |
| "Ann2" | ... | ... | "tech2" |
| "Ann3" | ... | ... | "tech4, tech5, tech6" |
| ... | ... | ... | ... |
===================================================
Is there a way in SPARQL to get the multiple attribute in a list? Or any other workaround?

Yes, use the GROUP BY clause in conjunction with the GROUP_CONCAT aggregate.
Here's a generic example, you should easily be able to adapt to your data model:
SELECT ?s (GROUP_CONCAT(?value ; SEPARATOR = ",") AS ?values)
WHERE
{
?s <http://somePredicate> ?value .
}
GROUP BY ?s

Related

Get similar employees based on their attribute values

Consider the following sample table("Customer") with these records
=========
Customer
=========
-----------------------------------------------------------------------------------------------
| customer-id | att-a | att-b | att-c | att-d | att-e | att-f | att-g | att-h | att-i | att-j |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-1 | att-a-7 | att-b-3 | att-c-10 | att-d-10 | att-e-15 | att-f-11 | att-g-2 | att-h-7 | att-i-5 | att-j-14 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-2 | att-a-9 | att-b-7 | att-c-12 | att-d-4 | att-e-10 | att-f-4 | att-g-13 | att-h-4 | att-i-1 | att-j-13 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-3 | att-a-10 | att-b-6 | att-c-1 | att-d-1 | att-e-13 | att-f-12 | att-g-9 | att-h-6 | att-i-7 | tt-j-4 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-19 | att-a-7 | att-b-9 | att-c-13 | att-d-5 | att-e-8 | att-f-5 | att-g-12 | att-h-14 | att-i-13 | att-j-15 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
I have these records and many more records dumped into SQL database and wanted to find top 10 similar customer based on the attribute value. For example customer-1 and customer-19 have atleast one column value matching .i.e "att-a-7" so the output should give me 2 customer-id's or top similar customer that are customer-1 and customer-19.
P.S - there can be one or more columns similar across rows.
I'm using windowing technique to find top 10 similar customer and im not sure if I'm correct.
following is my approach I used in my query :
row_number() over (partition by att-a, att-b,..,att-j order by customer-id) as customers
is this correct. ?

Query from secondary index on aerospike

I'm considering aerospike for one of our projects. So I currently created a 3 node cluster and loaded some data on it.
Sample data
ns: imei
set: imei_data
+-------------------+-----------------------+-----------------------+----------------------------+--------------+--------------+
| imsi | fcheck | lcheck | msc | fcheck_epoch | lcheck_epoch |
+-------------------+-----------------------+-----------------------+----------------------------+--------------+--------------+
| "413010324064956" | "2017-03-01 14:30:26" | "2017-03-01 14:35:30" | "13d20b080011044917004100" | 1488358826 | 1488359130 |
| "413012628090023" | "2016-09-21 10:06:49" | "2017-09-16 13:54:40" | "13dc0b080011044917006100" | 1474432609 | 1505550280 |
| "413010130130320" | "2016-12-29 22:05:07" | "2017-10-09 16:17:10" | "13d20b080011044917003100" | 1483029307 | 1507546030 |
| "413011330114274" | "2016-09-06 01:48:06" | "2017-10-09 11:53:41" | "13d20b080011044917003100" | 1473106686 | 1507530221 |
| "413012629781993" | "2017-08-16 16:03:01" | "2017-09-13 18:10:48" | "13dc0b080011044917004100" | 1502879581 | 1505306448 |
Then I created a secondary index on lcheck_epoch using AQL since I want to query based on date.
create index idx_lcheck on imei.imei_data (lcheck_epoch) NUMERIC
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
| ns | bin | indextype | set | state | indexname | path | type |
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
| "imei" | "lcheck_epoch" | "NONE" | "imei_data" | "RW" | "idx_lcheck" | "lcheck_epoch" | "NUMERIC" |
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
When I execute
select imsi from imei.imei_data where idx_lcheck=1476165806
I'm getting
Error: (204) AEROSPIKE_ERR_INDEX
Please explain.
You're using the index name, not the bin name, in your query. Try this:
SELECT imsi FROM imei.imei_data WHERE lcheck_epoch=1476165806
Or
SELECT imsi FROM imei.imei_data WHERE lcheck_epoch BETWEEN 1490000000 AND 1510000000
Just a note, you can do much more complex queries using predicate filtering through several of the language clients (Java, C, C#, Go). For example the PredExp class of the Java client (see examples.)

Get all Countries Name from DBpedia using SPARQL

I want list of countries name from DBpedia.
I am using http://dbpedia.org/snorql/ to execute my query, but till now I have not found all countries name which are available in DBpedia.
For Example : dbr:United_Kingdom, dbr:India, dbr:United_States, etc.
Is the problem
You don't know how to write SPARQL in general? (That's OK, it's hard to get started.)
You don' know about the classes and predicates in DBpedia? (That's OK too. I have to check each time.)
Something else?
This gets all UN member nations. Getting "all countries" is probably just a matter of finding the right class in their ontology.
select distinct ?s
where { ?s a <http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations> }
I happen to know that New York City is the largest city in the United States, and that DBpedia has a largestCity predicate and a New_York_City instance. So I wrote a query that should only get the United States as the subject, and then asked for all connected predicates and objects. You should look in that for an object that meets your exceptions for defining "all countries." If you don't find one, you may have to union a few other triple patterns into one query.
I have also filtered out objects that contain either of two terms that be relevant for you: "country" or "nation"
select distinct *
where { ?s a <http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations> ;
dbo:largestCity dbr:New_York_City ;
?p ?o
filter(isURI(?o))
filter((regex(lcase(str(?o)), "country")) || (regex(lcase(str(?o)), "nation")))
}
Gives the foloowing, which should help you write a followup question that isn't specific to the United States.
+--------------------------------------------+---------------------------------------------------+------------------------------------------------------------------------------+
| s | p | o |
+--------------------------------------------+---------------------------------------------------+------------------------------------------------------------------------------+
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=US |
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://nationalatlas.gov/ |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://dbpedia.org/class/yago/Country108544813 |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:G7_nations |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://schema.org/Country |
| http://dbpedia.org/resource/United_States | http://www.w3.org/2000/01/rdf-schema#seeAlso | http://dbpedia.org/resource/Anti-miscegenation_laws |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:Member_states_of_the_United_Nations |
| http://dbpedia.org/resource/United_States | http://www.w3.org/2002/07/owl#sameAs | http://transparency.270a.info/classification/country/US |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://dbpedia.org/ontology/Country |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:G8_nations |
| http://dbpedia.org/resource/United_States | http://www.w3.org/2002/07/owl#sameAs | http://linked-web-apis.fit.cvut.cz/resource/united_states_of_america_country |
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://www.nationalcenter.org/HistoricalDocuments.html |
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://news.bbc.co.uk/2/hi/americas/country_profiles/1217752.stm |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://umbel.org/umbel/rc/Country |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:G20_nations |
+--------------------------------------------+---------------------------------------------------+------------------------------------------------------------------------------+

Join tables on a geo range

I have 2 independent tables (Links & Sections). The information inside link table is somewhat like this,
Links Table
----------------------------------------------------------------------------
| id | Description | startLat | startLng | endLat | endLng |
----------------------------------------------------------------------------
| 1000259 | Link Number 1 |52.891372 |-1.768254 |52.892545 |-1.775278 |
| 1000359 | Link Number 2 |52.894892 |-1.780513 |52.894306 |-1.774793 |
| 1000279 | Link Number 3 |52.894306 |-1.774793 |52.895000 |-1.765273 |
| 1000258 | Link Number 4 |52.895000 |-1.765273 |52.895500 |-1.755278 |
| 1000255 | Link Number 5 |52.895500 |-1.755278 |52.896500 |-1.745278 |
| 1000555 | Link Number 6 |52.896500 |-1.745278 |52.897250 |-1.735278 |
----------------------------------------------------------------------------
And the Sections table appears like this,
Sections Table
-----------------------------------------------------------------------------
| id | Description | fromLat | fromLng | toLat | toLng |
-----------------------------------------------------------------------------
| 625 | Section 1 | 52.893598 | -1.775120 | 52.885053|-1.756409 |
| 713 | Section 2 | 52.897273 | -1.788324 | 52.898285|-1.724721 |
| ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... |
-----------------------------------------------------------------------------
I want to run a query which gives me all the links which are covered by a section. So if I say how many links come under Section 2, I should receive all the links that are covered by the lat/lng of section information.
P.S. Please note, Sections are longer than Links ... Any Help!!!
You do this with a join, just the conditions are inequalities rather than equalities. It is unclear whether you want partial coverage or total coverage. Here is an example for partial coverage:
select s.*, l.*
from sections s join
links l
on (l.startlat < s.tolat and l.endlast > s.fromlat) and
(l.startlong < s.tolong and l.endlong > s.fromlong);

Grouped string aggregation / LISTAGG for SQL Server

I'm sure this has been asked but I can't quite find the right search terms.
Given a schema like this:
| CarMakeID | CarMake
------------------------
| 1 | SuperCars
| 2 | MehCars
| CarMakeID | CarModelID | CarModel
-----------------------------------------
| 1 | 1 | Zoom
| 2 | 1 | Wow
| 3 | 1 | Awesome
| 4 | 2 | Mediocrity
| 5 | 2 | YoureSettling
I want to produce a dataset like this:
| CarMakeID | CarMake | CarModels
---------------------------------------------
| 1 | SuperCars | Zoom, Wow, Awesome
| 2 | MehCars | Mediocrity, YoureSettling
What do I do in place of 'AGG' for strings in SQL Server in the following style query?
SELECT *,
(SELECT AGG(CarModel)
FROM CarModels model
WHERE model.CarMakeID = make.CarMakeID
GROUP BY make.CarMakeID) as CarMakes
FROM CarMakes make
http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/
It is an interesting problem in Transact SQL, for which there are a number of solutions and considerable debate. How do you go about producing a summary result in which a distinguishing column from each row in each particular category is listed in a 'aggregate' column? A simple, and intuitive way of displaying data is surprisingly difficult to achieve. Anith Sen gives a summary of different ways, and offers words of caution over the one you choose...
If it is SQL Server 2017 or SQL Server VNext, Azure SQL database you can use String_agg as below:
SELECT make.CarMakeId, make.CarMake,
CarModels = string_agg(model.CarModel, ', ')
FROM CarModels model
INNER JOIN CarMakes make
ON model.CarMakeId = make.CarMakeId
GROUP BY make.CarMakeId, make.CarMake
Output:
+-----------+-----------+---------------------------+
| CarMakeId | CarMake | CarModels |
+-----------+-----------+---------------------------+
| 1 | SuperCars | Zoom, Wow, Awesome |
| 2 | MehCars | Mediocrity, YoureSettling |
+-----------+-----------+---------------------------+