Aggregate SQLite query across multiple tables using JSON1 - sql

I can't get my head around the following problem. The other day I learned how to use the JSON1 family of functions, but this time it seems to be more of an SQL issue.
This is my database setup:
CREATE TABLE persons(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE)
CREATE TABLE interests(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE)
CREATE TABLE persons_interests(person INTEGER, interest INTEGER, FOREIGN KEY(person) REFERENCES persons(id), FOREIGN KEY(interest) REFERENCES interests(id))
INSERT INTO persons(name) VALUES('John')
INSERT INTO persons(name) VALUES('Jane')
INSERT INTO interests(name) VALUES('Cooking')
INSERT INTO interests(name) VALUES('Gardening')
INSERT INTO interests(name) VALUES('Relaxing')
INSERT INTO persons_interests VALUES(1, 1)
INSERT INTO persons_interests VALUES(1, 2)
INSERT INTO persons_interests VALUES(2, 3)
Based on this data I'd like to get the following output, which is all interests of all persons aggregated into a single JSON array:
[{name: John, interests:[{name: Cooking},{name: Gardening}]}, {name: Jane, interests:[{name: Relaxing}]}]
Now the following is what I tried to do. Needless to say, this doesn't give me what I want:
SELECT p.name, json_object('interests', json_group_array(json_object('name', i.name))) interests
FROM persons p, interests i
JOIN persons_interests pi ON pi.person = p.id AND pi.interest = i.id
The undesired output is:
John|{"interests":[{"name":"Cooking"},{"name":"Gardening"},{"name":"Relaxing"}]}
Any help is highly appreciated!

To use json_group_array you must group the rows, in your case by person, unless you want only one row containing all your results.
Example 1)
This first version gives you one JSON object per person, so the result will be N rows for N persons:
SELECT json_object( 'name',
p.name,
'interests',
json_group_array(json_object('name', i.name))) jsobjects
FROM persons p, interests i
JOIN persons_interests pi ON pi.person = p.id AND pi.interest = i.id
GROUP BY p.id ;
Example 2)
This second version returns one big JSON array that contains all persons, so you fetch only one row:
SELECT json_group_array(jsobjects)
FROM (
SELECT json_object( 'name',
p.name,
'interests',
json_group_array(json_object('name', i.name))) jsobjects
FROM persons p, interests i
JOIN persons_interests pi ON pi.person = p.id AND pi.interest = i.id
GROUP BY p.id
) jo ;
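The second version can be tried end to end with Python's built-in sqlite3 module. This is only a sketch, assuming the SQLite build behind Python includes the JSON1 functions. It uses explicit joins, and wrapping the inner value in json() is a defensive assumption of mine (not part of the answer above) so that the per-person objects are embedded as JSON rather than as quoted strings.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persons(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE);
CREATE TABLE interests(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE);
CREATE TABLE persons_interests(
    person INTEGER REFERENCES persons(id),
    interest INTEGER REFERENCES interests(id)
);
INSERT INTO persons(name) VALUES ('John'), ('Jane');
INSERT INTO interests(name) VALUES ('Cooking'), ('Gardening'), ('Relaxing');
INSERT INTO persons_interests VALUES (1, 1), (1, 2), (2, 3);
""")

# Inner query: one JSON object per person. Outer query: one array of them.
QUERY = """
SELECT json_group_array(json(jsobjects))
FROM (
    SELECT json_object(
               'name', p.name,
               'interests', json_group_array(json_object('name', i.name))
           ) AS jsobjects
    FROM persons p
    JOIN persons_interests pi ON pi.person = p.id
    JOIN interests i ON i.id = pi.interest
    GROUP BY p.id
) jo
"""

(raw,) = conn.execute(QUERY).fetchone()
people = json.loads(raw)

# Aggregation order is not guaranteed, so normalise before inspecting.
by_name = {
    person["name"]: sorted(entry["name"] for entry in person["interests"])
    for person in people
}
print(by_name)
```

Parsing the result with json.loads is a quick way to confirm that the nested arrays really are JSON and not escaped text.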

Related

How to join tables together via Ids using SQLite?

I am having trouble joining parts of tables. I want first and last names of the people and whatever their interest is to be joined together. I get this error message: "[1] [SQLITE_ERROR] SQL error or missing database (ambiguous column name: pi.PersonID)"
CREATE TABLE people (
PersonID INTEGER PRIMARY KEY AUTOINCREMENT,
FirstName VARCHAR(100),
LastName VARCHAR(100)
);
INSERT INTO people (FirstName, LastName)
VALUES ('Walter', 'White'),
('Jesse', 'Pinkman'),
('Saul', 'Goodman');
SELECT * FROM people;
CREATE TABLE interests (
InterestID INTEGER PRIMARY KEY AUTOINCREMENT,
Interest VARCHAR(100)
);
INSERT INTO interests (Interest)
values ('Swimming'),
('Basketball'),
('Running');
SELECT * FROM interests;
CREATE TABLE persons_interests (
PersonID INTEGER,
InterestID INTEGER,
PRIMARY KEY (PersonID, InterestID),
FOREIGN KEY (PersonID) REFERENCES people,
FOREIGN KEY (InterestID) REFERENCES interests
);
DROP TABLE persons_interests;
INSERT INTO persons_interests (PersonID, InterestID)
VALUES (1, 3),
(2, 2),
(3, 3);
SELECT * FROM persons_interests;
SELECT FirstName, LastName, Interest FROM people p, interests i
JOIN persons_interests pi on p.PersonID = pi.PersonID
JOIN persons_interests pi on i.Interest = pi.InterestID;
Don't mix implicit and explicit joins! You seem to want:
select p.firstname, p.lastname, i.interest
from people p
inner join persons_interests pi on pi.personid = p.personid
inner join interests i on i.interestid = pi.interestid;
Here, each table appears just once in the from clause, with the relevant join conditions.
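To check the corrected query against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The ORDER BY is my addition so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    PersonID INTEGER PRIMARY KEY AUTOINCREMENT,
    FirstName VARCHAR(100),
    LastName VARCHAR(100)
);
CREATE TABLE interests (
    InterestID INTEGER PRIMARY KEY AUTOINCREMENT,
    Interest VARCHAR(100)
);
CREATE TABLE persons_interests (
    PersonID INTEGER,
    InterestID INTEGER,
    PRIMARY KEY (PersonID, InterestID),
    FOREIGN KEY (PersonID) REFERENCES people,
    FOREIGN KEY (InterestID) REFERENCES interests
);
INSERT INTO people (FirstName, LastName)
VALUES ('Walter', 'White'), ('Jesse', 'Pinkman'), ('Saul', 'Goodman');
INSERT INTO interests (Interest)
VALUES ('Swimming'), ('Basketball'), ('Running');
INSERT INTO persons_interests (PersonID, InterestID)
VALUES (1, 3), (2, 2), (3, 3);
""")

# Each table appears once; the junction table links the other two.
rows = conn.execute("""
    SELECT p.FirstName, p.LastName, i.Interest
    FROM people p
    INNER JOIN persons_interests pi ON pi.PersonID = p.PersonID
    INNER JOIN interests i ON i.InterestID = pi.InterestID
    ORDER BY p.PersonID
""").fetchall()

for row in rows:
    print(row)
```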

Insert multiple references in a nested table

I have the table customer_table containing a list (nested table) of references toward rows of the account_table.
Here are my declarations :
Customer type:
CREATE TYPE customer as object(
custid integer,
infos ref type_person,
accounts accounts_list
);
accounts_list type:
CREATE TYPE accounts_list AS table of ref account;
Table:
CREATE TABLE customer_table OF customer(
custid primary key,
constraint c_inf check(infos is not null),
constraint c_acc check(accounts is not null)
)
NESTED TABLE accounts STORE AS accounts_refs_nt_table;
So I would like to insert multiple refs in my nested table when I create a customer, as an account can be shared.
I can't find out how to do that.
I tried:
INSERT INTO customer_table(
SELECT 0,
ref(p),
accounts_list(
SELECT ref(a) FROM account_table a WHERE a.accid = 0
UNION ALL
SELECT ref(a) FROM account_table a WHERE a.accid = 1
)
FROM DUAL
FROM person_table p
WHERE p.personid = 0
);
With no success.
Thank you
You can use the collect() function, e.g. in a subquery:
INSERT INTO customer_table(
SELECT 0,
ref(p),
(
SELECT CAST(COLLECT(ref(a)) AS accounts_list)
FROM account_table a
WHERE accid IN (0, 1)
)
FROM person_table p
WHERE p.personid = 0
);
As the documentation says, "To get accurate results from this function you must use it within a CAST function", so I've explicitly cast it to your accounts_list type.
If you don't want a subquery you could instead do:
INSERT INTO customer_table(
SELECT 0,
ref(p),
CAST(COLLECT(a.r) AS accounts_list)
FROM person_table p
CROSS JOIN (SELECT ref(a) AS r FROM account_table a WHERE accid IN (0, 1)) a
WHERE p.personid = 0
GROUP BY ref(p)
);
but I think that's a bit messier; check the performance of both though...

Select rows that have a specific set of items associated with them through a junction table

Suppose we have the following schema:
CREATE TABLE customers(
id INTEGER PRIMARY KEY,
name TEXT
);
CREATE TABLE items(
id INTEGER PRIMARY KEY,
name TEXT
);
CREATE TABLE customers_items(
customerid INTEGER,
itemid INTEGER,
FOREIGN KEY(customerid) REFERENCES customers(id),
FOREIGN KEY(itemid) REFERENCES items(id)
);
Now we insert some example data:
INSERT INTO customers(name) VALUES ('John');
INSERT INTO customers(name) VALUES ('Jane');
INSERT INTO items(name) VALUES ('duck');
INSERT INTO items(name) VALUES ('cake');
Let's assume that John and Jane have id's of 1 and 2 and duck and cake also have id's of 1 and 2.
Let's give a duck to John and both a duck and a cake to Jane.
INSERT INTO customers_items(customerid, itemid) VALUES (1, 1);
INSERT INTO customers_items(customerid, itemid) VALUES (2, 1);
INSERT INTO customers_items(customerid, itemid) VALUES (2, 2);
Now, what I want to do is to run two types of queries:
Select names of customers who have BOTH a duck and a cake (should return 'Jane' only).
Select names of customers that have a duck and DON'T have a cake (should return 'John' only).
For the two types of queries listed, you could use the EXISTS clause. Below is an example for the second query (has a duck and doesn't have a cake); for the first query, change NOT EXISTS to EXISTS:
SELECT cust.name
FROM customers AS cust
WHERE EXISTS (
SELECT 1
FROM items
INNER JOIN customers_items ON items.id = customers_items.itemid
WHERE customers_items.customerid = cust.id
AND items.name = 'duck')
AND NOT EXISTS (
SELECT 1
FROM items
INNER JOIN customers_items ON items.id = customers_items.itemid
WHERE customers_items.customerid = cust.id
AND items.name = 'cake')
Here is a working example: http://sqlfiddle.com/#!6/3d362/2
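Both query types can be built from the same correlated EXISTS test. The following is a sketch with Python's built-in sqlite3 module and the sample data from the question; the HAS_ITEM fragment and the names() helper are names I made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers(id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items(id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customers_items(
    customerid INTEGER REFERENCES customers(id),
    itemid INTEGER REFERENCES items(id)
);
INSERT INTO customers(name) VALUES ('John'), ('Jane');
INSERT INTO items(name) VALUES ('duck'), ('cake');
INSERT INTO customers_items(customerid, itemid) VALUES (1, 1), (2, 1), (2, 2);
""")

# Correlated subquery: true when the outer customer owns the named item.
HAS_ITEM = """
    EXISTS (
        SELECT 1
        FROM customers_items ci
        JOIN items it ON it.id = ci.itemid
        WHERE ci.customerid = cust.id AND it.name = ?
    )
"""

def names(sql, params):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql, params)]

# Customers who have BOTH a duck and a cake.
both = names(
    f"SELECT cust.name FROM customers cust "
    f"WHERE {HAS_ITEM} AND {HAS_ITEM} ORDER BY cust.name",
    ("duck", "cake"),
)

# Customers who have a duck and do NOT have a cake.
duck_no_cake = names(
    f"SELECT cust.name FROM customers cust "
    f"WHERE {HAS_ITEM} AND NOT {HAS_ITEM} ORDER BY cust.name",
    ("duck", "cake"),
)

print(both, duck_no_cake)
```

The item names are passed as bound parameters, so the same two statements work for any pair of items.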

SQL help on query

I want to create a query for the cheapest package for a holiday to Spain, given a package ID. I'm stuck on how to go about it when executing my query. I need help on what to include in the values for the 'package' table, and I also need help on how to write the query.
Here is the table:
USE [zachtravelagency]
CREATE TABLE package (
[packageID] INTEGER NOT NULL IDENTITY (1,1) PRIMARY KEY,
[hotelID] INTEGER FOREIGN KEY REFERENCES hotels NOT NULL,
[excursionID] INTEGER FOREIGN KEY REFERENCES excursions NOT NULL,
[transportID] INTEGER FOREIGN KEY REFERENCES transport NOT NULL,
[flightID] INTEGER FOREIGN KEY REFERENCES flight NOT NULL,
);
Here are the columns, followed by some NULL values as I'm not sure what to put in.
Insert Into package (packageID, hotelID, excursionID, transportID, flightID)
Values (1, '', '', '', '')
Here is an example of entering data into my 'hotel' table (this is an example of one row)
Insert Into hotels (hotelID, hotelName, numRooms, location, totalCost, rating)
Values (1, 'Supreme Oyster Resort & Spa', '255', 'Spain', '250', '4')
I'm new to SQL so thank you for your patience.
First, for your insert statement for 'package', you don't specify packageId since it's an identity column. Instead it should look something like this
Insert Into package (hotelID, excursionID, transportID, flightID)
Values (1, 54, 43, 23)
Then to run a SELECT Query to find the cheapest package to Spain you will have to join your hotel, excursion, transport, and flight table on package, and sum the totalCost from each of the tables.
Example:
SELECT p.*, (h.totalCost + e.totalCost + t.totalCost + f.totalCost) as 'Total Package Cost' FROM package p
INNER JOIN hotels h ON h.hotelId = p.hotelId
INNER JOIN excursions e ON e.excursionId = p.excursionId
INNER JOIN transport t ON t.transportId = p.transportId
INNER JOIN flight f ON f.flightId = p.flightId
WHERE h.location = 'Spain'
ORDER BY (h.totalCost + e.totalCost + t.totalCost + f.totalCost) ASC
Your cheapest packages will be listed first. If you only want the cheapest, you can use SELECT TOP 1.
This query also assumes that each of the tables has a totalCost column.
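The join-and-sum idea can be tried out with Python's built-in sqlite3 module. This is only a sketch under several assumptions: the tables are cut down to the columns the query needs, every table is assumed to have a totalCost column, all rows and costs except the hotel from the question are made up, and SQLite's LIMIT 1 stands in for SQL Server's TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hotels(hotelID INTEGER PRIMARY KEY, hotelName TEXT,
                    location TEXT, totalCost INTEGER);
CREATE TABLE excursions(excursionID INTEGER PRIMARY KEY, totalCost INTEGER);
CREATE TABLE transport(transportID INTEGER PRIMARY KEY, totalCost INTEGER);
CREATE TABLE flight(flightID INTEGER PRIMARY KEY, totalCost INTEGER);
CREATE TABLE package(
    packageID INTEGER PRIMARY KEY AUTOINCREMENT,
    hotelID INTEGER NOT NULL REFERENCES hotels,
    excursionID INTEGER NOT NULL REFERENCES excursions,
    transportID INTEGER NOT NULL REFERENCES transport,
    flightID INTEGER NOT NULL REFERENCES flight
);
-- Hypothetical sample rows; only the first hotel comes from the question.
INSERT INTO hotels VALUES
    (1, 'Supreme Oyster Resort & Spa', 'Spain', 250),
    (2, 'Hypothetical Hotel B', 'Spain', 180),
    (3, 'Hypothetical Hotel C', 'Italy', 90);
INSERT INTO excursions VALUES (1, 40), (2, 60);
INSERT INTO transport VALUES (1, 20), (2, 35);
INSERT INTO flight VALUES (1, 120), (2, 150);
-- packageID is generated automatically, so it is not listed here.
INSERT INTO package (hotelID, excursionID, transportID, flightID)
VALUES (1, 1, 1, 1), (2, 2, 2, 2), (3, 1, 1, 1);
""")

# Sum the four component costs and keep the cheapest package in Spain.
cheapest = conn.execute("""
    SELECT p.packageID,
           h.totalCost + e.totalCost + t.totalCost + f.totalCost
               AS total_package_cost
    FROM package p
    INNER JOIN hotels h ON h.hotelID = p.hotelID
    INNER JOIN excursions e ON e.excursionID = p.excursionID
    INNER JOIN transport t ON t.transportID = p.transportID
    INNER JOIN flight f ON f.flightID = p.flightID
    WHERE h.location = 'Spain'
    ORDER BY total_package_cost ASC
    LIMIT 1
""").fetchone()

print(cheapest)
```

With this made-up data, package 1 costs 250 + 40 + 20 + 120 = 430 and package 2 costs 180 + 60 + 35 + 150 = 425, so package 2 is returned; package 3 is cheaper but is not in Spain.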
Apparently you need to create a total of five tables. Because of the foreign keys you'll have to insert data into the package table last. Let's assume all that is completed and you now want to query.
If you're given the packageID then you already have the answer. I'm not sure what you mean by that. If you want the minimum cost of a package that has a hotel in Spain then do this:
select min(h.totalCost)
from package as p inner join hotels as h on h.hotelID = p.hotelID
where h.location = 'Spain'
If you want packages that include a hotel in Spain of the lowest cost, try this. It could match more than one:
select * from package where hotelID in (
select hotelID from hotels where totalCost = (
select min(h.totalCost)
from package as p inner join hotels as h on h.hotelID = p.hotelID
where p.packageID = ? and h.location = 'Spain'
)
)
It's hard to say what data you should enter in the package table. It can be anything, as long as each value has the type you declared for its column. Since all the columns in the package table are integers, you can use any numbers. Don't wrap them in '' though, as that makes them strings. Also leave packageID out of the column list, since it is an identity column. E.g. I'd write the following to insert data into the package table:
Insert Into package (hotelID, excursionID, transportID, flightID)
Values (1, 777, 7777, 4444) -- The values don't matter unless you have Hotel, Excursion, Transport and Flight tables with these Ids as primary keys; in that case you need to use Ids that exist there.
Similarly, you can insert more records into the tables. After that, use the query provided by user shawnt00 and it should return some results.

SQL joins with multiple records into one with a default

My 'people' table has one row per person, and that person has a division (not unique) and a company (not unique).
I need to join people to p_features, c_features, d_features on:
people.person=p_features.num_value
people.division=d_features.num_value
people.company=c_features.num_value
... in a way that if there is a record match in p_features/d_features/c_features only, it would be returned, but if it was in 2 or 3 of the tables, the most specific record would be returned.
From my test data below, for example, query for person=1 would return
'FALSE'
person 3 returns maybe, person 4 returns true, and person 9 returns default
The biggest issue is that there are 100 features and I have queries that need to return all of them in one row. My previous attempt was a function which queried on feature,num_value in each table and did a foreach, but 100 features * 4 tables meant 400 reads and it brought the database to a halt it was so slow when I loaded up a few million rows of data.
create table p_features (
num_value int8,
feature varchar(20),
feature_value varchar(128)
);
create table c_features (
num_value int8,
feature varchar(20),
feature_value varchar(128)
);
create table d_features (
num_value int8,
feature varchar(20),
feature_value varchar(128)
);
create table default_features (
feature varchar(20),
feature_value varchar(128)
);
create table people (
person int8 not null,
division int8 not null,
company int8 not null
);
insert into people values (4,5,6);
insert into people values (3,5,6);
insert into people values (1,2,6);
insert into p_features values (4,'WEARING PANTS','TRUE');
insert into c_features values (6,'WEARING PANTS','FALSE');
insert into d_features values (5,'WEARING PANTS','MAYBE');
insert into default_features values('WEARING PANTS','DEFAULT');
You need to transpose the features into rows with a ranking. Here I used a common-table expression. If your database product does not support them, you can use temporary tables to achieve the same effect.
;With RankedFeatures As
(
Select 1 As FeatureRank, P.person, PF.feature, PF.feature_value
From people As P
Join p_features As PF
On PF.num_value = P.person
Union All
Select 2, P.person, PF.feature, PF.feature_value
From people As P
Join d_features As PF
On PF.num_value = P.division
Union All
Select 3, P.person, PF.feature, PF.feature_value
From people As P
Join c_features As PF
On PF.num_value = P.company
Union All
Select 4, P.person, DF.feature, DF.feature_value
From people As P
Cross Join default_features As DF
)
, HighestRankedFeature As
(
Select Min(FeatureRank) As FeatureRank, person, feature
From RankedFeatures
Group By person, feature
)
Select RF.person, RF.FeatureRank, RF.feature, RF.feature_value
From people As P
Join HighestRankedFeature As HRF
On HRF.person = P.person
Join RankedFeatures As RF
On RF.FeatureRank = HRF.FeatureRank
And RF.person = P.person
And RF.feature = HRF.feature
Order By P.person
I'm not sure I understood your question correctly, but to use JOIN, your tables need to be loaded already; then you use a SELECT statement with INNER JOIN, LEFT JOIN or whatever you need to show.
If you post some more information, it may become easier to understand.
There are some aspects of your schema I'm not understanding, like how to relate to the default_features table if there's no match in any of the specific tables. The only possible join condition is on feature, but if there's no match in the other 3 tables, there's no value to join on. So, in my example, I've hard-coded the DEFAULT since I can't think of how else to get it.
Hopefully this can get you started and if you can clarify the model a bit more, the solution can be refined.
select p.person, coalesce(pf.feature_value, df.feature_value, cf.feature_value, 'DEFAULT')
from people p
left join p_features pf
on p.person = pf.num_value
left join d_features df
on p.division = df.num_value
left join c_features cf
on p.company = cf.num_value