Join only when relation data exists - sql

These are my tables.
CREATE TABLE IF NOT EXISTS categories (
category_id serial primary key,
category text not null,
user_id int not null
);
CREATE TABLE IF NOT EXISTS activities (
activity_id serial primary key,
activity text not null,
user_id int not null
);
CREATE TABLE IF NOT EXISTS categories_to_activities (
category_id int not null REFERENCES categories (category_id) ON UPDATE CASCADE ON DELETE CASCADE,
activity_id int not null REFERENCES activities (activity_id) ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT category_activity_pkey PRIMARY KEY (category_id, activity_id)
);
This is the query I'm using to get all activities with their categories.
SELECT a.*, ARRAY_AGG ( ROW_TO_JSON (c) ) categories
FROM activities a
JOIN categories_to_activities ca ON a.activity_id = ca.activity_id
JOIN categories c ON ca.category_id = c.category_id
WHERE a.user_id = ${userId}
GROUP BY a.activity_id
The issue is that if an activity does not have a category assigned, it won't get returned.
I'm having trouble combining JOIN with CASE, which I suppose is what I need. Essentially I want to JOIN only when there is some record in categories_to_activities?
How do I do that?

You could use a left join and return an empty array for activities without categories:
SELECT a.*,
CASE WHEN ca.activity_id IS NOT NULL THEN ARRAY_AGG(ROW_TO_JSON(c))
ELSE ARRAY[]::JSON[]
END as categories
FROM activities a
LEFT JOIN categories_to_activities ca ON a.activity_id = ca.activity_id
LEFT JOIN categories c ON ca.category_id = c.category_id
WHERE a.user_id = ${userId}
GROUP BY a.activity_id, ca.activity_id

What I ended up with, thanks to @Murenik:
SELECT a.*,
CASE WHEN ca.activity_id IS NOT NULL
THEN ARRAY_AGG ( ROW_TO_JSON (c) )
ELSE ARRAY[]::JSON[]
END as categories
FROM activities a
LEFT JOIN categories_to_activities ca ON a.activity_id = ca.activity_id
LEFT JOIN categories c ON ca.category_id = c.category_id
WHERE a.user_id = ${userId}
GROUP BY a.activity_id, ca.activity_id
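To see the difference between the two join types on a small data set, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, uses made-up sample rows, and counts categories instead of building a JSON array, so it only illustrates which activities survive the join:

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL schema above; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category TEXT NOT NULL, user_id INT NOT NULL);
CREATE TABLE activities (activity_id INTEGER PRIMARY KEY, activity TEXT NOT NULL, user_id INT NOT NULL);
CREATE TABLE categories_to_activities (
    category_id INT NOT NULL REFERENCES categories (category_id),
    activity_id INT NOT NULL REFERENCES activities (activity_id),
    PRIMARY KEY (category_id, activity_id)
);
INSERT INTO categories VALUES (1, 'sport', 7), (2, 'outdoor', 7);
INSERT INTO activities VALUES (1, 'running', 7), (2, 'reading', 7);
-- 'running' has two categories, 'reading' has none
INSERT INTO categories_to_activities VALUES (1, 1), (2, 1);
""")

def activities_with_categories(join_kind, user_id):
    # join_kind is either "JOIN" or "LEFT JOIN"; the user id is bound as a parameter.
    sql = f"""
        SELECT a.activity, COUNT(c.category_id) AS n_categories
        FROM activities a
        {join_kind} categories_to_activities ca ON a.activity_id = ca.activity_id
        {join_kind} categories c ON ca.category_id = c.category_id
        WHERE a.user_id = ?
        GROUP BY a.activity_id
        ORDER BY a.activity_id
    """
    return conn.execute(sql, (user_id,)).fetchall()

inner_rows = activities_with_categories("JOIN", 7)       # drops 'reading'
left_rows = activities_with_categories("LEFT JOIN", 7)   # keeps 'reading' with a count of 0
print(inner_rows)
print(left_rows)
```

Binding the user id with a ? placeholder, rather than interpolating ${userId} into the string, is a choice made for this sketch only.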

Related

combine two sqlite queries that work separately

I am working with the basic Chinook database and am trying to write a SQLite query that creates a view called v10BestSellingArtists for the 10 best-selling artists, based on the total quantity of tracks sold (named TotalTrackSales) and ordered by TotalTrackSales in descending order. TotalAlbum is the number of albums with tracks sold for each artist.
I can write queries for both of them separately, but I can't figure out how to merge these two queries:
query for finding Totaltracksales:
Select
r.name as artist,
count (i.quantity) as TotalTrackSales
from albums a
left join tracks t on t.albumid == a.albumid
left join invoice_items i on i.trackid == t.trackid
left join artists r on a.artistid == r.artistid
group by r.artistid
order by 2 desc
limit 10;
and the second query for totalAlbum :
Select
r.name as artist,
count(a.artistId) from albums a
left join artists r where a.artistid == r.artistid
group by a.artistid
order by 2 desc
limit 10;
But I want a single query with the columns Artist, TotalAlbum, and TotalTrackSales.
Any help is appreciated.
The schema for the album table:
[Title] NVARCHAR(160) NOT NULL,
[ArtistId] INTEGER NOT NULL,
FOREIGN KEY ([ArtistId]) REFERENCES "artists" ([ArtistId])
artists table :
[ArtistId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
[Name] NVARCHAR(120)
tracks table schema:
[TrackId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
[Name] NVARCHAR(200) NOT NULL,
[AlbumId] INTEGER,
[MediaTypeId] INTEGER NOT NULL,
[GenreId] INTEGER,
[Composer] NVARCHAR(220),
[Milliseconds] INTEGER NOT NULL,
[Bytes] INTEGER,
[UnitPrice] NUMERIC(10,2) NOT NULL,
FOREIGN KEY ([AlbumId]) REFERENCES "albums" ([AlbumId])
ON DELETE NO ACTION ON UPDATE NO ACTION,
FOREIGN KEY ([GenreId]) REFERENCES "genres" ([GenreId])
ON DELETE NO ACTION ON UPDATE NO ACTION,
FOREIGN KEY ([MediaTypeId]) REFERENCES "media_types" ([MediaTypeId])
ON DELETE NO ACTION ON UPDATE NO ACTION
table invoice_items:
[InvoiceLineId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
[InvoiceId] INTEGER NOT NULL,
[TrackId] INTEGER NOT NULL,
[UnitPrice] NUMERIC(10,2) NOT NULL,
[Quantity] INTEGER NOT NULL,
FOREIGN KEY ([InvoiceId]) REFERENCES "invoices" ([InvoiceId])
ON DELETE NO ACTION ON UPDATE NO ACTION,
FOREIGN KEY ([TrackId]) REFERENCES "tracks" ([TrackId])
ON DELETE NO ACTION ON UPDATE NO ACTION
Just to merge your 2 queries, you can do the following using CTE:
with total_track_sales as (
Select
r.name as artist,
count (i.quantity) as TotalTrackSales
from albums a
left join tracks t on t.albumid == a.albumid
left join invoice_items i on i.trackid == t.trackid
left join artists r on a.artistid == r.artistid
group by r.artistid
order by 2 desc
limit 10 ),
total_album as (
Select
r.name as artist,
count(a.artistId) as TotalAlbums from albums a
left join artists r on a.artistid == r.artistid
group by a.artistid
order by 2 desc
limit 10 )
select ts.artist, ts.TotalTrackSales, ta.TotalAlbums
from total_track_sales ts inner join total_album ta
on ts.artist = ta.artist
You can try a single query using DISTINCT and combining aggregates and window functions.
select *
from (
select
r.name as artist,
count (i.quantity) as TotalTrackSales,
row_number() over (order by count (i.quantity) desc) rnT,
count (distinct a.albumid) as totalAlbums,
row_number() over (order by count (distinct a.albumid) desc) rnA
from albums a
left join tracks t on t.albumid == a.albumid
left join invoice_items i on i.trackid == t.trackid
left join artists r on a.artistid == r.artistid
group by r.artistid
)
where rnT <= 10 or rnA <= 10
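If the goal is just the three columns Artist, TotalAlbum and TotalTrackSales for the top 10 sellers, a single grouped query may be enough. The sketch below is one possible reading of the question, not a definitive answer: it uses Python's sqlite3 module, trimmed-down versions of the Chinook tables, hypothetical sample rows, and inner joins so that only albums with sold tracks are counted.

```python
import sqlite3

# Trimmed-down Chinook-like tables; only the columns the query needs. Sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT NOT NULL, ArtistId INTEGER NOT NULL);
CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT NOT NULL, AlbumId INTEGER);
CREATE TABLE invoice_items (InvoiceLineId INTEGER PRIMARY KEY, TrackId INTEGER NOT NULL, Quantity INTEGER NOT NULL);
INSERT INTO artists VALUES (1, 'A'), (2, 'B');
INSERT INTO albums VALUES (10, 'A1', 1), (11, 'A2', 1), (20, 'B1', 2);
INSERT INTO tracks VALUES (100, 't1', 10), (101, 't2', 11), (200, 't3', 20), (201, 't4', 20);
INSERT INTO invoice_items VALUES (1, 100, 1), (2, 100, 2), (3, 101, 1), (4, 200, 1);
""")

rows = conn.execute("""
    SELECT r.Name                    AS Artist,
           COUNT(DISTINCT a.AlbumId) AS TotalAlbum,       -- albums with at least one sold track
           SUM(i.Quantity)           AS TotalTrackSales   -- total quantity of tracks sold
    FROM artists r
    JOIN albums a        ON a.ArtistId = r.ArtistId
    JOIN tracks t        ON t.AlbumId = a.AlbumId
    JOIN invoice_items i ON i.TrackId = t.TrackId
    GROUP BY r.ArtistId
    ORDER BY TotalTrackSales DESC
    LIMIT 10
""").fetchall()
print(rows)
```

COUNT(DISTINCT ...) is needed because joining down to invoice_items repeats each album once per sold invoice line.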

Left outer joins aggregate first

I have the following tables
CREATE TABLE categories(
id SERIAL
);
CREATE TABLE category_translations(
id SERIAL,
name varchar not null,
locale varchar not null,
category_id integer not null
);
CREATE TABLE products(
id SERIAL,
category_id integer not null
);
CREATE TABLE line_items(
id SERIAL,
total_cents integer,
product_id integer not null
);
What I'm trying to do is output each category name together with the sum of total_cents over its associated line_items. Something like:
name | sum_total_cents
Fresh foods | 100000
Dry products | 532000
There is a uniqueness constraint ensuring that only one name per locale is stored for each category, so a category has one row in the category_translations table for each locale.
What I currently have is
SELECT SUM(line_items.total_cents) AS sum_total_cents, ???
FROM line_items INNER JOIN products ON products.id = line_items.product_id
INNER JOIN categories ON categories.id = products.category_id
LEFT OUTER JOIN category_translations ON category_translations.category_id = categories.id
WHERE category_translations.locale ='en'
GROUP BY categories.id
I'm looking for an aggregate function that returns the first name for the category. The only missing piece is what to write in place of the ???, as I keep running into "must appear in the GROUP BY clause or be used in an aggregate function" errors. In pseudo-code, I'm looking for a FIRST() aggregate in PostgreSQL.
Assuming you want one arbitrary name from any locale, you can do:
select
c.id,
(select name from category_translations t
where t.category_id = c.id limit 1) as name,
sum(i.total_cents) as sum_total_cents
from categories c
left join products p on p.category_id = c.id
left join line_items i on i.product_id = p.id
group by c.id, name
Alternatively, if you want the category name for the locale 'en' then you can do:
select
c.id,
(select t.name from category_translations t
where t.category_id = c.id and t.locale ='en') as name,
sum(i.total_cents) as sum_total_cents
from categories c
left join products p on p.category_id = c.id
left join line_items i on i.product_id = p.id
group by c.id, name
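Another way to sidestep the GROUP BY error is to aggregate first and attach the name afterwards. This is a rough sketch of that idea, not the answer's own method: it runs on SQLite through Python's sqlite3 module rather than PostgreSQL, and the sample rows are invented to reproduce the totals from the question.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL tables; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY);
CREATE TABLE category_translations (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, locale TEXT NOT NULL, category_id INTEGER NOT NULL
);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER NOT NULL);
CREATE TABLE line_items (id INTEGER PRIMARY KEY, total_cents INTEGER, product_id INTEGER NOT NULL);
INSERT INTO categories VALUES (1), (2);
INSERT INTO category_translations VALUES
    (1, 'Fresh foods', 'en', 1), (2, 'Produits frais', 'fr', 1), (3, 'Dry products', 'en', 2);
INSERT INTO products VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO line_items VALUES (1, 60000, 1), (2, 40000, 2), (3, 532000, 3);
""")

rows = conn.execute("""
    SELECT t.name, s.sum_total_cents
    FROM (
        -- step 1: one row per category, no names involved yet
        SELECT p.category_id, SUM(i.total_cents) AS sum_total_cents
        FROM line_items i
        JOIN products p ON p.id = i.product_id
        GROUP BY p.category_id
    ) AS s
    -- step 2: attach the 'en' name; the locale filter sits in ON so the join stays an outer join
    LEFT JOIN category_translations t
           ON t.category_id = s.category_id AND t.locale = 'en'
    ORDER BY s.category_id
""").fetchall()
print(rows)
```

Because the name is joined after the aggregation, it never has to appear in a GROUP BY clause or inside an aggregate.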

How can I create a view connecting two tables that shows which entities in one table are not connected to entities from the other table?

I need to create a view of persons and contracts that they have not yet subscribed to. So far I've come up with a nested select to collect the foreign keys in my Subscription table, but I'm stuck with how to use this information to get contracts a person doesn't have.
SELECT s.PersonId as pId, s.ContractID as cId
FROM dbo.Subscription AS s
FULL OUTER JOIN dbo.Person as p ON s.PersonId = p.Id
FULL OUTER JOIN dbo.Contract as c ON s.ContractID = c.Id
WHERE p.Id IN (SELECT PersonId FROM dbo.Subscription)
Pseudocode of what I want to do:
1. Get Persons that have Contracts
2. For each Person, get contract they don't have
3. Display Persons and each missing contract for Person
Schema (edited to remove business info):
CREATE TABLE [dbo].[Contract]
(
[Id] UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
[ContractNumber] NUMERIC(16) NULL
)
CREATE TABLE [dbo].[Person]
(
[Id] UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
Name nvarchar(200) NOT NULL
)
CREATE TABLE [dbo].[Subscription]
(
[Id] UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
[PersonID] UNIQUEIDENTIFIER NOT NULL,
[ContractID] UNIQUEIDENTIFIER NOT NULL,
CONSTRAINT [FK_Subscription_Person] FOREIGN KEY ([PersonID]) REFERENCES [Person]([Id]),
CONSTRAINT [FK_Subscription_Contract] FOREIGN KEY ([ContractID]) REFERENCES [Contract]([Id])
)
Here's a cross join and not exists solution:
SELECT p.Id as pId, c.ID as cId
from dbo.Person as p
cross join dbo.Contract as c
WHERE p.Id IN (SELECT PersonId FROM dbo.Subscription as s1)
and not exists(select 1 from dbo.Subscription as s2 where s2.PersonId = p.Id and s2.ContractID = c.Id)
1. Get Persons that have Contracts: you already did this correctly with WHERE p.Id IN (SELECT PersonId FROM dbo.Subscription as s1).
2. For each Person, get contract they don't have: first we take all combinations with cross join, then we filter out the ones you don't want with not exists.
3. Display Persons and each missing contract for Person: we just select what we want.
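The cross join plus not exists approach can be tried on a tiny data set. The following is only a sketch: it assumes SQLite via Python's sqlite3 module in place of SQL Server, drops the dbo. prefix, and uses short text ids and invented rows instead of UNIQUEIDENTIFIER values.

```python
import sqlite3

# SQLite stand-in for the SQL Server schema; ids and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (Id TEXT PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Contract (Id TEXT PRIMARY KEY, ContractNumber INTEGER);
CREATE TABLE Subscription (
    Id TEXT PRIMARY KEY,
    PersonID TEXT NOT NULL REFERENCES Person (Id),
    ContractID TEXT NOT NULL REFERENCES Contract (Id)
);
INSERT INTO Person VALUES ('p1', 'Ann'), ('p2', 'Bob'), ('p3', 'Cy');
INSERT INTO Contract VALUES ('c1', 1), ('c2', 2);
-- Ann has c1 only, Bob has both contracts, Cy has no subscription at all
INSERT INTO Subscription VALUES ('s1', 'p1', 'c1'), ('s2', 'p2', 'c1'), ('s3', 'p2', 'c2');
""")

missing = conn.execute("""
    SELECT p.Id AS pId, c.Id AS cId
    FROM Person AS p
    CROSS JOIN Contract AS c                              -- every person/contract combination
    WHERE p.Id IN (SELECT PersonID FROM Subscription)     -- only persons with some contract
      AND NOT EXISTS (SELECT 1 FROM Subscription AS s     -- drop combinations that exist
                      WHERE s.PersonID = p.Id AND s.ContractID = c.Id)
    ORDER BY p.Id, c.Id
""").fetchall()
print(missing)
```

Cy is excluded by the IN filter and Bob by NOT EXISTS, which leaves only Ann's missing contract.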
Use a left join:
SELECT p.*,s.*,c.*
FROM dbo.Person as p
left OUTER JOIN dbo.Subscription AS s ON s.PersonId = p.Id
left join Contract c on s.ContractID=c.Id

SQL INNER JOIN entity

I want to execute this query :
-- The most expensive item sold ever
SELECT
c.itemID, c.itemName
FROM
item AS c
JOIN
(SELECT
b.itemID as 'itemid', MAX(b.item_initialPrice) AS 'MaxPrice'
FROM
buyeritem AS a
INNER JOIN
item AS b ON a.item_ID = b.itemID) AS d ON c.itemID = d.itemid
GROUP BY
c.itemID, c.itemName;
My item table looks like this:
create table item
(
itemID int IDENTITY(1000, 1) NOT NULL,
itemName varchar(15) NOT NULL,
Item_desc varchar(255),
Item_initialPrice MONEY,
ItemQty int,
ownerID int NOT NULL,
condition varchar(20) NOT NULL,
PRIMARY KEY (itemID),
FOREIGN KEY (ownerID) REFERENCES seller (sellerID)
);
The problem is that column item.itemID is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. I tried to add a group by clause at the end
group by c.itemID, c.itemName
but I still get the same error? I don't really know where the problem comes from.
I also have this query
-- The most active seller(the one who has offered the most number of items)
SELECT
a.ownerID, b.sellerName
FROM
item AS a
INNER JOIN
seller AS b ON a.ownerID = b.sellerID
GROUP BY
a.ownerID, b.sellerName
ORDER BY
COUNT(a.itemID) DESC;
I want to add itemQty along with the ownerID and sellerName from item table stated above, what would be the best way to achieve that?
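The error comes from the derived table: it selects b.itemID next to MAX(b.item_initialPrice) without a GROUP BY, so add GROUP BY b.itemID inside the subquery. The outer query has no aggregate in its select list, so you can simply write DISTINCT there instead of GROUP BY. The second query further down shows GROUP BY used together with an aggregate.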
SELECT distinct c.itemID, c.itemName
FROM item AS c
JOIN (
SELECT b.itemID as itemid, MAX(b.item_initialPrice) AS MaxPrice FROM buyeritem AS a
INNER JOIN item AS b ON a.item_ID = b.itemID
GROUP BY b.itemID) as d
ON c.itemID = d.itemid ;
For the second query:
Select a.* from
(
SELECT a.ownerID, b.sellerName, count(distinct a.itemID) as item_qty
FROM item AS a
INNER JOIN seller AS b ON a.ownerID = b.sellerID
GROUP BY a.ownerID,b.sellerName
) a
order by item_qty DESC

How to handle an attribute with just two possible values

In my web application I have a company that can be a buyer, a supplier, or both.
So my database tables would look like this:
Company( id_company, ..., is_buyer, is_supplier, ... )
Or :
Company( id_company, ... )
Type_company( id_type_company, type )
Extra_table(id_company, id_type_company )
Or :
Company( id_company, ... )
Type_company( id_company, id_type_company, type )
I would like an explanation (pros and cons) of each option, if possible.
You can consider using a common supertype, like this:
CREATE TABLE companies
(
id int not null primary key,
name varchar(128)
-- other columns
);
CREATE TABLE buyers
(
company_id int not null primary key,
foreign key (company_id) references companies (id)
);
CREATE TABLE suppliers
(
company_id int not null primary key,
foreign key (company_id) references companies (id)
);
Here are some sample queries:
-- Select all buyers
SELECT c.id, c.name
FROM companies c JOIN buyers b
ON c.id = b.company_id;
-- Select all suppliers
SELECT c.id, c.name
FROM companies c JOIN suppliers s
ON c.id = s.company_id;
-- Select companies that are both buyers and suppliers
SELECT c.id, c.name
FROM companies c JOIN buyers b
ON c.id = b.company_id JOIN suppliers s
ON c.id = s.company_id;
-- Select companies that are buyers BUT NOT suppliers
SELECT c.id, c.name
FROM companies c JOIN buyers b
ON c.id = b.company_id LEFT JOIN suppliers s
ON c.id = s.company_id
WHERE s.company_id IS NULL;
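For comparison with the first option in the question (is_buyer and is_supplier columns), the same flags can be derived from the supertype tables with two left joins. This is a small sketch under assumptions: it runs on SQLite through Python's sqlite3 module, and the company names and rows are invented.

```python
import sqlite3

# Same three tables as above, created in an in-memory SQLite database; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INT NOT NULL PRIMARY KEY, name VARCHAR(128));
CREATE TABLE buyers (
    company_id INT NOT NULL PRIMARY KEY,
    FOREIGN KEY (company_id) REFERENCES companies (id)
);
CREATE TABLE suppliers (
    company_id INT NOT NULL PRIMARY KEY,
    FOREIGN KEY (company_id) REFERENCES companies (id)
);
INSERT INTO companies VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
INSERT INTO buyers VALUES (1), (2);
INSERT INTO suppliers VALUES (2), (3);
""")

roles = conn.execute("""
    SELECT c.name,
           b.company_id IS NOT NULL AS is_buyer,     -- 1 when a buyers row exists
           s.company_id IS NOT NULL AS is_supplier   -- 1 when a suppliers row exists
    FROM companies c
    LEFT JOIN buyers b    ON c.id = b.company_id
    LEFT JOIN suppliers s ON c.id = s.company_id
    ORDER BY c.id
""").fetchall()
print(roles)
```

A query like this could back a view, so application code that expects the two flag columns can still be served from the supertype design.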
Here is SQLFiddle demo
Recommended reading:
SQL Antipatterns by Bill Karwin