SQL - alternative - sql

The query below gives me a list of count values:
SELECT books.name AS a,
COUNT(library.staff) AS b
FROM library, books
WHERE library.staff = books.id
GROUP BY books.name;
How do I get the maximum value of the output?

Without window functions or limit:
with data as (
select books.name as a,
COUNT(library.staff) as b
FROM library
JOIN books ON library.staff = books.id
GROUP BY books.name
)
select *
from data
where b = (select max(b) from data);
But maybe you'll now add another requirement that says "no common table expressions":
select b.name as a,
count(l.staff) as b
from library l
join books b on l.staff = b.id
group by b.name
having count(l.staff) = (select max(cnt)
from (
select count(*) cnt
from library l
join books b on l.staff = b.id
group by b.name) t
);
If books.id is a primary (or unique) key, the second statement can be slightly simplified using:
select b.name as a,
count(l.staff) as b
from library l
join books b on l.staff = b.id
group by b.name
having count(l.staff) = (select max(cnt)
from (
select count(*) cnt
from library l
group by l.staff) t
);
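To check the CTE variant against a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration, and the column layout (books.id, books.name, library.staff) is only what the queries above imply.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE library (staff INTEGER REFERENCES books(id));
    INSERT INTO books VALUES (1, 'Dune'), (2, 'Emma'), (3, 'Ulysses');
    INSERT INTO library VALUES (1), (1), (1), (2), (3), (3), (3);
""")

# Count per book, then keep only the rows whose count equals the maximum.
top_rows = conn.execute("""
    WITH data AS (
        SELECT books.name AS a, COUNT(library.staff) AS b
        FROM library
        JOIN books ON library.staff = books.id
        GROUP BY books.name
    )
    SELECT a, b
    FROM data
    WHERE b = (SELECT MAX(b) FROM data)
    ORDER BY a
""").fetchall()

print(top_rows)
```

With this sample data two books share the highest count, and both come back. Comparing against max(b) keeps ties, whereas a row_number() = 1 filter would return exactly one of them.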

Can you use window functions?
SELECT a, b
FROM (
SELECT books.name as a,
count(library.staff) as b,
row_number() OVER (ORDER BY count(library.staff) DESC) as rn
FROM library, books
WHERE library.staff = books.id
GROUP BY books.name
) s
WHERE rn=1;

Related

SQL - Sum of query not true

I have this query:
SELECT l.Name, COALESCE(SUM(A.Count), 0) AS A, COALESCE(SUM(B.Count), 0) AS B
FROM List l
LEFT JOIN A ON A.Name = l.Name
LEFT JOIN B ON B.Name = l.Name
GROUP BY l.Name
ORDER BY l.Name
The query results are incorrect: the sum for Product3 in table A is wrong.
Demo : https://www.db-fiddle.com/f/rdKLkyaeEsi8bPcNPkUnTE/4
You could sum separately for A and B and then combine results:
SELECT Name, MAX(A) AS A, MAX(B) AS B
FROM (
SELECT l.Name, SUM(A.Count) AS A, 0 AS B
FROM List l
LEFT JOIN A ON A.Name = l.Name
GROUP BY l.Name
UNION ALL
SELECT l.Name, 0 AS A, SUM(B.Count) AS B
FROM List l
LEFT JOIN B ON B.Name = l.Name
GROUP BY l.Name) sub
GROUP BY Name
ORDER BY Name;
You should be aggregating the A and B tables in separate subqueries:
SELECT
l.Name,
COALESCE(a.cnt, 0) AS a_cnt,
COALESCE(b.cnt, 0) AS b_cnt
FROM List l
LEFT JOIN
(
SELECT Name, SUM(Count) AS cnt
FROM A
GROUP BY Name
) a
ON l.Name = a.name
LEFT JOIN
(
SELECT Name, SUM(Count) AS cnt
FROM B
GROUP BY Name
) b
ON l.Name = b.name;
The problem with your current approach is that the double join to the A and B tables is likely resulting in double counting. By using separate subqueries we avoid this problem.
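The double counting is easy to reproduce. Below is a minimal sketch with Python's sqlite3 module; the sample rows are invented, and the subquery aliases a_sum and b_sum are my own names, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE List (Name TEXT);
    CREATE TABLE A (Name TEXT, "Count" INTEGER);
    CREATE TABLE B (Name TEXT, "Count" INTEGER);
    INSERT INTO List VALUES ('Product1');
    INSERT INTO A VALUES ('Product1', 10);                  -- one row in A
    INSERT INTO B VALUES ('Product1', 1), ('Product1', 2);  -- two rows in B
""")

# Joining both tables first pairs the single A row with each B row,
# so the A value is summed once per matching B row.
naive = conn.execute("""
    SELECT l.Name,
           COALESCE(SUM(A."Count"), 0),
           COALESCE(SUM(B."Count"), 0)
    FROM List l
    LEFT JOIN A ON A.Name = l.Name
    LEFT JOIN B ON B.Name = l.Name
    GROUP BY l.Name
""").fetchall()

# Aggregating each table in its own subquery leaves at most one row
# per name on each side, so nothing is multiplied.
fixed = conn.execute("""
    SELECT l.Name, COALESCE(a_sum.cnt, 0), COALESCE(b_sum.cnt, 0)
    FROM List l
    LEFT JOIN (SELECT Name, SUM("Count") AS cnt FROM A GROUP BY Name) a_sum
           ON a_sum.Name = l.Name
    LEFT JOIN (SELECT Name, SUM("Count") AS cnt FROM B GROUP BY Name) b_sum
           ON b_sum.Name = l.Name
""").fetchall()

print(naive)
print(fixed)
```

In the naive query the A total comes out as 20 instead of 10, because the one A row appears twice in the joined result.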
In your original question on this subject, I suggested correlated subqueries. These are probably the easiest way to accomplish what you want:
select l.name,
(select sum(a.count)
from a
where a.name = l.name
) as a,
(select sum(b.count)
from b
where b.name = l.name
) as b
from list l;
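You should handle NULL values inside SUM() rather than after it. Note that this changes only the NULL handling; joining both A and B in one query can still multiply rows.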
SELECT l.Name, SUM(COALESCE(A.Count, 0)) AS A, SUM(COALESCE(B.Count, 0)) AS B
FROM List l
LEFT JOIN A ON A.Name = l.Name
LEFT JOIN B ON B.Name = l.Name
GROUP BY l.Name
ORDER BY l.Name

SQL / Redshift Error - Filtering only first occurrence of a value

I am trying to return the first occurrence of a value using the SQL script below:
SELECT a.store_id, b.store_name, c.prod_id FROM
(
select a.store_id as sh, b.store_name, c.prod_id
count(1) ct,row_number() over (partition by store_id order by store_id
desc) row_num
from stores a, store_details b , product c
where a.id=b.store_id and a.prod_id = c.id and b.id = 101
group by a.store_id as sh, b.store_name, c.prod_id,
)t
WHERE row_num = 1;
I get an error
Invalid operation : relation "a" does not exist.
I am using a Redshift DB. Could anyone assist with this? Thanks.
You are selecting from a subquery whose alias is "t". The table aliases a, b and c exist only inside that subquery, so the outer query cannot reference them; it has to go through t instead. (The query also needs a comma after c.prod_id, and GROUP BY cannot contain "as sh" or a trailing comma.)
SELECT t.store_id, t.store_name, t.prod_id
FROM
(
select a.store_id, b.store_name, c.prod_id,
count(1) ct, row_number() over (partition by a.store_id order by a.store_id
desc) row_num
from stores a, store_details b, product c
where a.id = b.store_id and a.prod_id = c.id and b.id = 101
group by a.store_id, b.store_name, c.prod_id
) t
WHERE t.row_num = 1;
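The scoping rule can be seen in isolation with a small sketch in Python's sqlite3 module. The one-row stores table is a stand-in invented for the demo. SQLite words the error differently from Redshift, but the cause is the same: the inner alias is not visible outside the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stores (id INTEGER, store_id INTEGER);
    INSERT INTO stores VALUES (1, 101);
""")

# Referencing the inner alias "a" from the outer query fails,
# because only the subquery alias "t" is in scope there.
try:
    conn.execute(
        "SELECT a.store_id FROM (SELECT a.store_id FROM stores a) t"
    )
    outer_alias_failed = False
except sqlite3.OperationalError as exc:
    outer_alias_failed = True
    print("error:", exc)

# Going through the subquery alias works.
rows = conn.execute(
    "SELECT t.store_id FROM (SELECT a.store_id FROM stores a) t"
).fetchall()
print(rows)
```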

Using SQL how can I write a query to find the top 5 per category per month?

I am trying to get the Top 5 rows with the highest number for each category for a specific time interval such as a month. What I currently have returns 5 of the exact same descriptions for a category. I am trying to get the top five. This only happens when I try to sort it based on a time period.
WITH CustomerRank
AS
(SELECT
Count(*) AS "Count",
d.Item,
d.Description,
Name,
i.Type,
d.CreatedOn
FROM [dbo].i,
d,
dbo.b,
as,
a,
c
WHERE d.Inspection_Id = i.Id AND d.Inspection_Id = i.Id AND
b.Id = i.BuildingPart_Id AND b.as= Assessments.Id
AND as.Application_Id = a.Id AND a.Customer_Id = Customers.Id
group by d.Item, d.Description, Name, i.Type, d.CreatedOn
)
select * from (
SELECT "Count",Item,Description,Type,ROW_NUMBER() Over (PARTITION BY Name order by "Count" desc) AS RowNum, Name, CreatedOn
FROM CustomerRank
where CreatedOn > '2017-1-1 00:00:00'
) s where RowNum <6
Cheers
Try something like this:
WITH CustomerRank
AS
(SELECT
Count(*) AS "Count",
d.Item, d.Description, Name, i.Type
FROM dbo.Inspection i
INNER JOIN dbo.Details d ON d.Inspection_Id = i.Id
INNER JOIN dbo.BuildingParts b ON b.Id = i.BuildingPart_Id
INNER JOIN dbo.Assessments a ON a.Id = b.AssessmentId
INNER JOIN dbo.Applications ap ON ap.Id = a.Application_Id
INNER JOIN dbo.Customers c ON c.Id = ap.Customer_Id
where d.CreatedOn > '2017-1-1 00:00:00'
group by d.Item, d.Description, Name, i.Type
)
select * from (
SELECT "Count",Item,Description,Type,ROW_NUMBER() Over (PARTITION BY Name order by "Count" desc) AS RowNum, Name
FROM CustomerRank
) s where RowNum <6
The idea is to remove the CreatedOn column from the SELECT list and the GROUP BY clause, and to filter on it in the WHERE clause of the CTE instead. If you keep it in the GROUP BY, you get a separate row (with its own count) for each distinct CreatedOn value.
Also, it's better to use explicit JOINs and an alias for each table.
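Here is a small sketch of the GROUP BY effect, again using Python's sqlite3 module. The details table and its columns are a simplified, hypothetical stand-in for the joined tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (name TEXT, item TEXT, created_on TEXT);
    INSERT INTO details VALUES
        ('Acme', 'Roof', '2017-02-01'),
        ('Acme', 'Roof', '2017-02-02'),
        ('Acme', 'Roof', '2017-02-03'),
        ('Acme', 'Door', '2017-02-01');
""")

# Grouping by created_on as well: every distinct date becomes its own
# group, so the same item shows up repeatedly with a count of 1.
with_date = conn.execute("""
    SELECT name, item, COUNT(*)
    FROM details
    WHERE created_on > '2017-01-01'
    GROUP BY name, item, created_on
    ORDER BY item, created_on
""").fetchall()

# Filtering on created_on but not grouping by it: one row per item,
# with the count covering the whole period.
without_date = conn.execute("""
    SELECT name, item, COUNT(*)
    FROM details
    WHERE created_on > '2017-01-01'
    GROUP BY name, item
    ORDER BY COUNT(*) DESC
""").fetchall()

print(with_date)
print(without_date)
```

The first result lists 'Roof' three times with a count of 1 each, which matches the repeated descriptions in the question. The second gives one row per item, ready to be ranked.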

SQL MAX of SUM without sub-querying

I'm trying to execute a SQL query which requires grouping by MAX of SUM calculation (in PostgreSQL).
I found some solutions here that use sub-queries, but I need a solution without them (if that's possible).
Query:
SELECT "Festival".title,
"Musician".aname,
SUM("Musician".salary * "Musician".percentage / 100) AS "agent_total_profit"
FROM "Festival"
INNER JOIN "Booked"
ON "Booked".title = "Festival".title
INNER JOIN "Musician"
ON "Musician".id = "Booked".id
GROUP BY "Festival".title,
"Musician".aname
ORDER BY "Festival".title
The result is not what I expected: for each festival title, I want to find the musician aname with the maximum agent_total_profit.
Thanks in advance.
Use DISTINCT ON:
SELECT DISTINCT ON (f.title) f.title, m.aname,
SUM(m.salary * m.percentage / 100) AS "agent_total_profit"
FROM "Festival" f INNER JOIN
"Booked" b
ON b.title = f.title INNER JOIN
"Musician" m
ON m.id = b.id
GROUP BY f.title, m.aname
ORDER BY f.title, "agent_total_profit" DESC;
The more traditional SQL method uses row_number():
SELECT f.*
FROM (SELECT f.title, m.aname,
SUM(m.salary * m.percentage / 100) AS "agent_total_profit",
ROW_NUMBER() OVER (PARTITION BY f.title ORDER BY SUM(m.salary * m.percentage / 100) DESC) as seqnum
FROM "Festival" f INNER JOIN
"Booked" b
ON b.title = f.title INNER JOIN
"Musician" m
ON m.id = b.id
GROUP BY f.title, m.aname
) f
WHERE seqnum = 1
ORDER BY f.title, "agent_total_profit" DESC;

Problems with SQL Inner join

Having some problems while trying to optimize my SQL.
I got 2 tables like this:
Names
id, analyseid, name
Analyses
id, date, analyseid.
I want to get the newest analysis from Analyses (ordered by date) for every name (they are unique) in Names. I can't really see how to do this without using two nested selects.
My attempt (don't get confused by the names; it's the same principle):
SELECT
B.id,
B.chosendatetime,
vStockNames.name
FROM
vStockNames
INNER JOIN
(
SELECT TOP 1
vAnalysesHistory.id,
vAnalysesHistory.chosendatetime,
vAnalysesHistory.companyid
FROM
vAnalysesHistory
ORDER BY
vAnalysesHistory.chosendatetime DESC
) AS B
ON
B.companyid = vStockNames.stockid
In my example the problem is that I only get 1 row returned (because of TOP 1). But if I remove it, I can get multiple analyses for the same name.
Can you help me? Thanks in advance.
SQL Server 2000+:
SELECT (SELECT TOP 1
a.id
FROM vAnalysesHistory AS a
WHERE a.companyid = n.stockid
ORDER BY a.chosendatetime DESC) AS id,
n.name,
(SELECT TOP 1
a.chosendatetime
FROM vAnalysesHistory AS a
WHERE a.companyid = n.stockid
ORDER BY a.chosendatetime DESC) AS chosendatetime
FROM vStockNames AS n
SQL Server 2005+, using CTE:
WITH cte AS (
SELECT a.id,
a.date,
a.analyseid,
ROW_NUMBER() OVER(PARTITION BY a.analyseid
ORDER BY a.date DESC) AS rk
FROM ANALYSES a)
SELECT n.id,
n.name,
c.date
FROM NAMES n
JOIN cte c ON c.analyseid = n.analyseid
AND c.rk = 1
...without CTE:
SELECT n.id,
n.name,
c.date
FROM NAMES n
JOIN (SELECT a.id,
a.date,
a.analyseid,
ROW_NUMBER() OVER(PARTITION BY a.analyseid
ORDER BY a.date DESC) AS rk
FROM ANALYSES a) c ON c.analyseid = n.analyseid
AND c.rk = 1
You're only asking for the TOP 1, so that's all you're getting. If you want one row per companyid, the subquery on vAnalysesHistory has to be evaluated once per stock. A derived table in an ordinary JOIN cannot reference columns of the outer table, so it cannot do this. Fortunately, CROSS APPLY (SQL Server 2005+) comes to the rescue in cases like this.
SELECT
B.id,
B.chosendatetime,
vStockNames.name
FROM
vStockNames
CROSS APPLY
(
SELECT TOP 1
vAnalysesHistory.id,
vAnalysesHistory.chosendatetime,
vAnalysesHistory.companyid
FROM
vAnalysesHistory
WHERE companyid = vStockNames.stockid
ORDER BY
vAnalysesHistory.chosendatetime DESC
) AS B
You could also use ROW_NUMBER() to do the same:
SELECT
B.id,
B.chosendatetime,
vStockNames.name
FROM
vStockNames
INNER JOIN
(
SELECT
vAnalysesHistory.id,
vAnalysesHistory.chosendatetime,
vAnalysesHistory.companyid,
ROW_NUMBER() OVER (PARTITION BY companyid ORDER BY chosendatetime DESC) AS row
FROM
vAnalysesHistory
) AS B
ON
B.companyid = vStockNames.stockid AND b.row = 1
Personally I'm a fan of the first approach. It will likely be faster and is easier to read IMO.
Will something like this work for you?
;with RankedAnalysesHistory as
(
SELECT
vah.id,
vah.chosendatetime,
vah.companyid
,rank() over (partition by vah.companyid order by vah.chosendatetime desc) rnk
FROM
vAnalysesHistory vah
)
SELECT
rah.id,
rah.chosendatetime,
vsn.name
FROM
vStockNames vsn
join RankedAnalysesHistory as rah on rah.companyid = vsn.stockid and rah.rnk = 1
It seems to me that you only need SQL-92 for this. Of course, explicit documentation of the joining columns between the tables would help.
Simple names
SELECT B.ID, C.ChosenDate, N.Name
FROM (SELECT A.AnalyseID, MAX(A.Date) AS ChosenDate
FROM Analyses AS A
GROUP BY A.AnalyseID) AS C
JOIN Analyses AS B ON C.AnalyseID = B.AnalyseID AND C.ChosenDate = B.Date
JOIN Names AS N ON N.AnalyseID = C.AnalyseID
The sub-select finds the date of the latest analysis for each company; the join with Analyses picks up the Analyses.ID value corresponding to that latest analysis, and the join with Names picks up the company name. (The C.ChosenDate in the select-list could be replaced by B.Date AS ChosenDate, of course.)
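The "simple names" query can be tried out with a minimal sketch in Python's sqlite3 module. The rows are invented, and the column layout is only what the question describes (Names: id, analyseid, name; Analyses: id, date, analyseid). The ORDER BY is added just to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Names (ID INTEGER, AnalyseID INTEGER, Name TEXT);
    CREATE TABLE Analyses (ID INTEGER, Date TEXT, AnalyseID INTEGER);
    INSERT INTO Names VALUES (1, 100, 'Alpha'), (2, 200, 'Beta');
    INSERT INTO Analyses VALUES
        (10, '2010-01-01', 100),
        (11, '2010-03-01', 100),
        (12, '2010-02-01', 200);
""")

# C holds the latest date per AnalyseID; joining back to Analyses (B)
# recovers the ID of the row that carries that date.
rows = conn.execute("""
    SELECT B.ID, C.ChosenDate, N.Name
    FROM (SELECT A.AnalyseID, MAX(A.Date) AS ChosenDate
          FROM Analyses AS A
          GROUP BY A.AnalyseID) AS C
    JOIN Analyses AS B ON C.AnalyseID = B.AnalyseID AND C.ChosenDate = B.Date
    JOIN Names AS N ON N.AnalyseID = C.AnalyseID
    ORDER BY N.Name
""").fetchall()

print(rows)
```

One thing to keep in mind: if two analyses for the same AnalyseID share the latest date, this approach returns both rows, whereas TOP 1 or ROW_NUMBER() would keep only one.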
Complicated names
SELECT B.ID, C.ChosenDateTime, N.Name
FROM (SELECT A.CompanyID, MAX(A.ChosenDateTime) AS ChosenDateTime
FROM vAnalysesHistory AS A
GROUP BY A.CompanyID) AS C
JOIN vAnalysesHistory AS B ON C.CompanyID = B.CompanyID
AND C.ChosenDateTime = B.ChosenDateTime
JOIN vStockNames AS N ON N.StockID = C.CompanyID
Same query with systematic renaming (and slightly different layout to avoid horizontal scrollbars).