Multiple grouping by

Let's say I have a table with business listings.
I'd like my result set to show in this order, but with one query:
First: Show the 5 most recently created listings in order of created_date.
Then: Show the rest of them but in random order.
My fields:
[BusinessName]
[Date_Created]
So if I had 100 businesses in the table, I want the list to show the 5 most recently created ones, and then show the rest but in random order.
Thank you in advance for your help!

Option A - Separating top 5 and the rest into two sub-queries and selecting them with UNION
WITH CTE_TOP5 AS
(
SELECT TOP 5 BusinessName, Date_Created, ROW_NUMBER() OVER (ORDER BY DATE_CREATED DESC) RN FROM dbo.YourTable
ORDER BY Date_Created DESC
)
, CTE_REST AS
(
SELECT BusinessName, Date_Created FROM dbo.YourTable
EXCEPT
SELECT BusinessName, Date_Created FROM CTE_TOP5
)
,CTE_RESTRANDOM AS
(
SELECT BusinessName, Date_Created, ROW_NUMBER() OVER (ORDER BY NEWID()) + 5 RN FROM CTE_REST
)
SELECT * FROM CTE_TOP5
UNION ALL
SELECT * FROM CTE_RESTRANDOM
ORDER BY RN
Option B - CASE in ORDER BY
;WITH CTE_TOP5 AS
(
SELECT TOP 5 *, ROW_NUMBER() OVER (ORDER BY DATE_CREATED DESC) RN FROM dbo.YourTable
ORDER BY Date_Created DESC
)
SELECT yt.*
FROM dbo.YourTable yt
LEFT JOIN CTE_TOP5 t5 ON yt.BusinessName = t5.BusinessName
AND yt.Date_Created = t5.Date_Created
ORDER BY CASE WHEN t5.RN IS NOT NULL THEN t5.RN ELSE 6 END, NEWID()
Option C - Similar to B, but without a CTE, ROW_NUMBER, or joins; the whole logic goes in ORDER BY
SELECT *
FROM dbo.YourTable yt
ORDER BY CASE WHEN yt.Date_Created IN (SELECT TOP 5 yt2.Date_Created
FROM dbo.YourTable yt2
ORDER BY yt2.Date_Created DESC)
THEN yt.Date_Created
ELSE '1900-01-01'
END DESC, NEWID()
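For anyone who wants to try the idea behind Option B without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the answer's exact query: SQLite stands in for SQL Server, RANDOM() replaces NEWID(), and the `listings` table, its sample rows, and the helper `newest_then_random` are made up for illustration. It assumes the bundled SQLite is 3.25 or newer (needed for window functions).

```python
import sqlite3

def newest_then_random(conn, top_n=5):
    """Return (name, created) rows: the top_n newest first, the rest shuffled."""
    sql = """
        WITH ranked AS (
            SELECT BusinessName, Date_Created,
                   ROW_NUMBER() OVER (ORDER BY Date_Created DESC) AS rn
            FROM listings
        )
        SELECT BusinessName, Date_Created
        FROM ranked
        -- The newest rows keep their rank; every other row gets the same
        -- sort key, so the second key (RANDOM()) decides their order.
        ORDER BY CASE WHEN rn <= :n THEN rn ELSE :n + 1 END,
                 RANDOM()
    """
    return conn.execute(sql, {"n": top_n}).fetchall()

# Hypothetical sample data: 20 listings created on consecutive days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (BusinessName TEXT, Date_Created TEXT)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?)",
    [(f"Biz{i:02d}", f"2024-01-{i:02d}") for i in range(1, 21)],
)

rows = newest_then_random(conn)
for name, created in rows:
    print(name, created)
```

The first five printed rows should always be the newest listings in date order, while the remaining fifteen come out in a different order on each run.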

Related

Select every second record then determine earliest date

I have a table that looks like the following
I have to select every second record per PatientID that would give the following result (my last query returns this result)
I then have to select the record with the oldest date which would be the following (this is the end result I want)
What I have done so far: I have a CTE that gets all the data I need
WITH cte
AS
(
SELECT visit.PatientTreatmentVisitID, mat.PatientMatchID,pat.PatientID,visit.RegimenDate AS VisitDate,
ROW_NUMBER() OVER(PARTITION BY mat.PatientMatchID, pat.PatientID ORDER BY visit.VisitDate ASC) AS RowNumber
FROM tblPatient pat INNER JOIN tblPatientMatch mat ON mat.PatientID = pat.PatientID
LEFT JOIN tblPatientTreatmentVisit visit ON visit.PatientID = pat.PatientID
)
I then write a query against the CTE but so far I can only return the second row for each patientID
SELECT *
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate, RowNumber FROM cte
) as X
WHERE RowNumber = 2
How do I return the record with the oldest date only? Is there perhaps a MIN() function that I could be including somewhere?

If I follow you correctly, you can just order your existing resultset and retain the top row only.
In standard SQL, you would write this using a FETCH clause:
SELECT *
FROM (
SELECT
visit.PatientTreatmentVisitID,
mat.PatientMatchID,
pat.PatientID,
visit.RegimenDate AS VisitDate,
ROW_NUMBER() OVER(PARTITION BY mat.PatientMatchID, pat.PatientID ORDER BY visit.VisitDate ASC) AS rn
FROM tblPatient pat
INNER JOIN tblPatientMatch mat ON mat.PatientID = pat.PatientID
LEFT JOIN tblPatientTreatmentVisit visit ON visit.PatientID = pat.PatientID
) t
WHERE rn = 2
ORDER BY VisitDate
OFFSET 0 ROWS FETCH FIRST 1 ROW ONLY
This syntax is supported in Postgres, Oracle, SQL Server (and possibly other databases).
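To see the "second row per patient, then the earliest of those" logic on concrete data, here is a small sketch run through Python's sqlite3 module. The `visits` table and its rows are invented for illustration, and SQLite's LIMIT 1 stands in for FETCH FIRST 1 ROW ONLY (or TOP 1 in SQL Server). It assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (PatientID INTEGER, VisitDate TEXT)")
# Hypothetical sample data.
conn.executemany("INSERT INTO visits VALUES (?, ?)", [
    (1, "2020-01-10"), (1, "2020-03-05"), (1, "2020-06-01"),
    (2, "2019-12-01"), (2, "2020-02-14"),
    (3, "2020-01-01"),  # a single visit, so this patient has no second row
])

sql = """
    SELECT PatientID, VisitDate
    FROM (
        SELECT PatientID, VisitDate,
               ROW_NUMBER() OVER (PARTITION BY PatientID
                                  ORDER BY VisitDate) AS rn
        FROM visits
    ) AS t
    WHERE rn = 2          -- keep each patient's second visit
    ORDER BY VisitDate    -- oldest of those second visits first
    LIMIT 1
"""
second_visits_earliest = conn.execute(sql).fetchone()
print(second_visits_earliest)
```

With this data the second visits are 2020-03-05 (patient 1) and 2020-02-14 (patient 2), so the query returns patient 2's row.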

If you need to get the oldest date among all selected rows (every second row for each patient ID), you can try the window function MIN:
SELECT * FROM
(
SELECT *, MIN(VisitDate) OVER (Order By VisitDate) MinDate
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate,
RowNumber FROM cte
) as X
WHERE RowNumber = 2
) Y
WHERE VisitDate=MinDate
Or you can use SELECT TOP, which limits the number of rows returned in a query result set:
SELECT TOP 1 PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate FROM
(
SELECT *
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate,
RowNumber FROM cte
) as X
WHERE RowNumber = 2
) Y
ORDER BY VisitDate

For simplicity, order ascending on the date column and use TOP to get the first (oldest) row only:
SELECT TOP 1 *
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate, RowNumber FROM cte
) as X
WHERE RowNumber = 2
ORDER BY VisitDate ASC

SQL select row with max value or distinct value and sum all

I have the following data that is returned to me. I need to get a distinct or max sum of all the commission by taxid for a single repnbr. The 'qtrlycommrep' column is the value I'm trying to get to, but not able to. For repnbr c590, I need to get the 854.66 commission amount, which is the max for each taxid.
What am I doing wrong?
Any help would be much appreciated!
Here's what I've tried so far. Using ROW_NUMBER:
select distinct
sub.Repnbr
, (sub.QtrLYComm) as qtrlycommrep
from (
select distinct repnbr, QtrLYComm
, rn = row_number() over(partition by repnbr order by QtrLYComm desc)
from #qtrly
) sub
where sub.rn = 1
Using CROSS APPLY:
select distinct
#qtrly.repnbr
, x.QtrLYComm as qtrlycommrep
from #qtrly
cross apply (
select top 1
*
from #qtrly as i
where i.repnbr = Repnbr
order by i.qtrlycomm desc
) as x;
Using INNER JOIN:
select
#qtrly.repnbr, #qtrly.qtrlycomm as qtrlycommrep
from #qtrly
inner join (
select maxvalue = max(qtrlycomm), repnbr
from #qtrly
group by repnbr
) as m
on #qtrly.repnbr = m.repnbr
and #qtrly.qtrlycomm = m.maxvalue;
Using ORDER BY ROW_NUMBER:
select top 1 with ties
#qtrly.repnbr, #qtrly.qtrlycomm as qtrlycommrep
from #qtrly
order by
row_number() over(partition by repnbr
order by qtrlycomm desc)

You want one value per tax id. You need to include that. For instance:
select q.Repnbr, sum(q.QtrLYComm) as qtrlycommrep
from (select q.*,
row_number() over(partition by repnbr, taxid order by QtrLYComm desc) as seqnum
from #qtrly q
) q
where seqnum = 1
group by q.Repnbr;
However, I would be inclined to use two levels of aggregation:
select q.Repnbr, sum(q.QtrLYComm) as qtrlycommrep
from (select distinct repnbr, taxid, QtrLYComm
from #qtrly q
) q
group by q.Repnbr;
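As a rough illustration of the "one value per tax id, then sum per rep" idea, here is a sketch run through Python's sqlite3 module. It is a variant rather than either query above: the inner level uses MAX ... GROUP BY to pick one commission per (repnbr, taxid), and the outer level sums those per rep. The `qtrly` table and its amounts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE qtrly (repnbr TEXT, taxid TEXT, qtrlycomm INTEGER)")
# Hypothetical sample data: repeated and smaller amounts per tax id.
conn.executemany("INSERT INTO qtrly VALUES (?, ?, ?)", [
    ("c590", "T1", 500), ("c590", "T1", 500), ("c590", "T1", 300),
    ("c590", "T2", 350), ("c590", "T2", 100),
    ("c600", "T3", 200),
])

sql = """
    SELECT repnbr, SUM(max_comm) AS qtrlycommrep
    FROM (
        -- level 1: one (maximum) commission per rep and tax id
        SELECT repnbr, taxid, MAX(qtrlycomm) AS max_comm
        FROM qtrly
        GROUP BY repnbr, taxid
    ) AS per_taxid
    -- level 2: add those per-taxid values up for each rep
    GROUP BY repnbr
    ORDER BY repnbr
"""
totals = conn.execute(sql).fetchall()
print(totals)
```

Here c590 gets 500 (T1) plus 350 (T2), and c600 gets its single 200.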

SQL Server Partition Order - No tie DenseRank values even if rows are same

This question is best explained with an image and the script I have currently... How can I extract a FULL one row per assignment, with the lowest rank, and if there are 2 rows with a denserank as 1, then choose either of them?...
select *
,Dense_RANK() over (partition by [Assignment] order by [Text] desc) as
[DenseRank]
from [dbo].[CLEANSED_T3B_Step1_Res_Withdups____CP]
select * from
(
select *
,Dense_RANK() over (partition by [Assignment] order by [Text] desc, NewID()
) as [DenseRank] from [dbo].[CLEANSED_T3B_Step1_Res_Withdups____CP]
) as A
where A.[DenseRank] = 1
Second script is working perfectly!
SELECT * INTO
[dbo].[CLEANSED_T3B_Step1_COMPLETED]
from
(
select *
,Dense_RANK() over (partition by [Assignment] order by
left([Text],1) desc , [Diff_Doc_Clearing_Date] desc , [Amount] asc)
as [DenseRank]
from [dbo].[CLEANSED_T3B_Step1_Res_Withdups____CP]
)
as A
where A.[DenseRank] = 1
I no longer need just a random pick among the rows tied for first place; I now need the one with the highest day diff, and then the highest amount after that. So I have adapted everything in this version 3.

It seems you don't want to use DENSE_RANK but ROW_NUMBER.
with cte as(
select t.*, rn = row_number() over(partition by assignment order by [text] desc)
from tablename t
)
select * from cte
where rn = 1
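The difference on tied rows is easy to see with a tiny experiment. This is only a sketch run through Python's sqlite3 module; the `docs` table, its columns, and the helper `count_rank1` are made up for illustration, and it assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (assignment TEXT, txt TEXT)")
# Hypothetical sample data: assignment A1 has two rows tied on txt = 'X'.
conn.executemany("INSERT INTO docs VALUES (?, ?)", [
    ("A1", "X"), ("A1", "X"), ("A1", "B"),
    ("A2", "Z"),
])

def count_rank1(func):
    """Count the rows that get rank 1 per assignment under the given function."""
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT assignment,
                   {func}() OVER (PARTITION BY assignment
                                  ORDER BY txt DESC) AS rnk
            FROM docs
        ) AS a
        WHERE rnk = 1
    """
    return conn.execute(sql).fetchone()[0]

dense_rows = count_rank1("DENSE_RANK")  # tied rows share rank 1
row_rows = count_rank1("ROW_NUMBER")    # exactly one row per partition
print(dense_rows, row_rows)
```

DENSE_RANK keeps both tied A1 rows (three rows in total), while ROW_NUMBER keeps exactly one row per assignment (two rows). Which of the tied rows ROW_NUMBER picks is arbitrary unless the ORDER BY includes a tie-breaker.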

Order by 'newid()' as the 'tie-breaker'
Order by [Text],Newid()

How to select both row_number and count over partition?

I need to find duplicate record (with master record id and duplicate record ids):
select ciid, name from (
select ciid, name, row_number() over (
partition by related_id, name order by updatedate desc) rn
) where rn = 1;
This gives me the master record IDs, but it also includes records without duplicates.
If I use
select ciid, name from (
select ciid, name, row_number() over (
partition by related_id, name order by updatedate desc) rn
) where rn > 1;
This gets me all the duplicate records, but not the master record.
I was wishing if I do something like:
select ciid, name from (
select ciid, name, row_number() over (
partition by related_id, name order by updatedate desc
) rn, count(*) over (
partition by related_id, name order by updatedate desc
) cnt
) where rn = 1 and cnt > 1;
But I was worried about the performance, or even is it actually doing what I want.
How do I get the master record only for the ones with duplicates? Please note that name is not unique column. Only ciid is unique.

I ended up using similar query in my question:
select ciid, name from (
select ciid, name, row_number() over (
partition by related_id, name order by updatedate desc
) rn, count(*) over (
partition by related_id, name
) cnt
from yourtable -- placeholder table name; the original query omitted the FROM clause
) where rn = 1 and cnt > 1;
Works surprisingly well. The master record is where rn = 1 and duplicates are where rn > 1. Make sure count(*) over (partition by ...) has no ORDER BY clause: with one, it becomes a running count rather than the size of the whole partition.
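The effect of ORDER BY inside the COUNT window can be checked with a small experiment. This is a sketch run through Python's sqlite3 module; the `items` table and its rows are invented, and it assumes SQLite 3.25 or newer. The same default window frame applies in other databases that follow the SQL standard here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (ciid INTEGER, related_id INTEGER, name TEXT, updatedate TEXT)"
)
# Hypothetical sample data: three rows share (10, 'a'); one row is unique.
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
    (1, 10, "a", "2021-01-01"),
    (2, 10, "a", "2021-02-01"),
    (3, 10, "a", "2021-03-01"),
    (4, 20, "b", "2021-01-15"),
])

sql = """
    SELECT ciid, rn, cnt_all, cnt_running
    FROM (
        SELECT ciid,
               ROW_NUMBER() OVER (PARTITION BY related_id, name
                                  ORDER BY updatedate DESC) AS rn,
               -- no ORDER BY: counts the whole partition
               COUNT(*) OVER (PARTITION BY related_id, name) AS cnt_all,
               -- with ORDER BY: counts rows up to the current one
               COUNT(*) OVER (PARTITION BY related_id, name
                              ORDER BY updatedate DESC) AS cnt_running
        FROM items
    ) AS t
    WHERE rn = 1
    ORDER BY ciid
"""
rows = conn.execute(sql).fetchall()
masters = [ciid for ciid, rn, cnt_all, cnt_running in rows if cnt_all > 1]
print(rows, masters)
```

For the rn = 1 rows the running count is 1 in both groups, so filtering on it with cnt > 1 would return nothing; the count without ORDER BY correctly identifies ciid 3 as the only master record that has duplicates.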

I haven't tested this (because I don't have real data and am too lazy to create some), but it seems something along these lines might work:
with has_duplicates as (
select related_id, name
from yourtable
group by related_id, name
having count (*) > 1
),
with_dupes as (
select
y.ciid, y.name,
row_number() over (partition by y.related_id, y.name order by y.updatedate desc) rn
from
yourtable y,
has_duplicates d
where
y.related_id = d.related_id and
y.name = d.name
)
select
ciid, name
from with_dupes
where rn = 1

select ciid, name
from (
select ciid, name,
dense_rank() over (partition by related_id, name order by updatedate desc) rn
from tablename) t
group by ciid,name
having count(distinct rn) > 1;
Edit: To find duplicates, why not just do this.
select x.ciid, x.name, x.updatedate
from tablename x join
(
select name, related_id, max(updatedate) as mxdt, count(*) as cnt
from tablename
group by name, related_id
having count(*) > 1
) t
on x.updatedate = t.mxdt and x.name = t.name and x.related_id = t.related_id
You can use GROUP BY with HAVING to keep only the name/related_id groups that contain more than one row.

SQL Query, SELECT Top 2 by Foreign Key Order By Date

I need a SQL query that returns the top 2 Plans by PlanDate per ClientID. This is all on one table where PlanID is the PrimaryID, ClientID is a foreignID.
This is what I have so far:
SELECT *
FROM [dbo].[tblPlan]
WHERE [PlanID] IN (SELECT TOP (2) PlanID FROM [dbo].[tblPlan] ORDER BY [PlanDate] DESC)
This, obviously, only returns 2 records where I actually need up to 2 records per ClientID.

This can be done using ROW_NUMBER:
SELECT PlanId, ClientId, PlanDate FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY ClientId ORDER BY PlanDate DESC) rn, *
FROM [dbo].[tblPlan]
) AS T1
WHERE rn <=2
Add any other columns you need to the select to get those too.
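Here is the same ROW_NUMBER pattern on a few sample rows, as a sketch run through Python's sqlite3 module. The `plans` table and its data are invented for illustration, and it assumes SQLite 3.25 or newer; the query shape is otherwise the same as the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plans (PlanID INTEGER, ClientID INTEGER, PlanDate TEXT)")
# Hypothetical sample data: client 1 has three plans, client 2 one, client 3 two.
conn.executemany("INSERT INTO plans VALUES (?, ?, ?)", [
    (1, 1, "2011-01-01"), (2, 1, "2011-02-01"), (3, 1, "2011-03-01"),
    (4, 2, "2011-01-05"),
    (5, 3, "2011-04-01"), (6, 3, "2011-05-01"),
])

sql = """
    SELECT PlanID, ClientID
    FROM (
        SELECT PlanID, ClientID, PlanDate,
               ROW_NUMBER() OVER (PARTITION BY ClientID
                                  ORDER BY PlanDate DESC) AS rn
        FROM plans
    ) AS t
    WHERE rn <= 2                     -- up to two newest plans per client
    ORDER BY ClientID, PlanDate DESC
"""
top2 = conn.execute(sql).fetchall()
print(top2)
```

Client 1 loses its oldest plan (PlanID 1), while clients with two or fewer plans keep all of theirs.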

Edit, Dec 2011. Corrected CROSS APPLY solution
Try both to see what is best
SELECT *
FROM
( -- distinct ClientID values
SELECT DISTINCT ClientID
FROM [dbo].[tblPlan]
) P1
CROSS APPLY
( -- top 2 per ClientID
SELECT TOP (2) P2.PlanID
FROM [dbo].[tblPlan] P2
WHERE P1.ClientID = P2.ClientID
ORDER BY P2.[PlanDate] DESC
) foo
Or
;WITH cTE AS (
SELECT
*,
ROW_NUMBER () OVER (PARTITION BY clientid ORDER BY [PlanDate] DESC) AS Ranking
FROM
[dbo].[tblPlan]
)
SELECT * FROM cTE WHERE Ranking <= 2
