I've searched several other SO questions but I still can't seem to get the right result. I would like to get all records from the Document table based on CaseId and having the most recent Status. I have two tables:
Document:
DocumentId | CaseId | Name
----------------------------------------
2 | 23 | Document 1
3 | 23 | Document 2
4 | 24 | Document 3
AuditLog:
AuditLogId | Status | DocumentId | Date Created
---------------------------------------------------------
10 | Active | 2 | 4/2/2017
11 | Draft | 2 | 4/1/2017
12 | Released | 2 | 4/3/2017
13 | Draft | 3 | 4/17/2017
14 | Draft | 4 | 4/17/2017
So the desired result for CaseId: 23 would be:
Status | DocumentId | CaseId | Name
----------------------------------------------
Released | 2 | 23 | Document 1
Draft | 3 | 23 | Document 2
I have got close with this query, however this only gives me the most recent of all results for CaseId 23, rather than grouping by DocumentId:
Select s.Status, lh.* from LegalHold lh join(
Select Status, LegalHoldId
FROM LegalHoldAuditLog
WHERE DateCreated = (select max(DateCreated)
from LegalHoldAuditLog)) s on lh.LegalHoldId = s.LegalHoldId
WHERE lh.CaseId = 23
Using cross apply() to get the latest Status for each DocumentId:
select d.*, al.Status
from Document d
cross apply (
select top 1 i.Status
from AuditLog i
where i.DocumentId = d.DocumentId
order by i.date_created desc
) as al
where d.CaseId = 23
top with ties version using row_number():
select top 1 with ties d.*, al.Status
from Document d
inner join AuditLog al
on d.DocumentId = al.DocumentId
where d.CaseId = 23
order by row_number() over (partition by al.DocumentId order by al.date_created desc)
I think this would be at least one way to do it.
All you need is the value of Status from the latest row, according to the Date Created column in the AuditLog table.
SELECT (SELECT TOP 1 Status
FROM AuditLog
WHERE AuditLog.DocumentId = Document.DocumentId
ORDER BY [Date Created] DESC) AS Status,
DocumentId, CaseId, Name
FROM Document
WHERE CaseId = 23
And shouldn't the value for Document 2 be "Draft"?
You can use a Common Table Expression with ROW_NUMBER in order to prioritize records of AuditLog table. Then join to Document table to get expected result:
;WITH AuditLog_Rn AS (
SELECT Status, DocumentId,
ROW_NUMBER() OVER (PARTITION BY DocumentId
ORDER BY [Date Created] DESC) AS rn
FROM AuditLog
)
SELECT d.DocumentId, d.CaseId, d.Name, al.Status
FROM Document AS d
JOIN AuditLog_Rn AS al ON d.DocumentId = al.DocumentId AND al.rn = 1
WHERE d.CaseId = 23
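To check the ROW_NUMBER approach against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database from Python. It assumes SQLite 3.25+ (for window functions), stores the dates as ISO strings so that they sort chronologically, and names the column DateCreated without a space; those choices are assumptions of this sketch, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Document (DocumentId INTEGER, CaseId INTEGER, Name TEXT);
CREATE TABLE AuditLog (AuditLogId INTEGER, Status TEXT, DocumentId INTEGER, DateCreated TEXT);
INSERT INTO Document VALUES
  (2, 23, 'Document 1'), (3, 23, 'Document 2'), (4, 24, 'Document 3');
-- Dates rewritten as ISO strings (assumption) so text ordering matches date ordering.
INSERT INTO AuditLog VALUES
  (10, 'Active',   2, '2017-04-02'),
  (11, 'Draft',    2, '2017-04-01'),
  (12, 'Released', 2, '2017-04-03'),
  (13, 'Draft',    3, '2017-04-17'),
  (14, 'Draft',    4, '2017-04-17');
""")

latest_status_sql = """
WITH AuditLog_Rn AS (
  SELECT Status, DocumentId,
         ROW_NUMBER() OVER (PARTITION BY DocumentId
                            ORDER BY DateCreated DESC) AS rn
  FROM AuditLog
)
SELECT al.Status, d.DocumentId, d.CaseId, d.Name
FROM Document AS d
JOIN AuditLog_Rn AS al ON d.DocumentId = al.DocumentId AND al.rn = 1
WHERE d.CaseId = ?
ORDER BY d.DocumentId
"""

rows = conn.execute(latest_status_sql, (23,)).fetchall()
for row in rows:
    print(row)
```

Each document in case 23 comes back once, paired with the status of its newest audit row.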
Related
We have 2 tables, bookings and docs
bookings
booking_id | name
100 | "Val1"
101 | "Val5"
102 | "Val6"
docs
doc_id | booking_id | doc_type_id
6 | 100 | 1
7 | 100 | 2
8 | 101 | 1
9 | 101 | 2
10 | 101 | 2
We need the result like this:
booking_id | doc_id
100 | 7
101 | 10
Essentially, we are trying to get the latest record of doc per booking, but if doc_type_id 2 is present, select the latest record of doc type 2 else select latest record of doc_type_id 1.
Is it possible to achieve this with a performance-friendly query? We need to use it inside a much larger query.
You can do it with the FIRST_VALUE() window function by sorting the rows of each booking_id so that the rows with doc_type_id = 2 come first, newest doc_id first:
SELECT DISTINCT booking_id,
FIRST_VALUE(doc_id) OVER (PARTITION BY booking_id ORDER BY doc_type_id = 2 DESC, doc_id DESC) AS doc_id
FROM docs;
If you want full rows returned then you could use ROW_NUMBER() window function:
SELECT booking_id, doc_id, doc_type_id
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY booking_id ORDER BY doc_type_id = 2 DESC, doc_id DESC) rn
FROM docs
) t
WHERE rn = 1;
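As a quick check of the ROW_NUMBER variant, the sketch below runs it in an in-memory SQLite database from Python (SQLite 3.25+ assumed). Two things here are not from the question: the rows for booking 102, which are hypothetical and only there to exercise the fallback to doc_type_id 1, and the CASE expression, which replaces `doc_type_id = 2 DESC` so the same ORDER BY also works in dialects that do not accept a boolean expression there (SQL Server, for example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (doc_id INTEGER, booking_id INTEGER, doc_type_id INTEGER);
INSERT INTO docs VALUES
  (6, 100, 1), (7, 100, 2),
  (8, 101, 1), (9, 101, 2), (10, 101, 2),
  -- Hypothetical booking with only type-1 docs, to show the fallback.
  (11, 102, 1), (12, 102, 1);
""")

preferred_doc_sql = """
SELECT booking_id, doc_id
FROM (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY booking_id
           -- type 2 sorts first; within a type, the newest doc_id wins
           ORDER BY CASE WHEN doc_type_id = 2 THEN 0 ELSE 1 END, doc_id DESC
         ) AS rn
  FROM docs
) AS t
WHERE rn = 1
ORDER BY booking_id
"""

preferred = conn.execute(preferred_doc_sql).fetchall()
print(preferred)
```

On a large table, an index on (booking_id, doc_type_id, doc_id) is the kind of thing worth testing, but whether it helps depends on the engine and the surrounding query.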
I have these two tables:
User:
=======================
id | Name | Email
=======================
1 | User-A| a#mail
2 | User-B| b#mail
=======================
Entry:
=================================================
id | agree | createdOn | userId
=================================================
1 | true | 2020-11-10 19:22:23 | 1
2 | false | 2020-11-10 22:22:23 | 1
3 | true | 2020-11-11 12:22:23 | 1
4 | true | 2020-11-04 22:22:23 | 2
5 | false | 2020-11-12 02:22:23 | 2
================================================
I need to get the following result:
=============================================================
Name | Email | agree | createdOn
=============================================================
User-A | a#mail | true | 2020-11-11 12:22:23
User-B | b#mail | false | 2020-11-12 02:22:23
=============================================================
The Postgres query I'm running is:
select distinct on (e."createdOn", u.id)
u.id , e.id ,u."Name" , u.email, e.agree, e."createdOn" from "user" u
inner join public.entry e on u."id" = e."userId"
order by "createdOn" desc
The problem is that it returns every entry after the join, whereas I only want the most recent entry per user according to the createdOn column.
You want the latest entry per user. For this, you need the user id in the distinct on clause, and no other column. This guarantees one row in the resultset per user.
Then, you need to put that column first in the order by clause, followed by createdOn desc. This breaks the ties and decides which row will be retained in each group:
select distinct on (u.id) u.id , e.id ,u."Name" , u.email, e.agree, e."createdOn"
from "user" u
inner join public.entry e on u."id" = e."userId"
order by u.id, "createdOn" desc
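You can also use row_number to select the latest row per user and then do the join. Note that "user" is a reserved word in Postgres and the camel-case columns were created with quotes, so they have to be quoted here as well:
SELECT *
FROM "user" a
LEFT JOIN (
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY "userId" ORDER BY "createdOn" DESC) AS rn
FROM entry
) k WHERE rn = 1
) b
ON a.id = b."userId"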
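Here is a small sketch of the row_number-then-join idea, run against an in-memory SQLite database from Python (SQLite 3.25+ assumed). The agree flag is stored as 1/0 and User-C is a hypothetical user with no entries, added only to show what the LEFT JOIN does in that case; neither is part of the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "user" (id INTEGER PRIMARY KEY, "Name" TEXT, email TEXT);
CREATE TABLE entry (id INTEGER PRIMARY KEY, agree INTEGER, "createdOn" TEXT, "userId" INTEGER);
INSERT INTO "user" VALUES
  (1, 'User-A', 'a#mail'), (2, 'User-B', 'b#mail'),
  (3, 'User-C', 'c#mail');  -- hypothetical user without entries
INSERT INTO entry VALUES
  (1, 1, '2020-11-10 19:22:23', 1),
  (2, 0, '2020-11-10 22:22:23', 1),
  (3, 1, '2020-11-11 12:22:23', 1),
  (4, 1, '2020-11-04 22:22:23', 2),
  (5, 0, '2020-11-12 02:22:23', 2);
""")

latest_entry_sql = """
SELECT u."Name", u.email, e.agree, e."createdOn"
FROM "user" AS u
LEFT JOIN (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY "userId"
                            ORDER BY "createdOn" DESC) AS rn
  FROM entry
) AS e ON e."userId" = u.id AND e.rn = 1   -- keep only the newest entry per user
ORDER BY u.id
"""

latest = conn.execute(latest_entry_sql).fetchall()
for row in latest:
    print(row)
```

Users without any entry still appear, with NULLs in the entry columns; switch to an inner join if those users should be dropped.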
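Alternatively, compute max("createdOn") per user with GROUP BY "userId", then join that result back to entry on both "userId" and "createdOn" to pick the matching row.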
I have the following code for example:
SELECT id, order_day, purchase_id FROM d
customer_id and purchase_id are unique. Each customer_id could have multiple purchase_id. Assume every one has made at least 5 orders.
Now, I just want to pull the first 5 purchase IDs for each customer ID, based on the earliest purchase dates. I want the result to look like this:
id | purchase_id | rank
-------------------------
A | WERFEW43 | 1
A | ERTGDSFV | 3
A | FDGRT45 | 2
A | BRTE4TEW | 4
A | DFGDV | 5
B | DSFSF | 1
B | CF345 | 2
B | SDFSDFSDFS | 4
I thought of Ranking order_day, but my knowledge is not good enough to pull this off.
select id, purchase_id, rank() over (partition by id order by order_day) as rnk
from d
You can also try dense_rank() and row_number() with the same partition by id ... order by order_day and choose whichever suits you best; they differ only in how they number ties on order_day.
select *
from
( SELECT
id
,order_day
,purchase_id
,row_number() -- ranking
over (partition by id -- each customer
order by order_day) as rn -- based on oldest dates
FROM d
) as dt
where rn <= 5
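The partition-then-filter pattern above can be tried out with the sketch below, which runs it in an in-memory SQLite database from Python (SQLite 3.25+ assumed). The purchase rows are hypothetical, and the limit is passed as a parameter and set to 2 instead of 5 just to keep the sample small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d (id TEXT, order_day TEXT, purchase_id TEXT);
-- Hypothetical purchases, deliberately inserted out of date order.
INSERT INTO d VALUES
  ('A', '2021-01-03', 'P3'), ('A', '2021-01-01', 'P1'), ('A', '2021-01-02', 'P2'),
  ('B', '2021-02-05', 'Q2'), ('B', '2021-02-01', 'Q1'), ('B', '2021-02-09', 'Q3');
""")

first_n_sql = """
SELECT id, purchase_id, rn
FROM (
  SELECT id, order_day, purchase_id,
         ROW_NUMBER() OVER (PARTITION BY id          -- restart numbering per customer
                            ORDER BY order_day) AS rn -- oldest purchase gets 1
  FROM d
) AS dt
WHERE rn <= ?
ORDER BY id, rn
"""

first_two = conn.execute(first_n_sql, (2,)).fetchall()
print(first_two)
```

With row_number() two purchases on the same day get distinct numbers in an arbitrary order; add a second ORDER BY column (purchase_id, say) if the result has to be repeatable.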
I have the following table:
| ID | ExecOrd | date |
| 1 | 1.0 | 3/4/2014|
| 1 | 2.0 | 7/7/2014|
| 1 | 3.0 | 8/8/2014|
| 2 | 1.0 | 8/4/2013|
| 2 | 2.0 |12/2/2013|
| 2 | 3.0 | 1/3/2014|
| 2 | 4.0 | |
I need to get the date of the top ExecOrd per ID of about 8000 records, and so far I can only do it for one ID:
SELECT TOP 1 date
FROM TABLE
WHERE DATE IS NOT NULL and ID = '1'
ORDER BY ExecOrd DESC
A little help would be appreciated. I have been trying to find a similar question to mine with no success.
There are several ways of doing this. A generic approach is to join the table back to itself using max():
select t.date
from yourtable t
join (select max(execord) execord, id
from yourtable
group by id
) t2 on t.id = t2.id and t.execord = t2.execord
If you're using SQL Server 2005+, I prefer to use row_number():
select date
from (
select row_number() over (partition by id order by execord desc) rn, date
from yourtable
) t
where rn = 1;
Note: they will give different results if ties exist.
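To make the note about ties concrete, the sketch below runs both approaches in an in-memory SQLite database from Python (SQLite 3.25+ assumed). The table name t, the column name exec_date and the rows are hypothetical; ID 1 is given two rows sharing the top ExecOrd on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, execord REAL, exec_date TEXT);
-- Hypothetical rows: id 1 has two rows tied on the highest execord.
INSERT INTO t VALUES
  (1, 1.0, '2014-03-04'), (1, 3.0, '2014-08-08'), (1, 3.0, '2014-09-09'),
  (2, 1.0, '2013-08-04'), (2, 2.0, '2013-12-02');
""")

# Join back on max(execord): every row tied for the maximum is returned.
join_rows = conn.execute("""
SELECT t.id, t.exec_date
FROM t
JOIN (SELECT id, MAX(execord) AS execord FROM t GROUP BY id) AS m
  ON t.id = m.id AND t.execord = m.execord
ORDER BY t.id, t.exec_date
""").fetchall()

# row_number(): exactly one row per id; the extra ORDER BY column
# (exec_date DESC) decides which of the tied rows is kept.
rn_rows = conn.execute("""
SELECT id, exec_date
FROM (
  SELECT id, exec_date,
         ROW_NUMBER() OVER (PARTITION BY id
                            ORDER BY execord DESC, exec_date DESC) AS rn
  FROM t
) AS x
WHERE rn = 1
ORDER BY id
""").fetchall()

print(join_rows)
print(rn_rows)
```

The join returns two rows for ID 1, while row_number() returns one. Without the tiebreaker column the row it keeps for ID 1 would be arbitrary.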
;with cte as (
SELECT id, date, row_number() over(partition by ID order by ExecOrd DESC) r
FROM TABLE WHERE DATE IS NOT NULL )
select id, date from
cte where r=1
I have table with data something like this:
ID | RowNumber | Data
------------------------------
1 | 1 | Data
2 | 2 | Data
3 | 3 | Data
4 | 1 | Data
5 | 2 | Data
6 | 1 | Data
7 | 2 | Data
8 | 3 | Data
9 | 4 | Data
I want to group each run of RowNumbers so that my result looks something like this:
ID | RowNumber | Group | Data
--------------------------------------
1 | 1 | a | Data
2 | 2 | a | Data
3 | 3 | a | Data
4 | 1 | b | Data
5 | 2 | b | Data
6 | 1 | c | Data
7 | 2 | c | Data
8 | 3 | c | Data
9 | 4 | c | Data
The only way I know where each group starts and stops is when the RowNumber starts over. How can I accomplish this? It also needs to be fairly efficient since the table I need to do this on has 52 Million Rows.
Additional Info
ID is truly sequential, but RowNumber may not be. I think RowNumber will always begin with 1 but for example the RowNumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6", etc.
For the clarified requirements in the comments
The rownumbers for group1 could be "1,1,2,2,3,4" and for group2 they
could be "1,2,4,6" ... a higher number followed by a lower would be a
new group.
A SQL Server 2012 solution could be as follows.
Use LAG to access the previous row and set a flag to 1 if that row is the start of a new group or 0 otherwise.
Calculate a running sum of these flags to use as the grouping value.
Code
WITH T1 AS
(
SELECT *,
LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
FROM YourTable
), T2 AS
(
SELECT *,
IIF(PrevRowNumber IS NULL OR PrevRowNumber > RowNumber, 1, 0) AS NewGroup
FROM T1
)
SELECT ID,
RowNumber,
Data,
SUM(NewGroup) OVER (ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
Assuming ID is the clustered index, the plan for this has one scan against YourTable and avoids any sort operations.
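The flag-and-running-sum technique can be checked on the sequences from the clarified requirements ("1,1,2,2,3,4" followed by "1,2,4,6") with the sketch below. It runs in an in-memory SQLite database from Python (SQLite 3.25+ assumed) and uses CASE in place of IIF so it does not depend on a newer SQLite; the Data column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER PRIMARY KEY, RowNumber INTEGER)")
# Two groups taken from the clarified requirements: 1,1,2,2,3,4 and 1,2,4,6.
row_numbers = [1, 1, 2, 2, 3, 4, 1, 2, 4, 6]
conn.executemany(
    "INSERT INTO YourTable (ID, RowNumber) VALUES (?, ?)",
    list(enumerate(row_numbers, start=1)),
)

grouping_sql = """
WITH T1 AS (
  SELECT ID, RowNumber,
         LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
  FROM YourTable
), T2 AS (
  SELECT ID, RowNumber,
         -- 1 marks the first row of a group: no previous row, or the value dropped
         CASE WHEN PrevRowNumber IS NULL OR PrevRowNumber > RowNumber
              THEN 1 ELSE 0 END AS NewGroup
  FROM T1
)
SELECT ID, RowNumber,
       SUM(NewGroup) OVER (ORDER BY ID
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
ORDER BY ID
"""

grouped = conn.execute(grouping_sql).fetchall()
for row in grouped:
    print(row)
```

Repeated values (1,1 or 2,2) stay in the same group because only a strictly lower value starts a new one.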
If the ids are truly sequential and RowNumber increases by exactly 1 within each group, you can do:
select t.*,
(id - rowNumber) as grp
from t
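A quick way to see when the id - rowNumber trick holds is to run it on two inputs, as in the sketch below (in-memory SQLite from Python; the helper name group_keys is hypothetical). The first input is the sample from the question, the second is the gappy sequence "1,2,4,6" mentioned in the additional info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, rowNumber INTEGER)")

def group_keys(row_numbers):
    """Load the sequence with ids 1..n and return id - rowNumber per row."""
    conn.execute("DELETE FROM t")
    conn.executemany(
        "INSERT INTO t (id, rowNumber) VALUES (?, ?)",
        list(enumerate(row_numbers, start=1)),
    )
    cursor = conn.execute("SELECT id - rowNumber AS grp FROM t ORDER BY id")
    return [grp for (grp,) in cursor]

# Sample from the question: rowNumber counts 1, 2, 3, ... inside each group.
contiguous = group_keys([1, 2, 3, 1, 2, 1, 2, 3, 4])
# One group whose rowNumber values have gaps.
gappy = group_keys([1, 2, 4, 6])

print(contiguous)
print(gappy)
```

For the contiguous sample the difference is constant inside each group, so it works as a grouping key. For the gappy sequence a single group is split into several keys, so the expression is only safe when rowNumber has no gaps or repeats.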
You can also use a recursive CTE:
;WITH cte AS
(
SELECT ID, RowNumber, Data, 1 AS [Group]
FROM dbo.test1
WHERE ID = 1
UNION ALL
SELECT t.ID, t.RowNumber, t.Data,
CASE WHEN t.RowNumber != 1 THEN c.[Group] ELSE c.[Group] + 1 END
FROM dbo.test1 t JOIN cte c ON t.ID = c.ID + 1
)
SELECT *
FROM cte
OPTION (MAXRECURSION 0) -- the default limit of 100 recursion levels is too low for a large table
How about:
select ID, RowNumber, Data, dense_rank() over (order by grp) as Grp
from (
select *, (select max(ID) from [Your Table] where ID <= t.ID and RowNumber = 1) as grp
from [Your Table] t
) t
order by ID
This should work on SQL Server 2005. You could also use rank() instead if you don't care about consecutive numbers.