How to select max rownumber for each partition in SQL Server

Can anybody tell me how to select the max row number for each partition in SQL Server using a CTE?
Suppose one employee has 4 transaction rows and another has only one row; how do I select the max row for each of those employees?
I have a job table, and I want the max row number per employee so that I can fetch that employee's latest transaction.
I tried the following:
With CTE as (
Select
My fields,
Rownum = row_number() over(partition by emplid order by date) from jobtable
Where
Myconditions
)
Select * from CTE B left outer join
CTE A on A.emplid = B.emplid
Where
A.rownum = (select max(a2.rownum) from jobtable a2)
Is the left join required above, or is it not needed at all?
Please tell me how to fetch the row number when only 1 row exists for an employee, as the above query fetches only the employees that have the greatest rownum in the whole table.

With CTE as (
Select
My fields,
Rownum = row_number() over(partition by emplid order by date DESC)
from jobtable
Where
Myconditions
)
SELECT *
FROM
cte
WHERE
RowNum = 1
Just reverse the order of your ROW_NUMBER and select where it equals 1. Row numbers can be assigned in ascending (ASC) or descending (DESC) order. So if you want the most recent date first, to get the latest record, use ORDER BY date DESC; if you want the earliest record first, use ORDER BY date ASC (or just ORDER BY date, since ASC is the default).
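To see the descending row number trick on concrete data, here is a minimal sketch. It runs the same query shape through Python's built-in sqlite3 module instead of SQL Server, purely so that it is self-contained (this assumes a bundled SQLite of 3.25 or later, which is when window functions arrived). The note column and the sample rows are made up for illustration; only emplid, date and jobtable come from the question.

```python
import sqlite3


def latest_per_employee(rows):
    """Return the most recent (emplid, date, note) row for each employee."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only emplid and date appear in the question.
    conn.execute("CREATE TABLE jobtable (emplid INTEGER, date TEXT, note TEXT)")
    conn.executemany("INSERT INTO jobtable VALUES (?, ?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT emplid, date, note,
                   ROW_NUMBER() OVER (PARTITION BY emplid
                                      ORDER BY date DESC) AS rownum
            FROM jobtable
        )
        SELECT emplid, date, note
        FROM cte
        WHERE rownum = 1      -- newest row in each partition
        ORDER BY emplid
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Employee 1 has four rows, employee 2 has only one.
sample_rows = [
    (1, "2020-01-01", "hire"),
    (1, "2020-03-01", "raise"),
    (1, "2020-02-01", "transfer"),
    (1, "2020-04-01", "promotion"),
    (2, "2020-02-15", "hire"),
]
print(latest_per_employee(sample_rows))
# [(1, '2020-04-01', 'promotion'), (2, '2020-02-15', 'hire')]
```

The employee with a single row is returned as well, because its only row gets row number 1 within its own partition. No self-join is needed.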

Related

Select every second record then determine earliest date

I have a table that looks like the following
I have to select every second record per PatientID that would give the following result (my last query returns this result)
I then have to select the record with the oldest date which would be the following (this is the end result I want)
What I have done so far: I have a CTE that gets all the data I need
WITH cte
AS
(
SELECT visit.PatientTreatmentVisitID, mat.PatientMatchID,pat.PatientID,visit.RegimenDate AS VisitDate,
ROW_NUMBER() OVER(PARTITION BY mat.PatientMatchID, pat.PatientID ORDER BY visit.VisitDate ASC) AS RowNumber
FROM tblPatient pat INNER JOIN tblPatientMatch mat ON mat.PatientID = pat.PatientID
LEFT JOIN tblPatientTreatmentVisit visit ON visit.PatientID = pat.PatientID
)
I then write a query against the CTE but so far I can only return the second row for each patientID
SELECT *
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate, RowNumber FROM cte
) as X
WHERE RowNumber = 2
How do I return the record with the oldest date only? Is there perhaps a MIN() function that I could be including somewhere?
If I follow you correctly, you can just order your existing resultset and retain the top row only.
In standard SQL, you would write this using a FETCH clause:
SELECT *
FROM (
SELECT
visit.PatientTreatmentVisitID,
mat.PatientMatchID,
pat.PatientID,
visit.RegimenDate AS VisitDate,
ROW_NUMBER() OVER(PARTITION BY mat.PatientMatchID, pat.PatientID ORDER BY visit.VisitDate ASC) AS rn
FROM tblPatient pat
INNER JOIN tblPatientMatch mat ON mat.PatientID = pat.PatientID
LEFT JOIN tblPatientTreatmentVisit visit ON visit.PatientID = pat.PatientID
) t
WHERE rn = 2
ORDER BY VisitDate
OFFSET 0 ROWS FETCH FIRST 1 ROW ONLY
This syntax is supported in Postgres, Oracle, SQL Server (and possibly other databases).
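As a rough, runnable illustration of "take every second row, then keep the oldest", the sketch below uses Python's sqlite3 module as a stand-in for SQL Server (assuming SQLite 3.25 or later for window functions). SQLite spells the row limit as LIMIT 1 rather than OFFSET/FETCH or TOP. The single visit table and its sample rows are hypothetical simplifications of the three joined tables in the question.

```python
import sqlite3


def earliest_second_visit(visits):
    """Return the (PatientID, VisitDate) of the oldest 'second visit'."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical flattened table standing in for the joined patient tables.
    conn.execute("CREATE TABLE visit (PatientID INTEGER, VisitDate TEXT)")
    conn.executemany("INSERT INTO visit VALUES (?, ?)", visits)
    query = """
        SELECT PatientID, VisitDate
        FROM (
            SELECT PatientID, VisitDate,
                   ROW_NUMBER() OVER (PARTITION BY PatientID
                                      ORDER BY VisitDate ASC) AS rn
            FROM visit
        ) AS t
        WHERE rn = 2              -- second visit of each patient
        ORDER BY VisitDate ASC    -- oldest of those first
        LIMIT 1
    """
    row = conn.execute(query).fetchone()
    conn.close()
    return row


sample_visits = [
    (1, "2019-01-10"), (1, "2019-03-05"),
    (2, "2019-01-02"), (2, "2019-02-20"), (2, "2019-06-01"),
    (3, "2019-04-01"),  # only one visit, so no second row
]
print(earliest_second_visit(sample_visits))
# (2, '2019-02-20')
```

Patients with fewer than two visits drop out at the rn = 2 filter, so the function returns None when no patient has a second visit.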
If you need the oldest date among all the selected rows (every second row for each PatientID), you can try the window function MIN:
SELECT * FROM
(
SELECT *, MIN(VisitDate) OVER (Order By VisitDate) MinDate
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate,
RowNumber FROM cte
) as X
WHERE RowNumber = 2
) Y
WHERE VisitDate=MinDate
Or you can use SELECT TOP statement. The SELECT TOP clause allows you to limit the number of rows returned in a query result set:
SELECT TOP 1 PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate FROM
(
SELECT *
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate,
RowNumber FROM cte
) as X
WHERE RowNumber = 2
) Y
ORDER BY VisitDate
For simplicity, order by the date column in ascending order and use TOP to get the first row only, which is the one with the oldest date:
SELECT TOP 1 *
FROM
(
SELECT PatientTreatmentVisitID,PatientMatchID,PatientID, VisitDate, RowNumber FROM cte
) as X
WHERE RowNumber = 2
order by VisitDate asc

Remove duplicate records based on timestamp

I'm writing a query to find duplicate records. I have a table with the following columns:
Id, Deliveries, TankId, Timestamp.
I have inserted duplicate records, that is, the same deliveries for the same TankId with the timestamp offset by +1 day.
Now I want to remove the duplicate record with the earlier timestamp.
E.g. I have duplicate deliveries for the same TankId on 24th and 25th July. I need to remove the record from the 24th.
I tried the following query:
SELECT raw.TimeStamp,raw.[Delivery],raw.[TankId]
FROM [dbo].[tObservationData] raw
INNER JOIN (
SELECT [Delivery],[TankSystemId]
FROM [dbo].[ObservationData]
GROUP BY [Delivery],[TankSystemId]
HAVING COUNT([ObservationDataId]) > 1
) dup
ON raw.[Delivery] = dup.[Delivery] AND raw.[TankId] = dup.[TankId]
AND raw.TimeStamp >'2019-06-30 00:00:00.0000000' AND raw.[DeliveryL]>0
ORDER BY [TankSystemId],TimeStamp
But the above returns other records too. How can I find and delete those duplicate records?
In this case you can use ROW_NUMBER() with PARTITION BY and ORDER BY. Partition by TankID and Delivery, and order by Timestamp in descending order:
Select * from (
Select *,ROW_NUMBER() OVER (PARTITION BY TankID,Delivery ORDER BY [Timestamp] DESC) AS rn
from [dbo].[ObservationData]
) t
where rn = 1
In the above code, records with rn = 1 have the latest timestamp, so you can select only those and ignore the others. You can use the same row numbers to delete records from your table: the rows with rn > 1 are the older duplicates.
WITH TempObservationdata (TankID,Delivery,rn)
AS
(
SELECT TankID,Delivery,ROW_NUMBER() OVER(PARTITION by TankID, Delivery ORDER BY [Timestamp] desc)
AS rn
FROM dbo.ObservationData
)
--Now delete the duplicate rows, keeping the latest one per TankID and Delivery
DELETE FROM TempObservationdata
WHERE rn > 1
I think this will work:
SELECT raw.TimeStamp,raw.[Delivery],raw.[TankId]
FROM [dbo].[tObservationData] raw
INNER JOIN (
SELECT [Delivery],[TankSystemId],min([TimeStamp]) as min_ts
FROM [dbo].[ObservationData]
GROUP BY [Delivery],[TankSystemId]
HAVING COUNT([ObservationDataId]) > 1
) dup
ON raw.[Delivery] = dup.[Delivery] AND raw.[TankId] = dup.[TankId] and raw.[TimeStamp] = dup.min_ts
AND raw.TimeStamp >'2019-06-30 00:00:00.0000000' AND raw.[DeliveryL]>0
ORDER BY [TankSystemId],TimeStamp
Are you just looking for this?
SELECT od.*
FROM (SELECT od.*,
ROW_NUMBER() OVER (PARTITION BY od.TankId, od.Delivery ORDER BY od.TimeStamp DESC) as seqnum
FROM [dbo].[tObservationData] od
) od
WHERE seqnum = 1;

Delete Duplicate Rows in SQL

I have a table with unique id but duplicate row information.
I can find the rows with duplicates using this query
SELECT
PersonAliasId, StartDateTime, GroupId, COUNT(*) as Count
FROM
Attendance
GROUP BY
PersonAliasId, StartDateTime, GroupId
HAVING
COUNT(*) > 1
I can manually delete the rows while keeping the 1 I need with this query
Delete
From Attendance
Where Id IN(SELECT
Id
FROM
Attendance
Where PersonAliasId = 15
and StartDateTime = '9/24/2017'
and GroupId = 1429
Order By ModifiedDateTIme Desc
Offset 1 Rows)
I am not versed enough in SQL to figure out how to use the rows from the first query to delete the duplicates, leaving behind the most recent. The first query returns over 3481 records, far too many to handle one by one manually.
How can I find the duplicate rows like the first query and delete all but the most recent like the second?
You can use a Common Table Expression to delete the duplicates:
WITH Cte AS(
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY PersonAliasId, StartDateTime, GroupId
ORDER BY ModifiedDateTIme DESC)
FROM Attendance
)
DELETE FROM Cte WHERE Rn > 1;
This will keep the most recent record for each PersonAliasId - StartDateTime - GroupId combination.
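A small sketch of the same idea on sample data follows. One difference should be stated plainly: SQLite (used here through Python's sqlite3 module so the example is runnable, assuming SQLite 3.25 or later) cannot delete through a CTE the way SQL Server can, so this version swaps in a DELETE ... WHERE Id IN (...) over the row-numbered subquery. The ranking logic is the same; the sample rows are invented.

```python
import sqlite3


def delete_older_duplicates(rows):
    """Delete all but the most recently modified row per duplicate group.

    Returns the Ids that remain, in ascending order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Attendance ("
        "Id INTEGER PRIMARY KEY, PersonAliasId INTEGER, "
        "StartDateTime TEXT, GroupId INTEGER, ModifiedDateTime TEXT)"
    )
    conn.executemany("INSERT INTO Attendance VALUES (?, ?, ?, ?, ?)", rows)
    # SQLite-specific form: collect the Ids with rn > 1, then delete them.
    conn.execute("""
        DELETE FROM Attendance
        WHERE Id IN (
            SELECT Id
            FROM (
                SELECT Id,
                       ROW_NUMBER() OVER (
                           PARTITION BY PersonAliasId, StartDateTime, GroupId
                           ORDER BY ModifiedDateTime DESC) AS rn
                FROM Attendance
            ) AS ranked
            WHERE rn > 1
        )
    """)
    remaining = [r[0] for r in conn.execute("SELECT Id FROM Attendance ORDER BY Id")]
    conn.close()
    return remaining


sample_rows = [
    (1, 15, "2017-09-24", 1429, "2017-09-24 10:00"),
    (2, 15, "2017-09-24", 1429, "2017-09-25 08:00"),  # newest of the triple
    (3, 15, "2017-09-24", 1429, "2017-09-24 12:00"),
    (4, 16, "2017-09-24", 1429, "2017-09-24 10:00"),  # not a duplicate
    (5, 15, "2017-10-01", 1429, "2017-10-01 09:00"),  # not a duplicate
]
print(delete_older_duplicates(sample_rows))
# [2, 4, 5]
```

Rows that have no duplicates get row number 1 in their own partition and are left alone, so no HAVING COUNT(*) > 1 filter is needed.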
Use the MAX aggregate function to identify the latest ModifiedDateTime for each PersonAliasId, StartDateTime, GroupId combination. Then delete the records that are older than that latest time. Rows that tie on the latest ModifiedDateTime are all kept.
DELETE a
FROM attendance as a
INNER JOIN (
SELECT
PersonAliasId, StartDateTime, GroupId, MAX(ModifiedDateTime) AS LatestTime
FROM
Attendance
GROUP BY
PersonAliasId, StartDateTime, GroupId
HAVING
COUNT(*) > 1
) as b
on a.personaliasid=b.personaliasid and a.startdatetime=b.startdatetime and a.groupid=b.groupid and a.modifieddatetime < b.latesttime
Same as the CTE answer - give Felix the check
delete tt
from ( SELECT rn = ROW_NUMBER() OVER(PARTITION BY PersonAliasId, StartDateTime, GroupId
ORDER BY ModifiedDateTIme DESC)
FROM Attendance
) tt
where tt.rn > 1

SQL - pull unique name with the latest date and lowest value

How do I get each unique name with the latest date and lowest value?
Name date value
brad 1/2/10 1.1
brad 1/2/10 2.3
bob 1/6/10 1.0
brad 2/4/09 13.2
This query does not seem to work:
SELECT distinct
A.[ViralLoadMemberID]
,B.LastName
,B.FirstName
,A.[Date]
,A.[vaule]
FROM [t].[dbo].[tblViralLoad] A
left join [dbo].[tblEnrollees] B on A.ViralLoadMemberID = B.MemberID
where
A.Date =
(
select MAX(Date)
from dbo.tblViralLoad
where ViralLoadMemberID = A.ViralLoadMemberID
and
( Date >= '07/01/2014'
and Date <= '12/3/2014' ) )
The idea is to use order by and fetch only one row. If you want the lowest value on the latest date, the standard SQL would be:
select t.*
from table t
order by date desc, value asc
fetch first 1 row only;
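In SQL Server, FETCH is only available together with an OFFSET clause (SQL Server 2012 and later), so you would either write offset 0 rows fetch first 1 row only, or omit the last line and do select top 1 * . . .. For MySQL, the last line would be limit 1.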
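The query above returns a single row for the whole table. If what is wanted is one row per name, a ROW_NUMBER() variant of the same ordering is one option. The sketch below uses Python's sqlite3 module as a runnable stand-in for SQL Server (assuming SQLite 3.25 or later). The table name readings, the column names dte and val, and the reading of the sample dates as month/day/year are all assumptions.

```python
import sqlite3


def latest_lowest_per_name(rows):
    """For each name, return the row on its latest date with the lowest value."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; dates are stored as ISO strings so they sort correctly.
    conn.execute("CREATE TABLE readings (name TEXT, dte TEXT, val REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    query = """
        SELECT name, dte, val
        FROM (
            SELECT name, dte, val,
                   ROW_NUMBER() OVER (PARTITION BY name
                                      ORDER BY dte DESC, val ASC) AS rn
            FROM readings
        ) AS t
        WHERE rn = 1
        ORDER BY name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# The question's sample rows, with dates assumed to be month/day/year.
sample_rows = [
    ("brad", "2010-01-02", 1.1),
    ("brad", "2010-01-02", 2.3),
    ("bob", "2010-01-06", 1.0),
    ("brad", "2009-02-04", 13.2),
]
print(latest_lowest_per_name(sample_rows))
# [('bob', '2010-01-06', 1.0), ('brad', '2010-01-02', 1.1)]
```

The second sort key (val ASC) only matters when a name has several rows on its latest date, as brad does here.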
Fun with rank()
declare @t table (name varchar(50),dte date,val decimal(18,10));
insert into @t(name,dte,val) values
('Dave','1/1/2015',1.0),
('Dave','1/3/2015',1.2),
('Dave','1/4/2015',1.5),
('Dave','1/10/2015',1.3),
('Dave','1/15/2015',1.2),
('Steve','1/11/2015',1.6),
('Steve','1/12/2015',1.1),
('Steve','1/15/2015',1.2),
('Bill','1/21/2015',1.9),
('Ted','1/1/2015',1.8),
('Ted','1/10/2015',1.0),
('Ted','1/12/2015',1.7);
-- This will show the lowest price quoted by each person
select name,dte,val from (select name,dte,val, rank() over (partition by name order by val) as r from @t) as data where r = 1
-- This will show each user's lowest price and the last day they submitted a price, regardless of whether it was the lowest
select name,max(dte) as [last Date] ,min(val) as [Lowest Value] from @t group by name
-- Who quoted the lowest price most recently, regardless of whether they have raised their price later
select top(1) name,dte [last lowest quote],val from (select name,dte,val, rank() over (order by val) as r from @t) as data where r = 1 order by dte desc
-- What is the lowest price currently quoted, regardless of who quoted it
select top(1) name,dte [best active quote],val from (select name,dte,val, rank() over (partition by name order by dte desc) as r from @t) as data where r = 1 order by val

Optimizing query with two MAX columns in the same table

I need to optimize below query
SELECT
Id, -- identity
CargoID,
[Status] AS CurrentStatus
FROM
dbo.CargoStatus
WHERE
id IN (SELECT TOP 1 ID
FROM dbo.CargoStatus CS
INNER JOIN STD.StatusMaster S ON CS.ShipStatusID = S.SatusID
WHERE CS.CargoID=CargoStatus.CargoID
ORDER BY YEAR([CS.DATE]) DESC, MONTH([CS.DATE]) DESC,
DAY([CS.DATE]) DESC, S.StatusStageNumber DESC)
There are two tables
CargoStatus, and
StatusMaster
Statusmaster has columns StatusID, StatusName, StatusStageNumber(int)
CargoStatus has columns ID, StatusID (FK StatusMaster StatusID column), Date
Is there a better way of writing this query?
I want the latest status for each cargo (only one entry per CargoID).
Since you seem to be using SQL Server 2005 or newer, you can use a CTE with the ROW_NUMBER() windowing function:
;WITH LatestCargo AS
(
SELECT
cs.Id, -- identity
cs.CargoID,
cs.[Status] AS CurrentStatus,
ROW_NUMBER() OVER(PARTITION BY cs.CargoID
ORDER BY cs.[Date] DESC, s.StatusStageNumber DESC) AS RowNum
FROM
dbo.CargoStatus cs
INNER JOIN
STD.StatusMaster s ON cs.ShipStatusID = s.[StatusID]
)
SELECT
Id, CargoID, CurrentStatus
FROM
LatestCargo
WHERE
RowNum = 1
This CTE "partitions" your data by CargoID, and for each partition, the ROW_NUMBER function hands out sequential numbers, starting at 1 and ordered by Date DESC - so the latest row gets RowNum = 1 (for each CargoID) which is what I select from the CTE in the SELECT statement after it.
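To make the tie-break on StatusStageNumber concrete, here is a minimal sketch of the same CTE run through Python's sqlite3 module (a stand-in for SQL Server, assuming SQLite 3.25 or later). The column name StatusDate, the status names and the sample rows are hypothetical; the table layout follows the column lists given in the question.

```python
import sqlite3


def latest_status_per_cargo(statuses, cargo_rows):
    """Return (ID, CargoID, StatusName) of the latest status row per cargo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE StatusMaster ("
        "StatusID INTEGER PRIMARY KEY, StatusName TEXT, StatusStageNumber INTEGER)"
    )
    # StatusDate is a hypothetical name for the question's Date column.
    conn.execute(
        "CREATE TABLE CargoStatus ("
        "ID INTEGER PRIMARY KEY, CargoID INTEGER, StatusID INTEGER, StatusDate TEXT)"
    )
    conn.executemany("INSERT INTO StatusMaster VALUES (?, ?, ?)", statuses)
    conn.executemany("INSERT INTO CargoStatus VALUES (?, ?, ?, ?)", cargo_rows)
    query = """
        WITH LatestCargo AS (
            SELECT cs.ID, cs.CargoID, s.StatusName,
                   ROW_NUMBER() OVER (
                       PARTITION BY cs.CargoID
                       ORDER BY cs.StatusDate DESC,
                                s.StatusStageNumber DESC) AS RowNum
            FROM CargoStatus cs
            INNER JOIN StatusMaster s ON cs.StatusID = s.StatusID
        )
        SELECT ID, CargoID, StatusName
        FROM LatestCargo
        WHERE RowNum = 1
        ORDER BY CargoID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


sample_statuses = [(1, "Booked", 1), (2, "Loaded", 2), (3, "Delivered", 3)]
sample_cargo = [
    (1, 100, 1, "2021-05-01"),
    (2, 100, 2, "2021-05-02"),
    (3, 100, 3, "2021-05-02"),  # same day as ID 2, higher stage wins
    (4, 200, 1, "2021-05-03"),
]
print(latest_status_per_cargo(sample_statuses, sample_cargo))
# [(3, 100, 'Delivered'), (4, 200, 'Booked')]
```

Cargo 100 has two rows on its latest date, and the descending StatusStageNumber picks the later stage among them.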