How do I remove certain duplicates in a complex SQL query - sql

I am writing a query that needs to remove all duplicates of a.GenUserID while keeping, for each user, only the most recent login date (b.LogDateTime). That date must be older than 6 months; any later dates have to be removed.
I hope this makes sense.
SELECT DISTINCT
a.GenUserID,
c.DeletionDate,
b.LogDateTime,
(CASE c.Disabled WHEN 0 THEN 'NO' else 'YES - ARCHIVED' end)
FROM RioReport.dbo.GenUser a
LEFT JOIN dbo.GenUserArchive c on a.GenUserID = c.GenUserID
LEFT JOIN dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo
WHERE(a.Disabled=0 or c.Disabled=0)
AND c.DeletionDate IS NOT NULL
AND ((DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime IS NULL))
ORDER BY a.GenUserID, b.LogDateTime desc

You could add the row_number() information to your query, and wrap that query into an outer query that just takes the records with number 1 from that result:
select *
from (
select a.GenUserID,
c.DeletionDate,
b.LogDateTime,
case c.Disabled when 0 then 'NO' else 'YES - ARCHIVED' end as disabled,
row_number() over (partition by a.GenUserID
order by b.LogDateTime desc) as rn
from RioReport.dbo.GenUser a
inner join dbo.GenUserArchive c
on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b
on a.GenUserID = b.ExtraInfo
where (a.Disabled=0 or c.Disabled=0)
and c.DeletionDate is not null
and (DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime is null)
) as t
where rn = 1
order by GenUserID
Note that you can turn the first left join into an inner join without any change to the result set, since you have a non-null check on one of its fields. inner join is then preferred, and might give a performance improvement.
If GenUserAccessHistory.LogDateTime is always non-null, then you can avoid the test or b.LogDateTime is null by moving the DateAdd(MM, -6, GetDate()) > b.LogDateTime condition to the appropriate join on clause.
The generated row number will be given in order of descending LogDateTime values, and restart from 1 for every different user.
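To see the numbering in action, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. SQLite (3.25 or later) is only a stand-in for SQL Server, and the logins table, its sample rows, and the fixed cutoff value are assumptions made up for illustration:

```python
import sqlite3

# SQLite stands in for SQL Server; table name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logins (GenUserID INTEGER, LogDateTime TEXT);
    INSERT INTO logins VALUES
        (1, '2015-01-10'), (1, '2015-03-05'), (1, '2016-06-01'),
        (2, '2014-12-24'), (2, '2015-02-02');
""")

cutoff = "2016-01-01"  # fixed stand-in for DateAdd(MM, -6, GetDate())

rows = conn.execute("""
    SELECT GenUserID, LogDateTime
    FROM (
        SELECT GenUserID, LogDateTime,
               row_number() OVER (PARTITION BY GenUserID
                                  ORDER BY LogDateTime DESC) AS rn
        FROM logins
        WHERE LogDateTime < ?      -- drop logins newer than the cutoff first
    ) AS t
    WHERE rn = 1                   -- then keep the newest remaining login per user
    ORDER BY GenUserID
""", (cutoff,)).fetchall()

print(rows)
```

User 1's login from 2016-06-01 is filtered out before the numbering happens, so the row numbered 1 for that user is the newest login that is still older than the cutoff.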
Alternative without window functions
row_number() and the other ranking window functions have been supported since SQL Server 2005. In the comments you write that you cannot use them. If that is the case, here is an alternative that uses only a common table expression (also supported since SQL Server 2005):
;with cte as (
select a.GenUserID,
c.DeletionDate,
b.LogDateTime,
case c.Disabled when 0 then 'NO' else 'YES - ARCHIVED' end as disabled
from RioReport.dbo.GenUser a
inner join dbo.GenUserArchive c
on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b
on a.GenUserID = b.ExtraInfo
where (a.Disabled=0 or c.Disabled=0)
and c.DeletionDate is not null
and (DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime is null)
)
select *
from cte main
where LogDateTime is null
or not exists (select 1
from cte sub
where sub.GenUserID = main.GenUserID
and sub.LogDateTime > main.LogDateTime)
order by GenUserID
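The NOT EXISTS filter can be tried out in isolation with a small sketch like the one below. It uses Python's sqlite3 module as a stand-in for SQL Server, and the old_logins table (playing the role of the cte, already restricted to old logins) and its rows are hypothetical:

```python
import sqlite3

# old_logins plays the role of the cte above; names and data are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_logins (GenUserID INTEGER, LogDateTime TEXT);
    INSERT INTO old_logins VALUES
        (1, '2015-01-10'), (1, '2015-03-05'),
        (2, '2015-02-02'),
        (3, NULL);
""")

rows = conn.execute("""
    SELECT GenUserID, LogDateTime
    FROM old_logins AS main
    WHERE LogDateTime IS NULL          -- users without any login are kept as-is
       OR NOT EXISTS (SELECT 1         -- keep a row only if no newer row exists
                      FROM old_logins AS sub
                      WHERE sub.GenUserID = main.GenUserID
                        AND sub.LogDateTime > main.LogDateTime)
    ORDER BY GenUserID
""").fetchall()

print(rows)
```

A row survives only when no other row of the same user has a later LogDateTime, which is the same "latest per user" result without any window function. If two rows of one user share the same latest timestamp, both are returned.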

Try with the below query.
;WITH CTE_Group
AS(
SELECT
ROW_NUMBER() OVER (PARTITION BY a.GenUserID ORDER BY b.LogDateTime DESC) as RNO,
a.GenUserID,
c.DeletionDate,
b.LogDateTime,
(CASE c.Disabled WHEN 0 THEN 'NO' else 'YES - ARCHIVED' end) IsArchived
FROM RioReport.dbo.GenUser a
LEFT JOIN dbo.GenUserArchive c on a.GenUserID = c.GenUserID
LEFT JOIN dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo
WHERE(a.Disabled=0 or c.Disabled=0)
AND c.DeletionDate IS NOT NULL
AND ((DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime IS NULL)))
SELECT GenUserID,
DeletionDate,
LogDateTime,
IsArchived
FROM CTE_Group
WHERE RNO=1

Use a CTE and a window function:
;with ctr as (
select a.GenUserID, c.DeletionDate, b.LogDateTime,
row_number() over (partition by a.GenUserID order by b.LogDateTime desc) rnk
from RioReport.dbo.GenUser a
left join dbo.GenUserArchive c on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo )
select a.GenUserID, a.DeletionDate, a.LogDateTime,
CASE WHEN DATEDIFF(mm, a.LogDateTime, getdate()) < 6 THEN 'NO' else 'YES - ARCHIVED' end as IsArchived
from ctr a where a.rnk = 1

Related

calculate time difference of consecutive row dates in SQL

Hello, I am trying to calculate the time difference between 2 consecutive rows for Date (either in hours or days), as shown in the attached image.
Highlighted in yellow is the result I want, which is basically the difference between the date in that row and the one above it.
How can we achieve this in SQL? Below is my complex code, which has the rest of the fields in it:
with cte
as
(
select m.voucher_no, CONVERT(VARCHAR(30),CONVERT(datetime, f.action_Date, 109),100) as action_date,f.col1_Value,f.col3_value,f.col4_value,f.comments,f.distr_user,f.wf_status,f.action_code,f.wf_user_id
from attdetailmap m
LEFT JOIN awftaskfin f ON f.oid = m.oid and f.client ='PC'
where f.action_Date !='' and action_date between '$?datef' and '$?datet'
),
-- select *, ROW_NUMBER() OVER(PARTITION BY action_Date,distr_user,wf_Status,wf_user_id order by action_Date,distr_user,wf_Status,wf_user_id ) as row_no_1 from cte
cte2 as
(
select *, ROW_NUMBER() OVER(PARTITION BY voucher_no,action_Date,distr_user,wf_Status,wf_user_id order by voucher_no ) as row_no_1 from cte
)
select distinct(v.dim_value) as resid,c.voucher_no,CONVERT(datetime, c.action_Date, 109) as action_Date,c.col4_value,c.comments,c.distr_user,v.description,c.wf_status,c.action_code, c.wf_user_id,v1.description as name,r.rel_value as pay_office,r1.rel_value as site
from cte2 c
LEFT OUTER JOIN aagviuserdetail v ON v.user_id = c.distr_user
LEFT OUTER JOIN aagviuserdetail v1 ON v1.user_id = c.wf_user_id
LEFT OUTER JOIN ahsrelvalue r ON r.resource_id = v.dim_Value and r.rel_Attr_id = 'P1' and r.period_to = '209912'
LEFT OUTER JOIN ahsrelvalue r1 ON r1.resource_id = v.dim_Value and r1.rel_Attr_id = 'Z1' and r1.period_to = '209912'
where c.row_no_1 = '1' and r.rel_value like '$?site1' and voucher_no like '$?trans'
order by voucher_no,action_Date
The key idea is lag(). However, date/time functions vary among databases. So, the idea is:
select t.*,
(date - lag(date) over (partition by transaction_no order by date)) as diff
from t;
I should note that this exact syntax might not work in your database -- because - may not even be defined on date/time values. However, lag() is a standard function and should be available.
For instance, in SQL Server, this would look like:
select t.*,
datediff(second, lag(date) over (partition by transaction_no order by date), date) / (24.0 * 60 * 60) as diff_days
from t;
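As a runnable illustration of the same idea on a third engine, here is a sketch using Python's sqlite3 module (SQLite 3.25 or later). The table t, the column name action_date, and the sample rows are assumptions for the demo; SQLite has no datediff(), so julianday() is used to turn the timestamps into day numbers:

```python
import sqlite3

# Hypothetical sample data; SQLite is used only to make the idea runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (transaction_no INTEGER, action_date TEXT);
    INSERT INTO t VALUES
        (100, '2019-05-01 12:00:00'),
        (100, '2019-05-02 18:00:00'),
        (100, '2019-05-05 18:00:00'),
        (200, '2019-05-03 00:00:00');
""")

rows = conn.execute("""
    SELECT transaction_no, action_date,
           julianday(action_date)
             - julianday(lag(action_date) OVER (PARTITION BY transaction_no
                                                ORDER BY action_date)) AS diff_days
    FROM t
    ORDER BY transaction_no, action_date
""").fetchall()

for row in rows:
    print(row)
```

The first row of each transaction_no has no predecessor, so lag() returns NULL and the difference is NULL (None in Python) rather than zero.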

Query to do DATEDIFF between fields on different rows

I need to add a DATEDIFF to a query that gives me the hours between the current row's field, and the previous row's same field.
EDIT: {Should I ORDER the entire query by ROUTED_DTM DESC, as well as making the ORDER BY in the DATEDIFF DESC?
On one row I have a ROUTED_DTM of '2019-05-07 15:36:13.000', the row above has a ROUTED_DTM of '2019-05-01 14:19:52.000'. I would expect AGE_IN_ROLE_DAY, AGE_IN_ROLE_HR, AGE_IN_ROLE_MIN, AGE_IN_ROLE_SEC to be 6, 1, 16, and 21 (in order). However, I get 0, 0, 0, -2.}
SELECT c.ID,
c.PAID_DT,
DATEDIFF(dd,
CASE WHEN c.ID_ADJ_FROM = '' THEN c.RECD_DT ELSE c.INPUT_DT END,
CASE WHEN c.PAID_DT = '1/1/1753' THEN CONVERT(DATE,GETDATE()) ELSE c.PAID_DT END) + 1 AS DAYS_OLD,
DATEDIFF(dd, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_DAY,
DATEDIFF(hh, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_HR,
DATEDIFF(MM, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_MIN,
DATEDIFF(ss, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_SEC,
h.QUEUE_ID,
h.QUEUE_DESC,
h.ROLE_ID,
h.ROLE_DESC,
h.ROUTED_DTM
FROM table1 c
LEFT JOIN table2 h
ON h.ID = c.ID
LEFT JOIN table3 q
ON q.QUEUE_ID = h.QUEUE_ID
LEFT JOIN table4 r
ON r.ROLE_ID = h.ROLE_ID
ORDER BY c.ID, h.ROUTED_DTM DESC
I want to add a DATEDIFF(s) before the h.QUEUE_ID column that gives the difference between the current row's h.ROUTED_DTM, and the previous row's h.ROUTED_DTM
Currently, the query returns the correct results, however, I am not sure how to add the new DATEDIFF to each row.
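You can use lag(). Put the previous row's value first so that the difference comes out positive:
datediff(day, lag(routed_dtm) over (order by routed_dtm), routed_dtm)
You might also want partition by c.id in the window clause, so that each ID is compared only with its own previous row. Also note that in SQL Server the datepart mm means month; use mi (or minute) for the minutes column.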
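The effect of the ordering direction and of the partition can be checked with a small sketch. It uses Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server, the history table and its third row are hypothetical, and strftime('%s', ...) replaces datediff(ss, ...); the first two timestamps are the ones quoted in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (ID INTEGER, ROUTED_DTM TEXT);
    INSERT INTO history VALUES
        (1, '2019-05-01 14:19:52'),
        (1, '2019-05-07 15:36:13'),
        (2, '2019-05-07 15:36:15');   -- unrelated ID, made up for the demo
""")

rows = conn.execute("""
    SELECT ID, ROUTED_DTM,
           lag(ROUTED_DTM) OVER (PARTITION BY ID ORDER BY ROUTED_DTM)      AS prev_asc,
           lag(ROUTED_DTM) OVER (PARTITION BY ID ORDER BY ROUTED_DTM DESC) AS prev_desc,
           CAST(strftime('%s', ROUTED_DTM) AS INTEGER)
             - CAST(strftime('%s', lag(ROUTED_DTM)
                                   OVER (PARTITION BY ID ORDER BY ROUTED_DTM)) AS INTEGER)
             AS age_in_role_sec
    FROM history
    ORDER BY ID, ROUTED_DTM
""").fetchall()

for row in rows:
    print(row)
```

With ascending order, lag() returns the earlier timestamp, and the difference for the second row is 522981 seconds (6 days, 1 hour, 16 minutes, 21 seconds). With descending order, lag() returns the later timestamp instead, and without PARTITION BY the "previous" row could belong to a different ID altogether.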

How to get the difference in dates in SQL Server

I'm having trouble writing a query that returns the difference between the UpdateDate and the creation date (CrDate) of a record if its ID is the lowest, and otherwise the difference between the most recent and the second most recent UpdateDate.
Here's my Query:
SELECT
a.ID, a.RequestID, b.KrStatus, b.CrDate , b.UpdateDate,
DATEDIFF (HOUR, b.CrDate, b.UpdateDate) AS TimeDifference,
CASE WHEN a.ID = (SELECT MAX(a.ID) FROM [dbo].[Krdocs_hist] a WHERE a.RequestID = 1)
THEN 'YES'
ELSE 'NO'
END AS isMax,
CASE WHEN a.ID = (SELECT MIN(a.ID) FROM [dbo].[Krdocs_hist] a WHERE a.RequestID = 1)
THEN 'YES'
ELSE 'NO'
END AS isMi
FROM [dbo].[Krdocs_hist] a, [dbo].Krdocs_Details_hist b
WHERE
a.RequestId = b.RequestId
and a.ID = b.ID
and a.RequestId = 1
ORDER BY b.RequestID
Here's my current result:
What I'd like to do is get the last possible record and check whether there was an existing one before it. If there wasn't, compare the UpdateDate and CrDate (UpdateDate minus CrDate). If there was a record before it, I want UpdateDate minus the previous UpdateDate.
Using this query:
SELECT b.Id, b.RequestId, b.UpdateDate, b.KrStatus
FROM [dbo].[Krdocs_Details_hist] b
WHERE b.RequestId = 1
Has this result:
And using this query:
SELECT a.*
FROM [dbo].[Krdocs_hist] a
WHERE RequestId = 1
Has this result:
UPDATE
Since LAG is available from SQL Server 2012 onward, you can use it like this:
SELECT
ID,
RequestID,
CrDate,
UpdateDate,
KrStatus,
DATEDIFF(HOUR, PreviousUpdateDate, UpdateDate) as TimeDifference
FROM
(SELECT
ID,
RequestID,
CrDate,
UpdateDate,
KrStatus,
LAG(UpdateDate, 1, CrDate) OVER (PARTITION BY RequestID ORDER BY ID) AS PreviousUpdateDate
FROM [dbo].Krdocs_Details_hist) as tmp
I think you can try like this:
SELECT
CASE
WHEN COUNT(*) <= 1 THEN DATEDIFF(HOUR,
(SELECT MIN(CrDate) FROM [dbo].Krdocs_Details_hist),
(SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist))
WHEN COUNT(*) > 1 THEN DATEDIFF(HOUR,
(SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist WHERE UpdateDate < ( SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist)),
(SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist))
END AS TimeDifference
FROM [dbo].Krdocs_Details_hist

Stored procedure only show 1 row per relationcode

I created a stored procedure that should show only 1 row per relationcode, with the latest bookingdate. This is what I currently have:
USE [fms]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[spLoadNonBooking] (@DateFrom as DATE, @DateTill as DATE)
AS
BEGIN
WITH NonBooking (Relationcode, Companyname, LatestBooking, LatestContact)
AS
(
SELECT r.[RELATIONCODE], r.[COMPANYNAME], b.[BOOKINGDATE], c.[DATE]
FROM [fms].[dbo].[Relation] r
LEFT OUTER JOIN [fms].[dbo].[Booking] b
ON b.[RELATIONCODE] = r.[RELATIONCODE]
LEFT OUTER JOIN [fms].[dbo].[Communication] c
ON c.[RELATIONCODE] = r.[RELATIONCODE]
WHERE b.[BOOKINGDATE] < DATEADD(month, -2, GETDATE()) AND b.[BOOKINGDATE] > DATEADD(year, -1, GETDATE())
GROUP BY r.[RELATIONCODE], r.[COMPANYNAME], b.[BOOKINGDATE], c.[DATE]
)
SELECT Relationcode, Companyname, LatestBooking, LatestContact FROM NonBooking
END
But that currently shows me the data like this:
It shows a line for every booking that was made, but I want 1 line per relationcode with the LATEST bookingdate. I am not sure how to do this; can anybody help me?
Use Row_Number() & Top 1 with ties
SELECT TOP 1 WITH ties Relationcode,
Companyname,
LatestBooking,
LatestContact
FROM NonBooking
ORDER BY Row_number() OVER(partition BY relationcode ORDER BY LatestBooking DESC)
If RELATIONCODE is unique in the Relation table, here is another way using OUTER APPLY, which is the better approach in my opinion:
SELECT r.[RELATIONCODE],
r.[COMPANYNAME],
oa.[BOOKINGDATE],
oa.[DATE]
FROM [fms].[dbo].[Relation] r
OUTER apply (SELECT TOP 1 b.[BOOKINGDATE],
c.[DATE]
FROM [fms].[dbo].[Booking] b
LEFT OUTER JOIN [fms].[dbo].[Communication] c
ON c.[RELATIONCODE] = r.[RELATIONCODE]
WHERE b.[BOOKINGDATE] < Dateadd(month, -2, Getdate())
AND b.[BOOKINGDATE] > Dateadd(year, -1, Getdate())
AND b.[RELATIONCODE] = r.[RELATIONCODE]
ORDER BY b.[BOOKINGDATE] DESC) oa

TSQL how to use if else in Where clause

I want to create a report with parameters that the user can select:
-IsApprovedDate
-IsCatchDate
I would like to know how to use if/else logic in the WHERE clause.
For example, if the user selects IsApprovedDate, the report should filter on the approved date; otherwise it should filter on the catch date. My query gets the top 10 fish by weight for each award. Here is my query:
;WITH CTE AS
(
select Rank() OVER (PARTITION BY c.trophyCatchCertificateTypeId order by c.catchWeight desc ) as rnk
,c.id,c.customerId, Cust.firstName + ' '+Cust.lastName as CustomerName
,CAST(CONVERT(varchar(10),catchWeightPoundsComponent)+'.'+CONVERT(varchar(10),catchWeightOuncesComponent) as numeric(6,2) ) WLBS
,c.catchGirth,c.catchLength,ct.description county
,t.description award--
,c.trophyCatchCertificateTypeId
,s.specificSpecies--
,c.speciesId
from Catches c
INNER JOIN TrophyCatchCertificateTypes t on c.trophyCatchCertificateTypeId = t.id
INNER JOIN Species s on c.speciesId = s.id
INNER JOIN Counties ct on c.countyId = ct.id
INNER JOIN Customers Cust on c.customerId = cust.id
Where c.bigCatchCertificateTypeId is not null
and c.catchStatusId =1
and c.speciesId =1 and c.isTrophyCatch =1
and c.catchDate >= @startDay and c.catchDate<=@endDay
)
Select * from CTE c1
Where rnk <=10
Just use conditional logic for this:
where . . . and
((@IsApprovedDate = 1 and c.ApprovedDate >= @startDay and c.ApprovedDate <= @endDay) or
(@IsCatchDate = 1 and c.catchDate >= @startDay and c.catchDate <= @endDay)
)
EDIT:
I would actually write this as:
where . . . and
((@IsApprovedDate = 1 and c.ApprovedDate >= @startDay and c.ApprovedDate < dateadd(day, 1, @endDay)) or
(@IsCatchDate = 1 and c.catchDate >= @startDay and c.catchDate < dateadd(day, 1, @endDay))
)
This is a safer construct because it works both when the date values have a time component and when they do not.
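Here is a minimal sketch of the flag-driven filter with a half-open date range, run against SQLite from Python. The catches table, its rows, the find helper, and the named parameters are all hypothetical, and datetime(x, '+1 day') is SQLite's counterpart to dateadd(day, 1, x):

```python
import sqlite3

# Hypothetical miniature of the Catches table; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catches (id INTEGER, catchDate TEXT, approvedDate TEXT);
    INSERT INTO catches VALUES
        (1, '2020-03-01 00:00:00', '2020-03-10 09:30:00'),
        (2, '2020-03-31 16:45:00', '2020-04-02 08:00:00'),
        (3, '2020-04-01 00:00:00', '2020-03-31 23:59:59');
""")

QUERY = """
    SELECT id FROM catches c
    WHERE (:is_approved = 1 AND c.approvedDate >= :start_day
                            AND c.approvedDate <  datetime(:end_day, '+1 day'))
       OR (:is_catch = 1    AND c.catchDate    >= :start_day
                            AND c.catchDate    <  datetime(:end_day, '+1 day'))
    ORDER BY id
"""

def find(is_approved, is_catch):
    """Return the ids matching March 2020 on whichever date the flags select."""
    params = {"is_approved": is_approved, "is_catch": is_catch,
              "start_day": "2020-03-01 00:00:00", "end_day": "2020-03-31 00:00:00"}
    return [r[0] for r in conn.execute(QUERY, params)]

print(find(is_approved=0, is_catch=1))
print(find(is_approved=1, is_catch=0))
```

Row 2 was caught at 16:45 on the last day of the range. A plain "<= end day" comparison would exclude it, while the "< end day + 1" form includes it and still excludes row 3, which falls exactly on the following midnight.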
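Performance is often better if you build the WHERE clause dynamically in your code, so that it contains only the condition for the selected date column, and then execute it. Keep passing the date values as parameters rather than concatenating them into the SQL text.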
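One way to build such a statement safely is sketched below in Python with sqlite3. The DATE_COLUMNS whitelist, the build_query and run helpers, and the catches table are assumptions for illustration; the point is only that the column name comes from a fixed mapping while the dates stay bound parameters:

```python
import sqlite3

# Only these column names can ever be spliced into the SQL text.
DATE_COLUMNS = {"approved": "approvedDate", "catch": "catchDate"}

def build_query(date_kind):
    """Build a statement that filters on exactly one date column."""
    column = DATE_COLUMNS[date_kind]  # raises KeyError for anything unexpected
    return (f"SELECT id FROM catches "
            f"WHERE {column} >= ? AND {column} < datetime(?, '+1 day') "
            f"ORDER BY id")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catches (id INTEGER, catchDate TEXT, approvedDate TEXT);
    INSERT INTO catches VALUES
        (1, '2020-03-01 00:00:00', '2020-03-10 09:30:00'),
        (2, '2020-03-31 16:45:00', '2020-04-02 08:00:00'),
        (3, '2020-04-01 00:00:00', '2020-03-31 23:59:59');
""")

def run(date_kind):
    """Execute the generated statement for March 2020 and return the ids."""
    params = ("2020-03-01 00:00:00", "2020-03-31 00:00:00")
    return [r[0] for r in conn.execute(build_query(date_kind), params)]

print(run("catch"))
print(run("approved"))
```

Each generated statement mentions a single column, so the database can plan it without the OR across two different columns. Whether that gives a measurable gain depends on the engine and the available indexes.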