Query to do DATEDIFF between fields on different rows - sql

I need to add a DATEDIFF to a query that gives me the hours between the current row's field, and the previous row's same field.
EDIT: Should I order the entire query by ROUTED_DTM DESC, as well as making the ORDER BY in the DATEDIFF DESC?
On one row I have a ROUTED_DTM of '2019-05-07 15:36:13.000'; the row above has a ROUTED_DTM of '2019-05-01 14:19:52.000'. I would expect AGE_IN_ROLE_DAY, AGE_IN_ROLE_HR, AGE_IN_ROLE_MIN, and AGE_IN_ROLE_SEC to be 6, 1, 16, and 21 (in order). However, I get 0, 0, 0, -2.
SELECT c.ID,
c.PAID_DT,
DATEDIFF(dd,
CASE WHEN c.ID_ADJ_FROM = '' THEN c.RECD_DT ELSE c.INPUT_DT END,
CASE WHEN c.PAID_DT = '1/1/1753' THEN CONVERT(DATE,GETDATE()) ELSE c.PAID_DT END) + 1 AS DAYS_OLD,
DATEDIFF(dd, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_DAY,
DATEDIFF(hh, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_HR,
DATEDIFF(mi, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_MIN,
DATEDIFF(ss, h.ROUTED_DTM, LAG(h.ROUTED_DTM) OVER (ORDER BY h.ROUTED_DTM DESC)) AS AGE_IN_ROLE_SEC,
h.QUEUE_ID,
h.QUEUE_DESC,
h.ROLE_ID,
h.ROLE_DESC,
h.ROUTED_DTM
FROM table1 c
LEFT JOIN table2 h
ON h.ID = c.ID
LEFT JOIN table3 q
ON q.QUEUE_ID = h.QUEUE_ID
LEFT JOIN table4 r
ON r.ROLE_ID = h.ROLE_ID
ORDER BY c.ID, h.ROUTED_DTM DESC
I want to add a DATEDIFF(s) before the h.QUEUE_ID column that gives the difference between the current row's h.ROUTED_DTM, and the previous row's h.ROUTED_DTM
Currently, the query returns the correct results, however, I am not sure how to add the new DATEDIFF to each row.

You can use lag():
datediff(day, lag(routed_dtm) over (order by routed_dtm), routed_dtm)
You might also want partition by c.id in the window clause.
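The expected values in the question's edit (6, 1, 16, 21) are the components of one interval, whereas each DATEDIFF call returns a total count in a single unit. As a rough illustration only, the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table name `history`, its sample rows, and the helper `split_seconds` are made up for the example, and because SQLite has no DATEDIFF, the gap is computed in seconds with strftime. It pairs each row with the previous row of the same id via LAG and then splits the gap into components.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, routed_dtm TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [
        (1, "2019-05-01 14:19:52"),
        (1, "2019-05-07 15:36:13"),
        (2, "2019-05-03 09:00:00"),
    ],
)

# Seconds between each row and the previous row of the same id.
# The first row of every id has no predecessor, so its gap is NULL.
rows = conn.execute(
    """
    SELECT id,
           routed_dtm,
           CAST(strftime('%s', routed_dtm) AS INTEGER)
             - CAST(strftime('%s', LAG(routed_dtm) OVER (
                   PARTITION BY id ORDER BY routed_dtm)) AS INTEGER) AS gap_sec
    FROM history
    ORDER BY id, routed_dtm
    """
).fetchall()


def split_seconds(total):
    """Split a non-negative second count into (days, hours, minutes, seconds)."""
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


for row_id, routed, gap in rows:
    parts = None if gap is None else split_seconds(gap)
    print(row_id, routed, gap, parts)
```

In T-SQL the same split could presumably be done from a single DATEDIFF(ss, ...) value with integer division and the modulo operator.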

Related

Other alternatives to achieve LIMIT in SQL

I have created an SQL query to get certain data with LIMIT so I can use it in datatable. It has 76288 rows.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber,
ContainerNumber, BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber,
b.SealNumber, b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 100
ORDER BY TransDate ASC;
0 and 100 is inside a variable that changes when the pagination is clicked.
If it's for the first hundred pages, it loads okay. But when I click the last page, it breaks saying timeout exceeded. Also, when I use the search function of the datatable, it's not working the way it should.
Example: I searched for dino in the datatable, it will say it has 95 records but will only show 1 record since the query is only between 0 and 10.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber, ContainerNumber,
BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber, b.SealNumber,
b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 10 AND AgentName LIKE '%dino%'
ORDER BY TransDate ASC;
I also tried TOP with EXCEPT (SELECT TOP 0... EXCEPT SELECT TOP 100...), but it only shows 9 rows.
UPDATE:
I was able to make it work by including the WHERE clause in the subquery. My only problem now is the ORDER BY. It only works in the current page shown which is data 1 - 10 but not for all the data.
Any alternatives? Your help is highly appreciated. Thanks!
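The fix mentioned in the update, filtering before the rows are numbered, can be illustrated with a small sketch. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the table `transactions`, its rows, and the helper `fetch_page` are invented for illustration. Because the search filter sits inside the subquery, ROW_NUMBER() counts only the matching rows, so the page bounds refer to positions within the search result, and the ordering covers the whole filtered set before a page is cut from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (trans_date TEXT, agent_name TEXT)")
# 20 sample days; odd days belong to "dino", even days to "rex".
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(f"2018-01-{day:02d}", "dino" if day % 2 else "rex") for day in range(1, 21)],
)


def fetch_page(search, first, last):
    """Return rows numbered first..last among the rows matching the search."""
    return conn.execute(
        """
        SELECT trans_date, agent_name
        FROM (
            SELECT trans_date, agent_name,
                   ROW_NUMBER() OVER (ORDER BY trans_date) AS r
            FROM transactions
            WHERE agent_name LIKE ?      -- filter first, then number
        ) AS f
        WHERE r BETWEEN ? AND ?
        ORDER BY trans_date
        """,
        (f"%{search}%", first, last),
    ).fetchall()


page = fetch_page("dino", 1, 5)
print(page)
```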

How do I remove certain duplicates in a complex SQL query

I am writing a query and need it to remove all duplicates of a.GenUserID while keeping the most recent login date (b.LogDateTime), but this date must be older than 6 months. If there are later dates, they have to be removed.
I hope this makes sense.
SELECT DISTINCT
a.GenUserID,
c.DeletionDate,
b.LogDateTime,
(CASE c.Disabled WHEN 0 THEN 'NO' else 'YES - ARCHIVED' end)
FROM RioReport.dbo.GenUser a
LEFT JOIN dbo.GenUserArchive c on a.GenUserID = c.GenUserID
LEFT JOIN dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo
WHERE(a.Disabled=0 or c.Disabled=0)
AND c.DeletionDate IS NOT NULL
AND ((DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime IS NULL))
ORDER BY a.GenUserID, b.LogDateTime desc
You could add the row_number() information to your query, and wrap that query into an outer query that just takes the records with number 1 from that result:
select *
from (
select a.GenUserID,
c.DeletionDate,
b.LogDateTime,
case c.Disabled when 0 then 'NO' else 'YES - ARCHIVED' end as disabled,
row_number() over (partition by a.GenUserID
order by b.LogDateTime desc) as rn
from RioReport.dbo.GenUser a
inner join dbo.GenUserArchive c
on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b
on a.GenUserID = b.ExtraInfo
where (a.Disabled=0 or c.Disabled=0)
and c.DeletionDate is not null
and (DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime is null)
) t
where rn = 1
order by GenUserID
Note that you can turn the first left join into an inner join without any change to the result set, since you have a non-null check on one of its fields. inner join is then preferred, and might give a performance improvement.
If GenUserAccessHistory.LogDateTime is always non-null, then you can avoid the test or b.LogDateTime is null by moving the DateAdd(MM, -6, GetDate()) > b.LogDateTime condition to the appropriate join on clause.
The generated row number will be given in order of descending LogDateTime values, and restart from 1 for every different user.
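To make the numbering concrete, here is a minimal sketch run through Python's sqlite3 module rather than SQL Server. The table `access_history` and its rows are hypothetical. It shows the row numbers restarting for each user and the outer query that keeps only the rows numbered 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_history (user_id TEXT, log_dt TEXT)")
conn.executemany(
    "INSERT INTO access_history VALUES (?, ?)",
    [
        ("alice", "2017-01-10"),
        ("alice", "2017-03-02"),
        ("bob", "2016-11-30"),
    ],
)

# Number each user's rows from the newest log date to the oldest.
numbered_sql = """
    SELECT user_id, log_dt,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY log_dt DESC) AS rn
    FROM access_history
"""
numbered = conn.execute(numbered_sql + " ORDER BY user_id, rn").fetchall()

# Wrap the numbered query and keep only the newest row per user.
latest = conn.execute(
    "SELECT user_id, log_dt FROM (" + numbered_sql + ") AS t "
    "WHERE rn = 1 ORDER BY user_id"
).fetchall()

print(numbered)
print(latest)
```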
Alternative without window functions
row_number() and the other ranking window functions are supported since SQL Server 2005. In comments you write you cannot use it. If that is the case, here is an alternative that uses a common table expression (also supported since SQL Server 2005) with NOT EXISTS instead of a window function:
;with cte as (
select a.GenUserID,
c.DeletionDate,
b.LogDateTime,
case c.Disabled when 0 then 'NO' else 'YES - ARCHIVED' end as disabled
from RioReport.dbo.GenUser a
inner join dbo.GenUserArchive c
on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b
on a.GenUserID = b.ExtraInfo
where (a.Disabled=0 or c.Disabled=0)
and c.DeletionDate is not null
and (DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime is null)
)
select *
from cte main
where LogDateTime is null
or not exists (select 1
from cte sub
where sub.GenUserID = main.GenUserID
and sub.LogDateTime > main.LogDateTime)
order by GenUserID
Try with the below query.
;WITH CTE_Group
AS(
SELECT
ROW_NUMBER() OVER (PARTITION BY a.GenUserID ORDER BY b.LogDateTime DESC) as RNO,
a.GenUserID,
c.DeletionDate,
b.LogDateTime,
(CASE c.Disabled WHEN 0 THEN 'NO' else 'YES - ARCHIVED' end) IsArchived
FROM RioReport.dbo.GenUser a
LEFT JOIN dbo.GenUserArchive c on a.GenUserID = c.GenUserID
LEFT JOIN dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo
WHERE(a.Disabled=0 or c.Disabled=0)
AND c.DeletionDate IS NOT NULL
AND ((DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime IS NULL)))
SELECT GenUserID,
DeletionDate,
LogDateTime,
IsArchived
FROM CTE_Group
WHERE RNO=1
Use a CTE and a window function:
;with ctr as (
select a.GenUserID, c.DeletionDate, b.LogDateTime,
row_number() over (partition by a.GenUserID order by b.LogDateTime desc) rnk
from RioReport.dbo.GenUser a
left join dbo.GenUserArchive c on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo )
select a.GenUserID, a.DeletionDate, a.LogDateTime,
case when datediff(mm, a.LogDateTime, getdate()) < 6 then 'NO' else 'YES - ARCHIVED' end as IsArchived
from ctr a where a.rnk = 1

How to compute average time required to post a comment on stackexchange using SQL?

I am looking for a SQL query that computes the average time to comment (measured for every month).
I was able to write a query that measures the average time between the original post's datetime and the comment's datetime, but this is not correct: the time should be measured between the current comment and the previous one, since they are related most of the time.
select
dateadd(month, datediff(month, 0, Comments.creationdate),0) [Date],
AVG(CAST(DATEDIFF(hour, Posts.CreationDate, Comments.creationdate ) AS BigInt)) [DelayHours]
from comments
INNER JOIN posts ON Comments.PostId = Posts.Id
GROUP BY
dateadd(month, datediff(month, 0, Comments.creationdate),0)
ORDER BY Date
I think something like this should work. Sorry, I cannot test it at the moment; I apologize in case I made a misprint.
WITH cte1 AS
(
SELECT c.PostId, c.creationdate,
ROW_NUMBER() OVER (PARTITION BY c.PostId ORDER BY c.creationdate) AS rn
FROM comments c
)
SELECT dateadd(month, datediff(month, 0, a.creationdate),0) [Date],
AVG(diff_hr) AS avg_diff
FROM
(
SELECT a1.PostId, a1.creationdate,
CASE
WHEN a1.rn = 1 THEN
CAST(DATEDIFF(hour,p.creationdate,a1.creationdate) AS BIGINT)
ELSE
CAST(DATEDIFF(hour,a2.creationdate,a1.creationdate) AS BIGINT)
END AS diff_hr
FROM cte1 a1
INNER JOIN posts p ON (p.Id = a1.PostId)
LEFT JOIN cte1 a2 ON (a2.PostId = a1.PostId AND a2.rn = a1.rn-1)
)a
GROUP BY dateadd(month, datediff(month, 0, a.creationdate),0)
Update
With SQL Server 2012 or later, LAG will simplify the solution. I noticed the comment about the version too late.
Update 2: Misprints fixed (missing FROM clause added, and p.PostId changed to p.Id to match the table definition).
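As the first update says, LAG removes the need for the self-join on rn - 1. The sketch below is an illustration only: it runs on SQLite through Python's sqlite3 module, with invented sample rows and julianday arithmetic standing in for DATEDIFF, so the delays are fractional hours rather than whole hour boundaries. For each comment it takes the previous comment on the same post, falls back to the post's creation date for the first comment, and averages the delay per month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER, creationdate TEXT);
    CREATE TABLE comments (postid INTEGER, creationdate TEXT);
    INSERT INTO posts VALUES (1, '2020-01-01 00:00:00');
    INSERT INTO comments VALUES
        (1, '2020-01-01 02:00:00'),
        (1, '2020-01-01 06:00:00'),
        (1, '2020-02-01 06:00:00');
    """
)

rows = conn.execute(
    """
    SELECT ym, AVG(diff_hr) AS avg_diff
    FROM (
        SELECT strftime('%Y-%m', c.creationdate) AS ym,
               -- previous comment on the same post, else the post itself
               (julianday(c.creationdate)
                - julianday(COALESCE(
                      LAG(c.creationdate) OVER (PARTITION BY c.postid
                                                ORDER BY c.creationdate),
                      p.creationdate))) * 24.0 AS diff_hr
        FROM comments c
        JOIN posts p ON p.id = c.postid
    ) AS d
    GROUP BY ym
    ORDER BY ym
    """
).fetchall()

for ym, avg_diff in rows:
    print(ym, round(avg_diff, 2))
```

With the sample data, the January delays are 2 and 4 hours, and the February comment is measured against the last January comment, not against the post.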

How to output only one max value from this query in SQL?

Yesterday Thomas helped me a lot by providing exactly the query I wanted. Now I need a variant of it, and I hope someone can help me out.
I want it to output only one row, namely a max value - but it has to build on the algorithm in the following query:
WITH Calendar AS (SELECT CAST(@StartDate AS datetime) AS Date
UNION ALL
SELECT DATEADD(d, 1, Date) AS Expr1
FROM Calendar AS Calendar_1
WHERE (DATEADD(d, 1, Date) < @EndDate))
SELECT C.Date, C2.Country, COALESCE (SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
FROM Calendar AS C CROSS JOIN
Country AS C2 LEFT OUTER JOIN
Requests AS R ON C.Date BETWEEN R.[Start date] AND R.[End date] AND R.CountryID = C2.CountryID
WHERE (C2.Country = @Country)
GROUP BY C.Date, C2.Country OPTION (MAXRECURSION 0)
The output from above will be like:
Date Country Allocated testers
06/01/2010 Chile 3
06/02/2010 Chile 4
06/03/2010 Chile 0
06/04/2010 Chile 0
06/05/2010 Chile 19
but what I need right now is
Allocated testers
19
That is: only one column and one row, the max value itself, for the period of dates and the country selected via the parameters (which already exist).
Use ORDER BY and LIMIT:
ORDER BY [Allocated testers] DESC LIMIT 1
EDITED
LIMIT does not exist in SQL Server, so use ORDER BY and TOP instead:
select TOP 1 .... ORDER BY [Allocated testers] DESC
WITH Calendar
AS (
SELECT
CAST(@StartDate AS datetime) AS Date
UNION ALL
SELECT
DATEADD(d, 1, Date) AS Expr1
FROM
Calendar AS Calendar_1
WHERE
( DATEADD(d, 1, Date) < @EndDate )
)
SELECT TOP 1 *
FROM
(
SELECT
C.Date
,C2.Country
,COALESCE(SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
FROM
Calendar AS C
CROSS JOIN Country AS C2
LEFT OUTER JOIN Requests AS R
ON C.Date BETWEEN R.[Start date] AND R.[End date]
AND R.CountryID = C2.CountryID
WHERE
( C2.Country = @Country )
GROUP BY
C.Date
,C2.Country
) lst
ORDER BY lst.[Allocated testers] DESC
OPTION
( MAXRECURSION 0 )
Full example following the discussion in @Salil's answer:
WITH Calendar AS (SELECT CAST(@StartDate AS datetime) AS Date
UNION ALL
SELECT DATEADD(d, 1, Date) AS Expr1
FROM Calendar AS Calendar_1
WHERE (DATEADD(d, 1, Date) < @EndDate))
SELECT TOP 1 C.Date, C2.Country, COALESCE (SUM(R.[Amount of people per day needed]), 0) AS [Allocated testers]
FROM Calendar AS C CROSS JOIN
Country AS C2 LEFT OUTER JOIN
Requests AS R ON C.Date BETWEEN R.[Start date] AND R.[End date] AND R.CountryID = C2.CountryID
WHERE (C2.Country = @Country)
GROUP BY C.Date, C2.Country
ORDER BY 3 DESC
OPTION (MAXRECURSION 0)
The ORDER BY 3 means order by the third column in the SELECT list, so if you remove the first two columns, change this accordingly.
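Since only the single maximum is wanted, another option is to wrap the per-day totals in a derived table and take MAX over it, which returns exactly one column and one row. The following sketch shows that shape on SQLite through Python's sqlite3 module, not on SQL Server: SQLite needs WITH RECURSIVE and date() where the original uses DATEADD, and has no MAXRECURSION option. The `requests` table, its column names, and the sample rows are simplified stand-ins for the tables in the question, and the Country join is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (start_date TEXT, end_date TEXT, people INTEGER)"
)
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("2010-06-01", "2010-06-02", 3),
        ("2010-06-02", "2010-06-02", 1),
        ("2010-06-05", "2010-06-05", 19),
    ],
)

max_sql = """
WITH RECURSIVE calendar(day) AS (
    SELECT :start_date
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar
    WHERE date(day, '+1 day') < :end_date
)
SELECT MAX(allocated) AS allocated_testers
FROM (
    -- one row per calendar day with the summed demand (0 when none)
    SELECT c.day, COALESCE(SUM(r.people), 0) AS allocated
    FROM calendar c
    LEFT JOIN requests r ON c.day BETWEEN r.start_date AND r.end_date
    GROUP BY c.day
) AS daily
"""
(max_allocated,) = conn.execute(
    max_sql, {"start_date": "2010-06-01", "end_date": "2010-06-06"}
).fetchone()
print(max_allocated)
```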

SQL subquery question

I have the following SQL
SELECT
Seq.UserSessionSequenceID,
Usr.SessionGuid,
Usr.UserSessionID,
Usr.SiteID,
Seq.Timestamp,
Seq.UrlTitle,
Seq.Url
FROM
tblUserSession Usr
INNER JOIN
tblUserSessionSequence Seq ON Usr.UserSessionID = Seq.UserSessionID
WHERE
(Usr.Timestamp > DATEADD(mi, -45, GETDATE())) AND (Usr.SiteID = 15)
ORDER BY Usr.Timestamp DESC
Pretty simple stuff. By nature there are multiple rows per UserSessionID in tblUserSessionSequence. I ONLY want to return the latest (top 1) row for each unique UserSessionID. How do I do that?
You can use the windowing function ROW_NUMBER to number the rows for each user and select only those rows that have row number 1.
SELECT
UserSessionSequenceID,
SessionGuid,
UserSessionID,
SiteID,
Timestamp,
UrlTitle,
Url
FROM (
SELECT
Seq.UserSessionSequenceID,
Usr.SessionGuid,
Usr.UserSessionID,
Usr.SiteID,
Usr.Timestamp AS UsrTimestamp,
Seq.Timestamp,
Seq.UrlTitle,
Seq.Url,
ROW_NUMBER() OVER (PARTITION BY Usr.UserSessionID
ORDER BY Seq.UserSessionSequenceID DESC) AS rn
FROM
tblUserSession Usr
INNER JOIN
tblUserSessionSequence Seq ON Usr.UserSessionID = Seq.UserSessionID
WHERE
(Usr.Timestamp > DATEADD(mi, -45, GETDATE())) AND (Usr.SiteID = 15)
) T1
WHERE rn = 1
ORDER BY UsrTimestamp DESC
If you're looking to return only a single row in your query (i.e., the ID with the latest timestamp), just change
SELECT
to
SELECT TOP 1
If you're looking to obtain a single row for each UserSessionID, but you want to ensure that you get the one with the latest TimeStamp, that's slightly more complex.
You could do something like this:
SELECT
Seq.UserSessionSequenceID,
Usr.SessionGuid,
Usr.UserSessionID,
Usr.SiteID,
Seq.Timestamp,
Seq.UrlTitle,
Seq.Url
FROM
tblUserSession Usr
INNER JOIN
(SELECT
UserSessionSequenceID,
UserSessionID,
Timestamp,
UrlTitle,
Url,
ROW_NUMBER() over (PARTITION BY UserSessionID ORDER BY UserSessionSequenceID DESC) AS nbr
FROM tblUserSessionSequence) Seq ON Usr.UserSessionID = Seq.UserSessionID AND Seq.nbr = 1
WHERE
(Usr.Timestamp > DATEADD(mi, -45, GETDATE())) AND (Usr.SiteID = 15)
ORDER BY Usr.Timestamp DESC