TSQL OVER (PARTITION BY ... ) - sql

For my next trick, I would like to select only the most recent event for each client. Instead of four events for 000017 I want one.
OK c_id e_date e_ser e_att e_recip Age c_cm e_staff rn
--> 000017 2013-04-02 00:00:00.000 122 1 1 36 90510 90510 15
--> 000017 2013-02-26 00:00:00.000 122 1 1 36 90510 90510 20
--> 000017 2013-02-12 00:00:00.000 122 1 1 36 90510 90510 24
--> 000017 2013-01-29 00:00:00.000 122 1 1 36 90510 90510 27
--> 000188 2012-11-02 00:00:00.000 160 1 1 31 1289 1289 44
--> 001713 2013-10-01 00:00:00.000 142 1 1 26 2539 2539 1
--> 002531 2013-07-12 00:00:00.000 190 1 1 61 1689 1689 21
--> 002531 2013-06-14 00:00:00.000 190 1 1 61 1689 1689 30
--> 002531 2013-06-07 00:00:00.000 190 1 1 61 1689 1689 31
--> 002531 2013-05-28 00:00:00.000 122 1 1 61 1689 1689 33
Here is the query that got me to this stage (perhaps you have some suggestions to improve this as well, the extra nested query creating t2 table is probably excessive.) Thank you all!!!
SELECT TOP(10)*
FROM (
SELECT *
FROM (
SELECT (SELECT CASE WHEN
(e_att IN (1,2)
AND e_date > DATEADD(month, -12, getdate())
AND e_ser NOT IN (100,115)
AND e_recip NOT IN ('2','7')
AND (( (e_recip = '3') AND (DATEDIFF(Year, c_bd, GetDate())>10) ) OR (e_recip <> '3') )
AND c_cm = e_staff)
THEN '-->'
WHEN 1=1 THEN ''
END
) AS 'OK'
,c_id, e_date, e_ser, e_att, e_recip, DATEDIFF(Year, c_bd, GetDate()) AS 'Age', c_cm, e_staff
,row_number() OVER (PARTITION BY c_id ORDER BY e_date DESC) rn
FROM events INNER JOIN client ON e_case_no = c_id
LEFT OUTER JOIN doc ON doc.doc_dbid = client.c_id
WHERE client.c_id IN ( /* confidential query */ )
AND e_date > DATEADD(month, -12, getdate())
AND e_ser BETWEEN 11 AND 1000
GROUP BY c_id, e_date, e_ser, e_att, e_recip, c_bd, c_cm, e_staff
) t1
) t2
WHERE OK = '-->'
ORDER BY c_id, e_date DESC

It looks like the following produces the row number, sorted by date, per client:
,row_number() OVER (PARTITION BY c_id ORDER BY e_date DESC) rn
So adding where rn=1 should yield the most recent event per client:
) t1
WHERE rn = 1
) t2
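To see the rn = 1 filter in isolation, here is a minimal sketch. It assumes a cut-down, hypothetical events table with only three columns, and it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server so that it can be run anywhere; the ROW_NUMBER() OVER (PARTITION BY ...) syntax is the same in both.

```python
import sqlite3

# Hypothetical, cut-down events table: only the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (c_id TEXT, e_date TEXT, e_ser INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("000017", "2013-04-02", 122),
        ("000017", "2013-02-26", 122),
        ("000017", "2013-02-12", 122),
        ("000188", "2012-11-02", 160),
        ("002531", "2013-07-12", 190),
        ("002531", "2013-05-28", 122),
    ],
)

latest = conn.execute(
    """
    SELECT c_id, e_date, e_ser
    FROM (
        SELECT c_id, e_date, e_ser,
               ROW_NUMBER() OVER (PARTITION BY c_id ORDER BY e_date DESC) AS rn
        FROM events
    ) AS t1
    WHERE rn = 1          -- rn = 1 is the newest row in each c_id partition
    ORDER BY c_id
    """
).fetchall()

for row in latest:
    print(row)
```

Each client appears exactly once, with its most recent e_date. The filter has to sit in an outer query because a window function cannot be referenced in the WHERE clause of the SELECT that computes it.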

Here are some improvements to your original query:
SELECT TOP(10) *
FROM (
SELECT '-->' AS 'OK' -- always '-->' now; see the WHERE clause
,c_id, e_date, e_ser, e_att, e_recip, DATEDIFF(Year, c_bd, GetDate()) AS 'Age', c_cm, e_staff
,row_number() OVER (PARTITION BY c_id ORDER BY e_date DESC) rn
FROM events INNER JOIN client ON e_case_no = c_id
LEFT OUTER JOIN doc ON doc.doc_dbid = client.c_id
WHERE client.c_id IN ( /* confidential query */ )
-- these conditions used to live in the CASE and were filtered on afterwards; putting them in the WHERE is more efficient
AND e_att IN (1,2)
AND e_date > DATEADD(month, -12, getdate())
AND e_ser NOT IN (100,115)
AND ( (e_recip = '3' AND DATEDIFF(Year, c_bd, GetDate()) > 10) OR e_recip NOT IN ('2', '3', '7') )
AND c_cm = e_staff
AND e_ser BETWEEN 11 AND 1000
GROUP BY c_id, e_date, e_ser, e_att, e_recip, c_bd, c_cm, e_staff
) t2
ORDER BY c_id, e_date DESC
Besides removing some unneeded parentheses, moving the conditions from the CASE expression into the WHERE clause means you no longer need to filter on OK in the outer query, which makes the whole thing simpler.
Add the WHERE rn = 1 filter on the row_number column from McGarnagle's answer and you should get the results you want.

Related

How to join 3 records as 1 record base on a same date?

These are some sample queries I wrote:
SELECT
CAST(datecolumn AS DATE) AS DateColumn,
COUNT(*) AS count
FROM
dbo.myTableName
WHERE
status = 'stage1'
GROUP BY CAST(datecolumn AS DATE) ORDER BY DateColumn DESC;
SELECT
CAST(datecolumn AS DATE) AS DateColumn,
COUNT(*) AS count
FROM
dbo.myTableName
WHERE
status = 'stage2'
GROUP BY CAST(datecolumn AS DATE) ORDER BY DateColumn DESC;
This is the output from the 1st query:
DateColumn count
------------------
2022-05-26 23
2022-05-25 51
2022-05-24 39
2022-05-23 55
2022-05-22 27
2022-05-21 90
and this is the output from the 2nd query:
DateColumn count
-----------------
2022-05-26 31
2022-05-25 67
2022-05-24 38
2022-05-23 54
2022-05-22 28
I want a single query that outputs it like this:
DateColumn stage1count stage2count
-----------------------------------
2022-05-26 23 31
2022-05-25 51 67
2022-05-24 39 38
2022-05-23 55 54
2022-05-22 27 28
Thanks for answer
Can you try this:
select cast(datecolumn as DATE) as DateColumn,
sum(case when status = 'stage1' then 1 else 0 end) as stage1count,
sum(case when status = 'stage2' then 1 else 0 end) as stage2count
from dbo.myTableName
where status in ('stage1', 'stage2')
group by cast(datecolumn as DATE)
order by DateColumn DESC
Another note: most SQL systems treat datecolumn and DateColumn as the same name, so it can look ambiguous which one the GROUP BY and ORDER BY clauses refer to. The GROUP BY repeats the CAST expression explicitly, so it groups by the date value. As far as I know, SQL Server resolves a name in ORDER BY against the select-list aliases first, so the ORDER BY sorts by the cast value as well. If you want to avoid the ambiguity altogether, give the alias a name that differs from the base column.
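The conditional-aggregation query above can be tried on a tiny data set. This is only a sketch: the rows are invented, SQLite (via Python's sqlite3) stands in for SQL Server, and SQLite's date() replaces CAST(... AS DATE). The alias is deliberately named day so that it cannot be confused with the base column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTableName (datecolumn TEXT, status TEXT)")
rows = (
    [("2022-05-26 09:00:00", "stage1")] * 2
    + [("2022-05-26 10:30:00", "stage2")] * 3
    + [("2022-05-25 08:15:00", "stage1")] * 1
    + [("2022-05-25 17:45:00", "stage3")] * 4   # excluded by the WHERE clause
)
conn.executemany("INSERT INTO myTableName VALUES (?, ?)", rows)

counts = conn.execute(
    """
    SELECT date(datecolumn) AS day,
           SUM(CASE WHEN status = 'stage1' THEN 1 ELSE 0 END) AS stage1count,
           SUM(CASE WHEN status = 'stage2' THEN 1 ELSE 0 END) AS stage2count
    FROM myTableName
    WHERE status IN ('stage1', 'stage2')
    GROUP BY date(datecolumn)
    ORDER BY day DESC
    """
).fetchall()

for row in counts:
    print(row)
```

A day that has only stage1 rows still shows up, with a stage2count of 0, because every row passes through both CASE expressions.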
@hewszz, you mention that you also need this for the case where you have two tables. This might do the job if you have two tables:
select t1.DateColumn, stage1count, stage2count
from (select cast(datecolumn as DATE) as DateColumn,
count(*) as stage1count
from dbo.myTableName1
where status = 'stage1'
group by cast(datecolumn as DATE)) t1
full outer join
(select cast(datecolumn as DATE) as DateColumn,
count(*) as stage2count
from dbo.myTableName2
where status = 'stage2'
group by cast(datecolumn as DATE)) t2
on t1.DateColumn = t2.DateColumn
order by t1.DateColumn DESC
By grouping each table separately we make sure that DateColumn is unique on each side, so each row will join with at most one row from the other grouped query. By using a full outer join we make sure no rows get lost when we have only a stage1 or a stage2 record for a given day.

SQL: selecting rows where column value changed last time

I need to get the latest date on which the number changed. I have this SQL statement:
Select
a.group, a.date, a.number
From
xx.dbo.list a
Where
a.group in ('10', '10NC', '210')
And a.date >= '2018-06-01'
And a.number > 0
And a.number <> (Select Top 1 b.number
From xxx.dbo.list b
Where b.group = a.group
And b.date >= '2018-06-01'
And b.number > 0
And b.date < a.date
Order by b.date desc)
order by a.date desc
I have a table that looks like this
Group date Number
--------------------------
10 2018-02-06 4
10 2018-04-06 4
10 2018-06-12 4
10NC 2018-02-06 68
10NC 2018-04-06 35
10NC 2018-06-11 35
10NC 2018-06-12 68
10NC 2018-06-13 35
210 2018-06-02 94
210 2018-06-04 100
210 2018-06-06 100
210 2018-06-07 93
I get this output now, but I only want to get the rows with X
Group date Number
------------------------------
10NC 2018-06-12 68
10NC 2018-06-13 35 X
210 2018-06-04 100
210 2018-06-07 93 X
Can anyone help?
You would use lag():
select a.*
from (select a.[group], a.[date], a.number, lag(a.number) over (partition by a.[group] order by a.[date]) as prev_number
From xx.dbo.list a
where a.[group] in ('10', '10NC', '210') And
a.date >= '2018-06-01' And
a.number > 0
) a
where prev_number <> number;
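The lag() query returns every row where the number differs from the previous one. To keep only the latest change per group (the rows marked X in the question), one option is to rank the change rows by date and keep the first. The sketch below uses the question's sample data; the column names grp and dt are hypothetical renames that avoid reserved words, and SQLite via Python's sqlite3 stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (grp TEXT, dt TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO list VALUES (?, ?, ?)",
    [
        ("10", "2018-02-06", 4), ("10", "2018-04-06", 4), ("10", "2018-06-12", 4),
        ("10NC", "2018-02-06", 68), ("10NC", "2018-04-06", 35),
        ("10NC", "2018-06-11", 35), ("10NC", "2018-06-12", 68),
        ("10NC", "2018-06-13", 35),
        ("210", "2018-06-02", 94), ("210", "2018-06-04", 100),
        ("210", "2018-06-06", 100), ("210", "2018-06-07", 93),
    ],
)

last_changes = conn.execute(
    """
    WITH lagged AS (
        -- previous number within the same group, in date order
        SELECT grp, dt, number,
               LAG(number) OVER (PARTITION BY grp ORDER BY dt) AS prev_number
        FROM list
        WHERE grp IN ('10', '10NC', '210')
          AND dt >= '2018-06-01'
          AND number > 0
    ),
    changes AS (
        -- keep only rows where the number changed, newest change first
        SELECT grp, dt, number,
               ROW_NUMBER() OVER (PARTITION BY grp ORDER BY dt DESC) AS rn
        FROM lagged
        WHERE prev_number <> number
    )
    SELECT grp, dt, number FROM changes WHERE rn = 1 ORDER BY grp
    """
).fetchall()

for row in last_changes:
    print(row)
```

Group 10 drops out because its number never changes inside the date range; the first row of each group has a NULL prev_number, so the <> comparison filters it out as well.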
Is this what you expected?
DECLARE @List TABLE ([Group] VARCHAR(100), [Date] DATE, Number INT)
INSERT INTO @List
SELECT '10','2018-02-06',4
UNION ALL
SELECT '10','2018-04-06',4
UNION ALL
SELECT '10','2018-06-12',4
UNION ALL
SELECT '10NC','2018-02-06',68
UNION ALL
SELECT '10NC','2018-04-06',35
UNION ALL
SELECT '10NC','2018-06-11',35
UNION ALL
SELECT '10NC','2018-06-12',68
UNION ALL
SELECT '10NC','2018-06-13',35
UNION ALL
SELECT '210','2018-06-02',94
UNION ALL
SELECT '210','2018-06-04',100
UNION ALL
SELECT '210','2018-06-06',100
UNION ALL
SELECT '210','2018-06-07',93
;WITH CTE AS
(
SELECT
*
,RN = ROW_NUMBER() OVER (Partition by [Group] ORDER BY [DATE] DESC)
FROM @List
WHERE
[Date] >= '2018-06-01'
AND [Group] in ('10', '10NC', '210')
And Number > 0
)
SELECT * FROM CTE WHERE RN = 1
Note: I am posting this directly as an answer because I don't have enough reputation to ask questions in comments.

MS SQL get aggregate datetime difference by status

I have the table below in SQL Server.
======================================================
UnitID Status DateTime Value
======================================================
101 A 01/12/2017 00:02:10 10
101 A 01/12/2017 00:02:40 25
101 A 01/12/2017 00:03:20 18
101 B 01/12/2017 00:03:55 30
101 B 01/12/2017 00:04:05 10
101 B 01/12/2017 00:04:30 20
101 B 01/12/2017 00:04:50 10
101 A 01/12/2017 00:05:00 28
101 A 01/12/2017 00:05:50 18
101 A 01/12/2017 00:06:20 18
102 A 01/12/2017 00:02:10 10
102 A 01/12/2017 00:02:40 25
102 A 01/12/2017 00:03:20 18
102 B 01/12/2017 00:03:55 30
102 B 01/12/2017 00:04:05 10
102 B 01/12/2017 00:04:30 20
102 B 01/12/2017 00:04:50 10
102 A 01/12/2017 00:05:00 28
102 A 01/12/2017 00:05:50 18
102 A 01/12/2017 00:06:20 18
From this table I need the output shown below.
===========================================
UnitID StatusA StatusB MaxValue
===========================================
101 02:30 00:55 30
102 02:30 00:55 30
What I need is the total time spent in each status. How can I achieve this in an MS SQL query? Here 02:30 is the total duration of status "A" in the table.
Thank you in advance.
As far as I know you cannot have status in different columns, only by row.
SELECT [UnitID], [Status], MAX([DateTime]) - MIN([DateTime]), MAX([Value])
FROM [theTable]
GROUP BY [UnitID], [Status]
Output would be like
101 A 02:30 30
101 B 00:55 30
102 A 02:30 30
102 B 00:55 30
If you have fixed states of A and B you can go messy and do this:
SELECT UnitID, A, B, MaxValue
FROM
(
SELECT [UnitID], MAX([DateTime]) - MIN([DateTime]) AS A, null AS B, MAX([Value]) AS MaxValue
FROM [theTable]
WHERE Status = 'A'
GROUP BY [UnitID]
UNION ALL
SELECT [UnitID], null, MAX([DateTime]) - MIN([DateTime]), MAX([Value])
FROM [theTable]
WHERE Status = 'B'
GROUP BY [UnitID]
) x
You can do what you need with the following query. I separated each step into its own CTE so you can see, step by step, how to get to your result. LAG retrieves the value from the previous row (splitting by the PARTITION BY columns and ordering by the ORDER BY columns).
;WITH LaggedValues AS
(
SELECT
M.UnitID,
M.Status,
M.DateTime,
LaggedDateTime = LAG(M.DateTime) OVER (PARTITION BY M.UnitID ORDER BY M.DateTime ASC),
LaggedStatus = LAG(M.Status) OVER (PARTITION BY M.UnitID ORDER BY M.DateTime ASC)
FROM
Measures AS M
),
TimeDifferences AS
(
SELECT
T.*,
SecondDifference = CASE
WHEN T.Status = T.LaggedStatus THEN DATEDIFF(SECOND, T.LaggedDateTime, T.DateTime) END
FROM
LaggedValues AS T
),
TotalsByUnitAndStatus AS
(
SELECT
T.UnitID,
T.Status,
SecondDifference = SUM(T.SecondDifference)
FROM
TimeDifferences AS T
GROUP BY
T.UnitID,
T.Status
),
TotalsByUnit AS -- Conditional aggregation (alternative to PIVOT)
(
SELECT
T.UnitID,
StatusA = MAX(CASE WHEN T.Status = 'A' THEN T.SecondDifference END),
StatusB = MAX(CASE WHEN T.Status = 'B' THEN T.SecondDifference END)
FROM
TotalsByUnitAndStatus AS T
GROUP BY
T.UnitID
)
SELECT
T.UnitID,
StatusA = CONVERT(VARCHAR(10), T.StatusA / 60) + ':' + CONVERT(VARCHAR(10), T.StatusA % 60),
StatusB = CONVERT(VARCHAR(10), T.StatusB / 60) + ':' + CONVERT(VARCHAR(10), T.StatusB % 60)
FROM
TotalsByUnit AS T
You can get the difference for each group:
select unitid, status, min(datetime) as mindt, max(datetime) as maxdt, max(value) as maxvalue
from (select t.*,
row_number() over (partition by unitid order by datetime) as seqnum,
row_number() over (partition by unitid, status order by datetime) as seqnum_s
from t
) t
group by unitid, status, (seqnum - seqnum_s);
This solves the "gaps-and-islands" problem. Now you can get the information you want using conditional aggregation:
with islands as (
select unitid, status, min(datetime) as mindt, max(datetime) as maxdt, max(value) as maxvalue
from (select t.*,
row_number() over (partition by unitid order by datetime) as seqnum,
row_number() over (partition by unitid, status order by datetime) as seqnum_s
from t
) t
group by unitid, status, (seqnum - seqnum_s)
)
select unitid,
sum(case when status = 'A' then datediff(minute, mindt, maxdt) end) as a_minutes,
sum(case when status = 'B' then datediff(minute, mindt, maxdt) end) as b_minutes,
max(maxvalue) as maxvalue
from islands
group by unitid;
I'll leave it up to you to convert the minutes back to times.
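Because the sample readings are only seconds apart, counting in seconds and formatting afterwards keeps the precision that a minute count would lose. The sketch below runs the same row-number-difference technique on the question's data. It rests on a few assumptions: SQLite via Python's sqlite3 stands in for SQL Server, the timestamp column is named dt, the dates are read as 2017-12-01, and the mmss helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (unitid INTEGER, status TEXT, dt TEXT, value INTEGER)")
sample = [
    ("A", "00:02:10", 10), ("A", "00:02:40", 25), ("A", "00:03:20", 18),
    ("B", "00:03:55", 30), ("B", "00:04:05", 10), ("B", "00:04:30", 20),
    ("B", "00:04:50", 10),
    ("A", "00:05:00", 28), ("A", "00:05:50", 18), ("A", "00:06:20", 18),
]
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(unit, s, "2017-12-01 " + tm, v) for unit in (101, 102) for s, tm, v in sample],
)

rows = conn.execute(
    """
    WITH numbered AS (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY unitid ORDER BY dt) AS seqnum,
               ROW_NUMBER() OVER (PARTITION BY unitid, status ORDER BY dt) AS seqnum_s
        FROM t
    ),
    islands AS (
        -- seqnum - seqnum_s is constant within one unbroken run of a status
        SELECT unitid, status,
               CAST(strftime('%s', MAX(dt)) AS INTEGER)
                 - CAST(strftime('%s', MIN(dt)) AS INTEGER) AS seconds,
               MAX(value) AS maxvalue
        FROM numbered
        GROUP BY unitid, status, seqnum - seqnum_s
    )
    SELECT unitid,
           SUM(CASE WHEN status = 'A' THEN seconds END) AS a_seconds,
           SUM(CASE WHEN status = 'B' THEN seconds END) AS b_seconds,
           MAX(maxvalue) AS maxvalue
    FROM islands
    GROUP BY unitid
    ORDER BY unitid
    """
).fetchall()


def mmss(seconds):
    """Format a number of seconds as mm:ss."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


report = [(u, mmss(a), mmss(b), m) for u, a, b, m in rows]
for line in report:
    print(line)
```

For unit 101 the two A runs last 70 and 80 seconds and the single B run lasts 55 seconds, which gives the 02:30 and 00:55 from the expected output.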

How to get all in and out times for a particular employee?

My table is as below:
id time_stamp Access Type
1001 2017-09-05 09:35:00 IN
1002 2017-09-05 11:00:00 IN
1001 2017-09-05 12:00:00 OUT
1002 2017-09-05 12:25:00 OUT
1001 2017-09-05 13:00:00 IN
1002 2017-09-05 14:00:00 IN
1001 2017-09-05 17:00:00 OUT
1002 2017-09-05 18:00:00 OUT
I have tried this query below:
SELECT ROW_NUMBER() OVER (
ORDER BY A.emp_reader_id ASC
) AS SNo
,B.emp_code
,B.emp_name
,CASE
WHEN F.event_entry_name = 'IN'
THEN A.DT
END AS in_time
,CASE
WHEN F.event_entry_name = 'OUT'
THEN A.DT
END AS out_time
,cast(left(CONVERT(TIME, a.DT), 5) AS VARCHAR) AS 'time'
,isnull(B.areaname, 'OAE6080036073000006') AS areaname
,C.dept_name
,b.emp_reader_id
,isnull(c.dept_name, '') AS group_name
,CONVERT(CHAR(11), '2017/12/30', 103) AS StartDate
,CONVERT(CHAR(11), '2018/01/11', 103) AS ToDate
,0 AS emp_card_no
FROM dbo.trnevents AS A
LEFT OUTER JOIN dbo.employee AS B ON A.emp_reader_id = B.emp_reader_id
LEFT OUTER JOIN dbo.departments AS C ON B.dept_id = C.dept_id
LEFT OUTER JOIN dbo.DevicePersonnelarea AS E ON A.POINTID = E.areaid
LEFT OUTER JOIN dbo.Event_entry AS F ON A.EVENTID = F.event_entry_id
ORDER BY A.emp_reader_id ASC
It works, but the output looks like the rows below. Sometimes there are repeated IN events or repeated OUT events:
SNo emp_code emp_name in_time out_time time areaname dept_name emp_reader_id group_name StartDate ToDate emp_card_no
1 102 Ihsan Titi NULL 2017-12-30 12:16:26.000 12:16 Dubai Sales 102 Sales 2017/12/30 2018/01/11 0
2 102 Ihsan Titi NULL 2017-12-30 12:16:27.000 12:16 Dubai Sales 102 Sales 2017/12/30 2018/01/11 0
3 102 Ihsan Titi 2017-12-30 12:44:26.000 NULL 12:44 Dubai Sales 102 Sales 2017/12/30 2018/01/11 0
4 102 Ihsan Titi 2017-12-30 16:27:48.000 NULL 16:27 Dubai Sales 102 Sales 2017/12/30 2018/01/11 0
Expected output:
SNo emp_code emp_name in_time out_time time areaname dept_name emp_reader_id group_name StartDate ToDate emp_card_no
1 102 Ihsan Titi 2017-12-30 12:16:26.000 2017-12-30 12:44:26.000 12:16 Dubai Sales 102 Sales 2017/12/30 2018/01/11 0
2 102 Ihsan Titi 2017-12-30 12:50:26.000 2017-12-30 16:27:48.000 12:16 Dubai Sales 102 Sales 2017/12/30 2018/01/11 0
Kindly help; I am stuck trying to get output like this.
You can use this:
select A_In.emp_reader_id as empId, A_In.Belongs_to, A_In.DeviceSerialNumber,
A_In.DT as EntryTime,
(
-- earliest OUT event of the same employee later on the same day
select min(A_Out.DT)
from trnevents A_Out
where A_Out.EVENTID like 'OUT'
and A_Out.emp_reader_id = A_In.emp_reader_id
and A_Out.DT > A_In.DT and DATEDIFF(day, A_In.DT, A_Out.DT) = 0
) as ExitTime
from trnevents A_In
where A_In.EVENTID like 'IN'
The way I've approached it below is to say that if an event is the same type as the event before it then treat it as a "rogue".
Rogues always sit on their own, never paired with any other event.
All other events get paired such that IN is the first item and OUT is the second item.
Then I can group everything up to reduce pairs down to single rows.
WITH
rogue_check
AS
(
SELECT
CASE WHEN LAG(F.event_entry_name) OVER (PARTITION BY A.emp_reader_number ORDER BY A.DT) = F.event_entry_name THEN 1 ELSE 0 END AS is_rogue,
*
FROM
trnevents AS A
LEFT JOIN
EVent_entry AS F
ON F.event_entry_id = A.event_id
),
sorted AS
(
SELECT
ROW_NUMBER() OVER ( ORDER BY DT) AS event_sequence_id,
ROW_NUMBER() OVER (PARTITION BY emp_reader_number, is_rogue ORDER BY DT) AS employee_checked_event_sequence_id,
*
FROM
rogue_check
)
SELECT
MIN(event_sequence_id) AS unique_id,
emp_reader_number,
MAX(CASE WHEN event_entry_name = 'IN' THEN DT END) AS time_in,
MAX(CASE WHEN event_entry_name = 'OUT' THEN DT END) AS time_out
FROM
sorted
GROUP BY
emp_reader_number,
is_rogue,
employee_checked_event_sequence_id - CASE WHEN is_rogue = 1 OR event_entry_name = 'IN' THEN 0 ELSE 1 END
ORDER BY
emp_reader_number,
unique_id
;
Example Schema:
CREATE TABLE trnevents (
emp_reader_number INT,
DT DATETIME,
event_id INT
);
CREATE TABLE Event_entry (
event_entry_id INT,
event_entry_name NVARCHAR(32)
);
Example Data:
INSERT INTO Event_entry VALUES (0, N'IN'), (1, N'OUT');
INSERT INTO trnevents VALUES
(1, '2017-01-01 08:00', 0),
(1, '2017-01-01 08:01', 0),
(1, '2017-01-01 12:00', 1),
(1, '2017-01-01 13:00', 0),
(1, '2017-01-01 17:00', 1),
(1, '2017-01-01 17:01', 1)
;
Example Results:
unique_id emp_reader_number time_in time_out
1 1 01/01/2017 08:00:00 01/01/2017 12:00:00
2 1 01/01/2017 08:01:00 null
4 1 01/01/2017 13:00:00 01/01/2017 17:00:00
6 1 null 01/01/2017 17:01:00
The GROUP BY turned out a bit more fiddly than I anticipated on the train and so may cause an expensive SORT in the execution plan for large data sets. I'll also think about an alternative shortly.
Here is a demo with some simple dummy data demonstrating that it works for those cases at least. (Feel free to update it with other cases if they demonstrate any problems)
http://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=d06680d8ed374666760cdc67182aaacb
You can use a PIVOT
select id, [in], out
from
( select
id, time_stamp, accessType,
(ROW_NUMBER() over (partition by id order by time_stamp) -1 )/ 2 rn
from yourtable ) src
pivot
(min(time_stamp) for accessType in ([in],[out])) p
This assumes that each "in" is followed by an "out" and uses row_number to group those pairs of times.
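The pairing trick can be checked on the question's sample rows. The sketch below keeps the (row_number - 1) / 2 grouping but swaps PIVOT for conditional aggregation, because it runs on SQLite (through Python's sqlite3) rather than SQL Server and SQLite has no PIVOT. The name pair_no is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER, time_stamp TEXT, accessType TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (1001, "2017-09-05 09:35:00", "IN"),
        (1002, "2017-09-05 11:00:00", "IN"),
        (1001, "2017-09-05 12:00:00", "OUT"),
        (1002, "2017-09-05 12:25:00", "OUT"),
        (1001, "2017-09-05 13:00:00", "IN"),
        (1002, "2017-09-05 14:00:00", "IN"),
        (1001, "2017-09-05 17:00:00", "OUT"),
        (1002, "2017-09-05 18:00:00", "OUT"),
    ],
)

pairs = conn.execute(
    """
    SELECT id,
           MIN(CASE WHEN accessType = 'IN'  THEN time_stamp END) AS in_time,
           MIN(CASE WHEN accessType = 'OUT' THEN time_stamp END) AS out_time
    FROM (
        -- integer division maps rows 1,2 -> 0, rows 3,4 -> 1, and so on
        SELECT id, time_stamp, accessType,
               (ROW_NUMBER() OVER (PARTITION BY id ORDER BY time_stamp) - 1) / 2 AS pair_no
        FROM yourtable
    ) AS src
    GROUP BY id, pair_no
    ORDER BY id, pair_no
    """
).fetchall()

for row in pairs:
    print(row)
```

If an employee has two IN events in a row, the pairs shift and the result is wrong, which is exactly the assumption stated above.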

How to get data with a must different conditions?

I have tables with the following structure:
USERS :
USERID,NAME
TIMETB :
CHECKTIME,Sysdate,modified,USERID
and sample data like this:
USERS :
USERID NAME
434 moh
77 john
66 yara
TIMETB :
CHECKTIME USERID modified
2015-12-21 07:20:00.000 434 0
2015-12-21 08:39:00.000 434 2
2015-12-22 07:31:00.000 434 0
2015-12-21 06:55:00.000 77 0
2015-12-21 07:39:00.000 77 0
2015-12-25 07:11:00.000 66 0
2015-12-25 07:22:00.000 66 0
2015-12-25 07:50:00.000 66 2
2015-12-26 07:40:00.000 66 2
2015-12-26 07:21:00.000 66 2
Now I want to get the users who have two or more different transaction types (modified) on the same date.
The result I expect is:
CHECKTIME USERID modified NAME
2015-12-21 07:20:00.000 434 0 moh
2015-12-21 08:39:00.000 434 2 moh
2015-12-25 07:11:00.000 66 0 yara
2015-12-25 07:22:00.000 66 0 yara
2015-12-25 07:50:00.000 66 2 yara
I wrote the following query, but it returns more than I expect: it also returns users whose transactions on a date all have the same modified value.
SELECT a.CHECKTIME,
a.Sysdate,
(CASE WHEN a.modified = 0 THEN 'ADD' ELSE 'DELETE' END) AS modified,
b.BADGENUMBER,
b.name,
a.Emp_num AS Creator
FROM TIMETB a
INNER JOIN Users b ON a.USERID = b.USERID
WHERE YEAR(checktime) = 2015
AND MONTH(checktime) = 12
AND (
SELECT COUNT(*)
FROM TIMETB cc
WHERE cc.USERID = a.USERID
AND CONVERT(DATE, cc.CHECKTIME) = CONVERT(DATE, a.CHECKTIME)
AND cc.modified IN (0, 2)
) >= 2
AND a.modified IS NOT NULL
AND a.Emp_num IS NOT NULL
You can use window functions for this:
select t.*
from (select t.*,
count(*) over (partition by userid, cast(checktime as date)) as cnt
from timetb t
) t
where cnt >= 2;
If you want the name, just join in the appropriate table.
EDIT:
If you want different values of a column, a simple way is to compare the min and max values:
select t.*
from (select t.*,
min(modified) over (partition by userid, cast(checktime as date)) as minm,
max(modified) over (partition by userid, cast(checktime as date)) as maxm
from timetb t
) t
where minm <> maxm;
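The min/max comparison can be run against the sample data from the question. This is a sketch only: SQLite via Python's sqlite3 stands in for SQL Server, and SQLite's date() replaces CAST(... AS DATE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE timetb (checktime TEXT, userid INTEGER, modified INTEGER)")
conn.executemany(
    "INSERT INTO timetb VALUES (?, ?, ?)",
    [
        ("2015-12-21 07:20:00.000", 434, 0),
        ("2015-12-21 08:39:00.000", 434, 2),
        ("2015-12-22 07:31:00.000", 434, 0),
        ("2015-12-21 06:55:00.000", 77, 0),
        ("2015-12-21 07:39:00.000", 77, 0),
        ("2015-12-25 07:11:00.000", 66, 0),
        ("2015-12-25 07:22:00.000", 66, 0),
        ("2015-12-25 07:50:00.000", 66, 2),
        ("2015-12-26 07:40:00.000", 66, 2),
        ("2015-12-26 07:21:00.000", 66, 2),
    ],
)

mixed = conn.execute(
    """
    SELECT checktime, userid, modified
    FROM (
        SELECT t.*,
               MIN(modified) OVER (PARTITION BY userid, date(checktime)) AS minm,
               MAX(modified) OVER (PARTITION BY userid, date(checktime)) AS maxm
        FROM timetb AS t
    ) AS flagged
    WHERE minm <> maxm      -- at least two distinct modified values that day
    ORDER BY userid DESC, checktime
    """
).fetchall()

for row in mixed:
    print(row)
```

User 77 and user 66's rows on 2015-12-26 are excluded because all of their rows on those days share one modified value, which is what the plain count(*) >= 2 version cannot detect.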