SQL: First Occurrence and with value - sql

I have rarely used SQL, but it may be the most suitable tool for this task. I am looking to create a query that detects the first occurrence of an incident for each subject.
Record:
------------------------------
personID | date | incident
------------------------------
1 20150901 F1
2 20150101 B2
3 20150301 C3
1 20150901 B2
3 20150401 R5
2 20150401 C3
1 20150701 F1
Wanted Result:
------------------------------
personID | date | incident
------------------------------
2 20150101 B2
3 20150301 C3
3 20150401 R5
2 20150401 C3
1 20150701 F1
1 20150901 B2
Simply put: I am looking for the first time (based on date) that each incident occurs for each personID, ignoring any recurrences.
Thanks
PS. Using SQL Server 2008

Using MIN should work for this:
select personId,incident,MIN(convert(date,date)) as date
from [table]
group by personId,incident

Try this:
;WITH
temp AS (
SELECT personID, date, incident,
rn = ROW_NUMBER() OVER (PARTITION BY personID, incident ORDER BY date)
FROM my_table
)
SELECT *
FROM temp
WHERE rn = 1

You can use the row_number function to get the desired result.
Fiddle with sample data
select personid, date, incident
from
(
select *, row_number() over(partition by personid, incident order by date) as rn
from tablename
) x
where x.rn = 1;
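The two approaches above (MIN with GROUP BY, and ROW_NUMBER with rn = 1) can be checked against the sample rows from the question. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server 2008, so the dialect differs slightly (window functions need SQLite 3.25 or later); the table name incidents and the alias first_date are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
# "incidents" is a hypothetical table name; dates are kept as YYYYMMDD text,
# which sorts correctly as a string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidents (personID INTEGER, date TEXT, incident TEXT)")
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?)",
    [
        (1, "20150901", "F1"),
        (2, "20150101", "B2"),
        (3, "20150301", "C3"),
        (1, "20150901", "B2"),
        (3, "20150401", "R5"),
        (2, "20150401", "C3"),
        (1, "20150701", "F1"),
    ],
)

# Approach 1: GROUP BY + MIN gives the earliest date per (personID, incident).
min_rows = conn.execute(
    """
    SELECT personID, MIN(date) AS first_date, incident
    FROM incidents
    GROUP BY personID, incident
    ORDER BY first_date, personID
    """
).fetchall()

# Approach 2: number each pair's rows by date and keep only the first one.
rn_rows = conn.execute(
    """
    SELECT personID, date, incident
    FROM (
        SELECT personID, date, incident,
               ROW_NUMBER() OVER (PARTITION BY personID, incident ORDER BY date) AS rn
        FROM incidents
    ) AS x
    WHERE rn = 1
    ORDER BY date, personID
    """
).fetchall()

for row in rn_rows:
    print(row)
```

Both queries return the same six rows here. The ROW_NUMBER form is the one to reach for when other columns from the first row are needed as well, since MIN only carries the grouped columns and the date.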

Related

First value in DATE minus 30 days SQL

I have a bunch of data from which I'm showing the ID, the max date, and its corresponding values (user id, type, ...). I then need to take the MAX date for each ID, subtract 30 days, and show the first date within that period along with its corresponding values.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but because the WHERE part is quite large, the refresh takes ages.
The subselect looks like this:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How would you write the select?
I'm using T-SQL.
Thanks
I can do this with 2 CTEs to build up the dates and names.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 numbers each ID's rows with a ROW_NUMBER() window function ordered by date descending, so rn = 1 marks the most recent row (max_date and max_name), and it also computes the date 30 days before each row. cte2 joins back to that list to get all rows dated within 30 days of cte1's max date, and numbers them by date ascending so that rn = 1 marks the earliest row in that window. The outer query then joins the two results, keeping only the most recent row from cte1 and the earliest row from cte2.
I'm not sure how well it will scale with your data, but the CTEs should optimize reasonably well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
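As a rough cross-check of the two-step idea (find the latest row per ID, then the earliest row inside its 30-day window), here is a minimal sketch run through Python's sqlite3 module. SQLite is only a stand-in for SQL Server: date(x, '-30 days') replaces DATEADD, and the CTE names latest and in_window are assumptions, not names from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (ID INTEGER, theDate TEXT, theName TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        (1, "2018-05-01", "AAA"),
        (1, "2018-04-21", "CCC"),
        (1, "2018-04-05", "BBB"),
        (1, "2018-03-27", "AAA"),
        (2, "2018-05-02", "AAA"),
        (2, "2018-05-21", "CCC"),
        (2, "2018-03-03", "BBB"),
        (2, "2018-01-20", "AAA"),
    ],
)

rows = conn.execute(
    """
    WITH latest AS (
        -- rn = 1 is the most recent row for each ID
        SELECT ID, theDate, theName,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY theDate DESC) AS rn
        FROM t1
    ),
    in_window AS (
        -- rows within 30 days of the most recent date; rn = 1 is the earliest
        SELECT t.ID, t.theDate, t.theName,
               ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY t.theDate) AS rn
        FROM t1 AS t
        JOIN latest AS l ON l.ID = t.ID AND l.rn = 1
        WHERE t.theDate >= date(l.theDate, '-30 days')
    )
    SELECT l.ID, l.theDate, l.theName, w.theDate, w.theName
    FROM latest AS l
    JOIN in_window AS w ON w.ID = l.ID AND w.rn = 1
    WHERE l.rn = 1
    ORDER BY l.ID
    """
).fetchall()

for row in rows:
    print(row)
```

Like the answer's query, this window includes the max date itself, so an ID with no other row in the last 30 days would report its max row as the "previous" row too. Adding a strict t.theDate < l.theDate condition would exclude it, at the cost of dropping such IDs from the inner join.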
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

How to group subtotals on the same row by date, by code

I couldn't find an equivalent question on here. Apologies if this is a repeat.
Basically I have a table with transactions. Each transaction has a code and a datetime stamp. I want to be able to create a SQL query so that the results look something like this:
+------------+--------+--------+-------+--------+-------+--------+
| DATE | CODE1 | COUNT1 | CODE2 | COUNT2 | CODE3 | COUNT3 |
+------------+--------+--------+-------+--------+-------+--------+
| 2017-01-01 | George | 12 | John | 10 | Ringo | 114 |
+------------+--------+--------+-------+--------+-------+--------+
I currently have a query that pulls the subtotals on individual lines, i.e.:
SELECT CONVERT(DATE, mytime), code, COUNT(*) FROM transactiontable
GROUP BY CONVERT(DATE, mytime), code
ORDER BY CONVERT(DATE, mytime), code
Would give me
DATE CODE COUNT
-----------------------------------
2017-01-01 George 12
2017-01-01 John 10
etc ...
I don't currently have a separate table for the codes, but I am considering it.
Thanks !
You can also use PIVOT for this.
DECLARE @Table TABLE (DATE DATETIME, CODE VARCHAR(10), [COUNT] INT)
INSERT INTO @Table
VALUES
('2017-01-01','George',12),
('2017-01-01','John',10)
;WITH CTE AS
(
SELECT RN = ROW_NUMBER() OVER (ORDER BY DATE), * FROM @Table
)
)
SELECT * FROM
(SELECT DATE, CONCAT('CODE',RN) RN, CODE Value FROM CTE
UNION ALL
SELECT DATE, CONCAT('COUNT',RN) RN, CONVERT(VARCHAR,[COUNT]) Value FROM CTE
) SRC
PIVOT (MAX(Value) FOR RN IN ([CODE1],[COUNT1],[CODE2],[COUNT2])) PVT
Result:
DATE CODE1 COUNT1 CODE2 COUNT2
----------- ----------- ----------- -------- -------
2017-01-01 George 12 John 10
You can use window function row_number to form groups and use conditional aggregation to pivot:
select dt,
max(case when rn = 1 then code end) as code_1,
max(case when rn = 1 then cnt end) as count_1,
max(case when rn = 2 then code end) as code_2,
max(case when rn = 2 then cnt end) as count_2,
max(case when rn = 3 then code end) as code_3,
max(case when rn = 3 then cnt end) as count_3,
....
from (
select convert(date, mytime) as dt,
code,
count(*) as cnt,
row_number() over (partition by convert(date, mytime) order by code) as rn
from transactiontable
group by convert(date, mytime), code
) t
group by dt
order by dt;
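To see the conditional-aggregation pivot produce one row per date, here is a small sketch executed through Python's sqlite3 module. It is a stand-in for SQL Server, so date(mytime) replaces convert(date, mytime); the sample transactions and the count_1 to count_3 aliases are made up for illustration, and the per-day counts are computed in a separate inner subquery (here named daily) to keep the steps easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactiontable (mytime TEXT, code TEXT)")
# Hypothetical sample data: 3 George, 2 John, 1 Ringo on day one; 1 John on day two.
conn.executemany(
    "INSERT INTO transactiontable VALUES (?, ?)",
    [
        ("2017-01-01 09:00:00", "George"),
        ("2017-01-01 10:30:00", "George"),
        ("2017-01-01 11:45:00", "George"),
        ("2017-01-01 12:00:00", "John"),
        ("2017-01-01 13:15:00", "John"),
        ("2017-01-01 18:00:00", "Ringo"),
        ("2017-01-02 08:00:00", "John"),
    ],
)

pivoted = conn.execute(
    """
    SELECT dt,
           MAX(CASE WHEN rn = 1 THEN code END) AS code_1,
           MAX(CASE WHEN rn = 1 THEN cnt  END) AS count_1,
           MAX(CASE WHEN rn = 2 THEN code END) AS code_2,
           MAX(CASE WHEN rn = 2 THEN cnt  END) AS count_2,
           MAX(CASE WHEN rn = 3 THEN code END) AS code_3,
           MAX(CASE WHEN rn = 3 THEN cnt  END) AS count_3
    FROM (
        -- number the codes within each day so each one gets a fixed slot
        SELECT dt, code, cnt,
               ROW_NUMBER() OVER (PARTITION BY dt ORDER BY code) AS rn
        FROM (
            -- one row per (day, code) with its transaction count
            SELECT date(mytime) AS dt, code, COUNT(*) AS cnt
            FROM transactiontable
            GROUP BY date(mytime), code
        ) AS daily
    ) AS t
    GROUP BY dt
    ORDER BY dt
    """
).fetchall()

for row in pivoted:
    print(row)
```

Slots with no code for a given day come back as NULL (None in Python), and the number of code/count column pairs is fixed in the query text, so a day with more than three codes would silently lose the extras.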

select top N records for each entity

I have a table like below -
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
2 | 2016-03-08 09:08:32.827 | 1
3 | 2016-03-08 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
5 | 2016-03-05 09:08:32.827 | 2
Now, I want the top 1 row based on the date column for each Device_ID.
Expected Output
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
I am using SQL Server 2008 R2. I could write a stored procedure to handle it, but I wanted to do it with a simple query.
****************EDIT**************************
The answer by 'Felix Pamittan' worked well; for the top 'N', just change it to:
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC)
FROM tbl
)t
WHERE Rn <= N
He had mentioned this in a comment; I thought I'd add it to the question so that nobody misses it.
Use ROW_NUMBER:
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC)
FROM tbl
)t
WHERE Rn = 1
You can also try using a CTE:
With DeviceCTE AS
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC) AS Num
FROM tblname)
SELECT Id, [Reported Date], Device_ID
From DeviceCTE
Where Num = 1
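The ROW_NUMBER pattern generalizes from the top 1 to the top N by filtering on rn <= N. The sketch below runs it on the question's sample rows through Python's sqlite3 module as a stand-in for SQL Server; the column name reported_date (instead of [Reported Date]), the helper top_n_per_device, and the tie-break on ID are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# reported_date stands in for the [Reported Date] column of the question.
conn.execute("CREATE TABLE tbl (ID INTEGER, reported_date TEXT, Device_ID INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "2016-03-09 09:08:32.827", 1),
        (2, "2016-03-08 09:08:32.827", 1),
        (3, "2016-03-08 09:08:32.827", 1),
        (4, "2016-03-10 09:08:32.827", 2),
        (5, "2016-03-05 09:08:32.827", 2),
    ],
)


def top_n_per_device(n):
    """Return the n most recent rows for each Device_ID (hypothetical helper)."""
    return conn.execute(
        """
        SELECT ID, reported_date, Device_ID
        FROM (
            SELECT ID, reported_date, Device_ID,
                   ROW_NUMBER() OVER (
                       PARTITION BY Device_ID
                       ORDER BY reported_date DESC, ID  -- ID breaks ties on equal dates
                   ) AS rn
            FROM tbl
        ) AS t
        WHERE rn <= ?
        ORDER BY Device_ID, rn
        """,
        (n,),
    ).fetchall()


for row in top_n_per_device(2):
    print(row)
```

Rows 2 and 3 share the same timestamp, so without the extra ID sort key the choice between them for the second slot would be arbitrary.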
If you can't use an analytic function, e.g. because your application layer won't allow it, then you can try the following solution which uses a subquery to arrive at the answer:
SELECT t1.ID, t2.maxDate, t1.Device_ID
FROM yourTable t1
INNER JOIN
(
SELECT Device_ID, MAX([Reported Date]) AS maxDate
FROM yourTable
GROUP BY Device_ID
) t2
ON t1.Device_ID = t2.Device_ID AND t1.[Reported Date] = t2.maxDate
Select * from DEVICE_TABLE D
where [Reported Date] = (Select Max([Reported Date]) from DEVICE_TABLE where Device_ID = D.Device_ID)
This should do the trick, assuming that "top 1 row based on date column" means you want the latest reported date for each Device_ID.
As for your title, to select the top 5 rows for each Device_ID:
Select * from DEVICE_TABLE D
where [Reported Date] in (Select top 5 [Reported Date] from DEVICE_TABLE D2 where D2.Device_ID = D.Device_ID order by [Reported Date] desc)
order by Device_ID, [Reported Date] desc
This will give you the 5 latest reports for each device ID.
The ORDER BY inside the subquery matters: without it, TOP 5 returns an arbitrary 5 dates rather than the latest ones.
Again, without analytic functions, you can use CROSS APPLY:
DECLARE @tbl TABLE(Id INT,[Reported Date] DateTime , Device_ID INT)
INSERT INTO @tbl
VALUES
(1,'2016-03-09 09:08:32.827',1),
(2,'2016-03-08 09:08:32.827',1),
(3,'2016-03-08 09:08:32.827',1),
(4,'2016-03-10 09:08:32.827',2),
(5,'2016-03-05 09:08:32.827',2)
SELECT r.*
FROM ( SELECT DISTINCT Device_ID FROM @tbl ) d
CROSS APPLY ( SELECT TOP 1 *
FROM @tbl t
WHERE d.Device_ID = t.Device_ID
ORDER BY t.[Reported Date] DESC ) r
Can be easily modified to support N records.
Credits go to wBob answering this question here

Double increment where 2nd increment reflects 1st in sql for encounter data

I am building healthcare 837 encounters and need to set increments on the HL segments.
C1 based on what is set on Criteria1 and C2 based on Criteria2.
C2 will never have the same number as C1 and vice versa.
C1 I was able to pull using row_number() over(order by (select Criteria1))
It's the C2 I am having a problem with.
C1 | C2 | Criteria1 | Criteria2
1 | 2 | ID1 | NID1
1 | 3 | ID1 | NID2
1 | 4 | ID1 | NID3
5 | 6 | ID2 | NID4
5 | 7 | ID2 | NID5
5 | 8 | ID2 | NID6
9 |10 | ID3 | NID7
Simplified query:
SELECT cm.Criteria1, cm.Criteria2, cj.C1
FROM [dbo].[TBL1] cm
JOIN (
SELECT cm.Criteria1,
row_number() over(order by (select Criteria1)) as C1
FROM [dbo].[TBL1] cm
GROUP BY cm.Criteria1) cj on cj.Criteria1 = cm.Criteria1
GROUP BY cm.Criteria1, cm.Criteria2, cj.C1 Order by cj.C1
This seems to work but I didn't check many edge cases (fun with windowing!):
with tbl1 as (
select 'ID1' as Criteria1, 'NID1' as Criteria2
union
select 'ID1', 'NID2'
union
select 'ID2', 'NID4'
union
select 'ID2', 'NID5'
union
select 'ID3', 'NID7'
)
select
rank() over (order by Criteria1) + DENSE_ranK() OVER (ORDER BY CRITERIA1) - 1 as C1,
rank() over (order by Criteria1) + row_number() over (partition by Criteria1 order by Criteria2) + DENSE_ranK() OVER (ORDER BY CRITERIA1) - 1 as C2,
Criteria1,
Criteria2
from
tbl1
To break it down a little:
Let's call each set of Criteria1 rows a "partition" as in SQL parlance.
The requirement is thus:
C1 is always equal to the number of rows in all the previous partitions + 1 for the current partition, plus the number of previous partitions.
C2 is always equal to the number of rows in all the previous partitions + 1 for the current partition, plus the number of previous partitions, plus the number of all the previous rows within the partition + 1 for the current row.
RANK() over (order by Criteria1) gives you the number of rows in all the previous partitions + 1.
DENSE_RANK() over (order by Criteria1) - 1 gives you the number of previous partitions.
ROW_NUMBER() over (partition by Criteria1 order by Criteria2) gives you the number of previous rows within the partition + 1 for the current row.
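The breakdown can be verified against the seven rows of the question's table. This is a sketch only, run through Python's sqlite3 module rather than SQL Server; the intermediate column names r, d and p are assumptions introduced to name the three building blocks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (Criteria1 TEXT, Criteria2 TEXT)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?)",
    [
        ("ID1", "NID1"), ("ID1", "NID2"), ("ID1", "NID3"),
        ("ID2", "NID4"), ("ID2", "NID5"), ("ID2", "NID6"),
        ("ID3", "NID7"),
    ],
)

numbered = conn.execute(
    """
    SELECT r + d     AS C1,   -- start of the partition's number range
           r + d + p AS C2,   -- C1 plus the row's position in its partition
           Criteria1, Criteria2
    FROM (
        SELECT Criteria1, Criteria2,
               -- rows in all previous partitions, + 1
               RANK() OVER (ORDER BY Criteria1) AS r,
               -- number of previous partitions
               DENSE_RANK() OVER (ORDER BY Criteria1) - 1 AS d,
               -- position within the partition (previous rows + 1)
               ROW_NUMBER() OVER (PARTITION BY Criteria1 ORDER BY Criteria2) AS p
        FROM tbl1
    ) AS parts
    ORDER BY Criteria1, Criteria2
    """
).fetchall()

for row in numbered:
    print(row)
```

For ID2, for example, r is 4 (three ID1 rows plus one), d is 1 and p runs from 1 to 3, which gives C1 = 5 and C2 = 6, 7, 8 as in the question's table.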
It is not really clear what exactly you are trying to get, but it seems the query below is what you are looking for:
with counts as (select count(distinct cm.criteria1) c1, count(distinct cm.criteria2) c2 from dbo.tbl1 cm)
select cj1.c1, cj2.c2, cm.criteria1, cm.criteria2
from dbo.tbl1 cm
inner join (
select cm1.criteria1,
row_number() over ( order by cm1.criteria1)as c1
from dbo.tbl1 cm1 group by cm1.criteria1) cj1
on cj1.criteria1 = cm.criteria1
inner join (
select cm2.criteria2,
(select counts.c1 from counts) + row_number() over ( order by cm2.criteria2) as c2
from dbo.tbl1 cm2 group by cm2.criteria2) cj2
on cj2.criteria2 = cm.criteria2
group by cm.criteria1, cm.criteria2, cj1.c1, cj2.c2
order by cj1.c1, cj2.c2

How to count most consecutive occurrences of a value in a Column in SQL Server

I have a table Attendance in my database.
Date | Present
------------------------
20/11/2013 | Y
21/11/2013 | Y
22/11/2013 | N
23/11/2013 | Y
24/11/2013 | Y
25/11/2013 | Y
26/11/2013 | Y
27/11/2013 | N
28/11/2013 | Y
I want to count the most consecutive occurrence of a value Y or N.
For example in the above table Y occurs 2, 4 & 1 times. So I want 4 as my result.
How to achieve this in SQL Server?
Any help will be appreciated.
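Try this. For a run of consecutive dates with the same Present value, the date minus its row number (numbered within that value) stays constant, so that difference can be used as a group key: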
Select max(Sequence)
from
(
select present ,count(*) as Sequence,
min(date) as MinDt, max(date) as MaxDt
from (
select t.Present,t.Date,
dateadd(day,
-(row_number() over (partition by present order by date))
,date
) as grp
from Table1 t
) t
group by present, grp
)a
where Present ='Y'
SQL FIDDLE
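The grouping trick can be tried on the question's data with the sketch below, which uses Python's sqlite3 module as a stand-in for SQL Server. julianday(date) minus the row number plays the role of the dateadd expression; the table name attendance, the ISO date format and the alias streak are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dates are stored in ISO format (YYYY-MM-DD) so that SQLite can parse them.
conn.execute("CREATE TABLE attendance (date TEXT, present TEXT)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [
        ("2013-11-20", "Y"), ("2013-11-21", "Y"), ("2013-11-22", "N"),
        ("2013-11-23", "Y"), ("2013-11-24", "Y"), ("2013-11-25", "Y"),
        ("2013-11-26", "Y"), ("2013-11-27", "N"), ("2013-11-28", "Y"),
    ],
)

streaks = conn.execute(
    """
    SELECT present, MAX(streak) AS longest
    FROM (
        -- one row per run of consecutive days with the same value
        SELECT present, COUNT(*) AS streak
        FROM (
            -- day number minus row number is constant within a run
            SELECT present,
                   julianday(date)
                   - ROW_NUMBER() OVER (PARTITION BY present ORDER BY date) AS grp
            FROM attendance
        ) AS keyed
        GROUP BY present, grp
    ) AS runs
    GROUP BY present
    ORDER BY present
    """
).fetchall()

print(streaks)
```

Because the key is built from calendar dates, a missing day ends a run even when the rows on either side have the same value; whether that is wanted depends on how gaps in the attendance data should be treated.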
You can do this with a recursive CTE:
;WITH cte AS (SELECT Date,Present,ROW_NUMBER() OVER(ORDER BY Date) RN
FROM Table1)
,cte2 AS (SELECT Date,Present,RN,ct = 1
FROM cte
WHERE RN = 1
UNION ALL
SELECT a.Date,a.Present,a.RN,ct = CASE WHEN a.Present = b.Present THEN ct + 1 ELSE 1 END
FROM cte a
JOIN cte2 b
ON a.RN = b.RN+1)
SELECT TOP 1 *
FROM cte2
ORDER BY CT DESC
Demo: SQL Fiddle
Note: the dates in the demo were altered because of the format in which you posted them in your question.
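The recursive CTE walks the rows in date order and keeps a counter that grows while Present repeats and resets to 1 when it changes. The same running-counter logic can be written as a plain loop, shown here as a Python sketch for cross-checking; the function name longest_run and the ISO date strings are assumptions, not part of the answer.

```python
# Sample rows from the question, with dates rewritten in ISO format
# so that sorting the tuples orders them chronologically.
rows = [
    ("2013-11-20", "Y"), ("2013-11-21", "Y"), ("2013-11-22", "N"),
    ("2013-11-23", "Y"), ("2013-11-24", "Y"), ("2013-11-25", "Y"),
    ("2013-11-26", "Y"), ("2013-11-27", "N"), ("2013-11-28", "Y"),
]


def longest_run(rows):
    """Return (count, date, present) for the row where the running count peaks."""
    best = None
    count = 0
    previous = None
    for date, present in sorted(rows):
        # Same rule as the CASE expression: extend the run or restart at 1.
        count = count + 1 if present == previous else 1
        previous = present
        if best is None or count > best[0]:
            best = (count, date, present)
    return best


print(longest_run(rows))
```

Unlike the date-difference approach, this counts consecutive rows rather than consecutive calendar days, so a missing date does not break a run. In SQL Server the recursive version is also subject to the default recursion limit of 100 levels unless OPTION (MAXRECURSION n) is set.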