Return Results Of Consecutive Days Per User - sql

I have a table such as the following:
+----+----------------+-------------------------+
| id | employeeNumber | transactionTime         |
+----+----------------+-------------------------+
| 1  | 1234           | 2016-02-23 15:11:00.000 |
+----+----------------+-------------------------+
| 2  | 1234           | 2016-02-22 11:01:00.000 |
+----+----------------+-------------------------+
| 3  | 1235           | 2016-02-22 07:22:00.000 |
+----+----------------+-------------------------+
| 4  | 1236           | 2016-02-20 09:16:00.000 |
+----+----------------+-------------------------+
| 5  | 1236           | 2016-02-19 11:01:00.000 |
+----+----------------+-------------------------+
| 6  | 1236           | 2016-02-18 11:44:00.000 |
+----+----------------+-------------------------+
| 7  | 1236           | 2016-02-17 12:12:00.000 |
+----+----------------+-------------------------+
| 8  | 1236           | 2016-02-16 11:09:00.000 |
+----+----------------+-------------------------+
| 9  | 1236           | 2016-02-15 11:19:00.000 |
+----+----------------+-------------------------+
| 10 | 1236           | 2016-02-14 09:12:00.000 |
+----+----------------+-------------------------+
I need to find a way to return the number of consecutive days on which each employee logged a transaction over the past two weeks, like this:
+------+----------------+-------------------------+-------------------------+
| days | employeeNumber | startTime               | endTime                 |
+------+----------------+-------------------------+-------------------------+
| 2    | 1234           | 2016-02-22 11:01:00.000 | 2016-02-23 15:11:00.000 |
+------+----------------+-------------------------+-------------------------+
| 1    | 1235           | 2016-02-22 07:22:00.000 | 2016-02-22 07:22:00.000 |
+------+----------------+-------------------------+-------------------------+
| 7    | 1236           | 2016-02-14 09:12:00.000 | 2016-02-20 09:16:00.000 |
+------+----------------+-------------------------+-------------------------+
I have been working with the following query, but it only returns a single user and doesn't restrict the results to the past two weeks:
WITH dates(date) AS (
    SELECT DISTINCT CAST(transactionTime AS DATE)
    FROM Fuel.dbo.comdata
    WHERE employeeNumber = 123456
),
groups AS (
    SELECT ROW_NUMBER() OVER (ORDER BY date) AS rn,
           DATEADD(DAY, -ROW_NUMBER() OVER (ORDER BY date), date) AS grp,
           date
    FROM dates
)
SELECT COUNT(*) AS consecutiveDates,
       MIN(date) AS minDate,
       MAX(date) AS maxDate
FROM groups
GROUP BY grp
ORDER BY 1 DESC, 2 DESC;
Any help is appreciated.
UPDATE
I have found the following query very helpful, thanks to Gordon Linoff's answer below. However, I notice that the min/max dates don't match up with the number of consecutive days, as shown here with live data:
SELECT *
FROM (
    SELECT employeeNumber, COUNT(*) AS consecutiveDays,
           MIN(transactionTime) AS startTime, MAX(transactionTime) AS endTime
    FROM (
        SELECT cd.*,
               DATEADD(DAY,
                       -DENSE_RANK() OVER (PARTITION BY employeeNumber
                                           ORDER BY transactionTime),
                       CAST(transactionTime AS DATE)) AS grp
        FROM Fuel.dbo.comdata cd
        WHERE transactionTime >= DATEADD(DAY, -14, GETDATE())
    ) cd
    GROUP BY employeeNumber, grp
) AS tbl1
WHERE consecutiveDays >= 7;
+--------+------+-------------------------+-------------------------+
| empNum | days | startTime               | endTime                 |
+--------+------+-------------------------+-------------------------+
| 16742  | 7    | 2016-04-28 17:00:00.000 | 2016-05-07 17:04:00.000 |
+--------+------+-------------------------+-------------------------+
| 15056  | 8    | 2016-04-27 13:03:00.000 | 2016-05-08 09:51:00.000 |
+--------+------+-------------------------+-------------------------+
As you can see the number of consecutive days does not match the start/end time. Any ideas?

I would do this using the difference-of-row-numbers approach (assuming there is never more than one record per day per employee):
select employeeNumber, count(*) as numdays,
       min(transactionTime) as startTime, max(transactionTime) as endTime
from (select cd.*,
             dateadd(day,
                     - row_number() over (partition by employeeNumber order by transactionTime),
                     cast(transactionTime as date)
                    ) as grp
      from Fuel.dbo.comdata cd
     ) cd
group by employeeNumber, grp;
The idea is to generate a sequence of numbers for each employee ordered by transactionTime. The difference between the transaction date and that sequence stays constant as long as the transactions fall on consecutive days, so grouping by the difference isolates each run.
If you can have multiple transactions on the same day, use dense_rank() over the transaction date instead, so duplicates on the same day share a rank:
select employeeNumber, count(*) as numdays,
       min(transactionTime) as startTime, max(transactionTime) as endTime
from (select cd.*,
             dateadd(day,
                     - dense_rank() over (partition by employeeNumber
                                          order by cast(transactionTime as date)),
                     cast(transactionTime as date)
                    ) as grp
      from Fuel.dbo.comdata cd
     ) cd
group by employeeNumber, grp;
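To make the grouping trick concrete outside the database, here is a minimal pure-Python sketch of the same dense-rank-minus-date idea, run against the sample rows from the question. The function name `consecutive_runs` is ours, not part of the original query:

```python
from datetime import datetime
from itertools import groupby

# In-memory stand-in for Fuel.dbo.comdata: (employeeNumber, transactionTime)
rows = [
    (1234, datetime(2016, 2, 23, 15, 11)),
    (1234, datetime(2016, 2, 22, 11, 1)),
    (1235, datetime(2016, 2, 22, 7, 22)),
    (1236, datetime(2016, 2, 20, 9, 16)),
    (1236, datetime(2016, 2, 19, 11, 1)),
    (1236, datetime(2016, 2, 18, 11, 44)),
    (1236, datetime(2016, 2, 17, 12, 12)),
    (1236, datetime(2016, 2, 16, 11, 9)),
    (1236, datetime(2016, 2, 15, 11, 19)),
    (1236, datetime(2016, 2, 14, 9, 12)),
]

def consecutive_runs(rows):
    """Group each employee's transactions into runs of consecutive days.

    Mirrors the SQL trick: dense-rank each distinct day and subtract the
    rank from the day; the difference is constant within a run.
    """
    out = []
    for emp, grp in groupby(sorted(rows), key=lambda r: r[0]):
        times = [t for _, t in grp]
        days = sorted({t.date() for t in times})
        runs = {}
        for i, d in enumerate(days):
            # (day ordinal) - (0-based dense rank) is constant per run
            runs.setdefault(d.toordinal() - i, []).append(d)
        for run_days in runs.values():
            run_set = set(run_days)
            in_run = [t for t in times if t.date() in run_set]
            out.append((emp, len(run_days), min(in_run), max(in_run)))
    return sorted(out)

for emp, ndays, start, end in consecutive_runs(rows):
    print(emp, ndays, start, end)
```

Running this reproduces the expected table: 2 days for 1234, 1 for 1235, and 7 for 1236.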

Related

Break a date range into hours per day for each job

Yesterday I had asked for an efficient way to break a date range into hours per day and received an answer at the following link...
Is there an efficient way to break a date range into hours per day?
Now I need to go a step further and generate the same thing for each job in a list. I have a table with the following sample information...
+-------+-------------------------+-------------------------+
| JobID | StartDate               | EndDate                 |
+-------+-------------------------+-------------------------+
| 1     | 2015-01-27 07:32:35.000 | 2015-01-28 14:39:35.000 |
| 2     | 2015-01-27 07:32:35.000 | 2015-01-29 16:39:35.000 |
| 3     | 2015-03-02 09:46:25.000 | 2015-03-05 17:24:15.000 |
+-------+-------------------------+-------------------------+
And I need to get a list like the following...
+-------+------------+-------+
| JobID | Date       | Hours |
+-------+------------+-------+
| 1     | 2015-01-27 | 16.47 |
| 1     | 2015-01-28 | 14.65 |
| 2     | 2015-01-27 | 16.47 |
| 2     | 2015-01-28 | 24.00 |
| 2     | 2015-01-29 | 16.65 |
| 3     | 2015-03-02 | 14.23 |
| 3     | 2015-03-03 | 24.00 |
| 3     | 2015-03-04 | 24.00 |
| 3     | 2015-03-05 | 17.40 |
+-------+------------+-------+
Can the recursive CTE (from the link I included) be modified to include a JobID?
Thanks,
Carl
Here is what I came up with for a solution...
DECLARE @testTable TABLE (JobID INT, startdate DATETIME, enddate DATETIME);
INSERT INTO @testTable VALUES (1, '2015-01-27 07:32:35.000', '2015-01-28 14:39:35.000');
INSERT INTO @testTable VALUES (2, '2015-01-27 07:32:35.000', '2015-01-29 16:39:35.000');
INSERT INTO @testTable VALUES (3, '2015-03-02 09:46:25.000', '2015-03-05 17:24:15.000');
WITH cte AS (
    SELECT JobID,
           CAST(startdate AS DATE) AS startdate,
           DATEDIFF(minute, startdate, DATEADD(DAY, 1, CAST(startdate AS DATE))) / 60.0 AS hours,
           enddate
    FROM @testTable
    UNION ALL
    SELECT JobID,
           DATEADD(DAY, 1, startdate),
           DATEDIFF(minute, DATEADD(DAY, 1, startdate),
                    CASE WHEN DATEADD(DAY, 2, startdate) > enddate
                         THEN enddate
                         ELSE DATEADD(DAY, 2, startdate) END) / 60.0,
           enddate
    FROM cte
    WHERE startdate <> CAST(enddate AS DATE)
)
SELECT * FROM cte
ORDER BY JobID, startdate;
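The recursion can also be sketched as an ordinary loop. This hypothetical Python helper (`hours_per_day` is our name) walks a job's range day by day, clamping the last slice at the end time. Note it is seconds-exact, whereas the T-SQL counts minute boundaries with DATEDIFF, so a value like 16.47 in the table comes out here as 16.46:

```python
from datetime import datetime, timedelta, date

def hours_per_day(start, end):
    """Split [start, end] into (date, hours) pairs, one per calendar day.

    Same idea as the recursive CTE: the first slice runs from `start` to
    midnight, middle slices are full 24-hour days, and the last slice is
    clamped at `end`.
    """
    out = []
    day = start.date()
    while day <= end.date():
        slice_start = max(start, datetime.combine(day, datetime.min.time()))
        next_midnight = datetime.combine(day + timedelta(days=1),
                                         datetime.min.time())
        slice_end = min(end, next_midnight)
        hours = (slice_end - slice_start).total_seconds() / 3600
        out.append((day, round(hours, 2)))
        day += timedelta(days=1)
    return out

# Job 1 from the question:
for d, h in hours_per_day(datetime(2015, 1, 27, 7, 32, 35),
                          datetime(2015, 1, 28, 14, 39, 35)):
    print(d, h)
```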

Redshift count with variable

Imagine I have a Redshift table with a structure similar to the following, where Product_Bill_ID is the primary key:
| Store_ID | Product_Bill_ID | Payment_Date
| 1        | 1               | 01/10/2016 11:49:33
| 1        | 2               | 01/10/2016 12:38:56
| 1        | 3               | 01/10/2016 12:55:02
| 2        | 4               | 01/10/2016 16:25:05
| 2        | 5               | 02/10/2016 08:02:28
| 3        | 6               | 03/10/2016 02:32:09
If I want to query the number of Product_Bill_ID that a store sold in the first hour after it sold its first Product_Bill_ID, how could I do this?
This example should produce:
| Store_ID | First_Payment_Date  | Sold_First_Hour
| 1        | 01/10/2016 11:49:33 | 2
| 2        | 01/10/2016 16:25:05 | 1
| 3        | 03/10/2016 02:32:09 | 1
You need to get the first hour. That is easy enough using window functions:
select s.*,
       min(payment_date) over (partition by store_id) as first_payment_date
from sales s;
Then, you need to do the date filtering and aggregation:
select store_id, count(*)
from (select s.*,
             min(payment_date) over (partition by store_id) as first_payment_date
      from sales s
     ) s
where payment_date <= first_payment_date + interval '1 hour'
group by store_id;
SELECT store_id,
       first_payment_date,
       SUM(CASE WHEN payment_date < DATEADD(hour, 1, first_payment_date)
                THEN 1 END) AS sold_first_hour
FROM (
    SELECT *,
           MIN(payment_date) OVER (PARTITION BY store_id) AS first_payment_date
    FROM yourtable
) parsed_table
GROUP BY store_id, first_payment_date;
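For reference, the same first-hour count can be sketched in plain Python over the question's sample rows (`sold_first_hour` is an illustrative name, not part of the original query):

```python
from datetime import datetime, timedelta
from collections import defaultdict

# Sample data from the question: (store_id, product_bill_id, payment_date),
# with dates read as dd/mm/yyyy.
sales = [
    (1, 1, datetime(2016, 10, 1, 11, 49, 33)),
    (1, 2, datetime(2016, 10, 1, 12, 38, 56)),
    (1, 3, datetime(2016, 10, 1, 12, 55, 2)),
    (2, 4, datetime(2016, 10, 1, 16, 25, 5)),
    (2, 5, datetime(2016, 10, 2, 8, 2, 28)),
    (3, 6, datetime(2016, 10, 3, 2, 32, 9)),
]

def sold_first_hour(sales):
    """Per store: first payment time, and count of sales within one hour of it."""
    # First pass: the window-function MIN() OVER (PARTITION BY store_id)
    first = {}
    for store, _, paid in sales:
        if store not in first or paid < first[store]:
            first[store] = paid
    # Second pass: the date filter + aggregation
    counts = defaultdict(int)
    for store, _, paid in sales:
        if paid <= first[store] + timedelta(hours=1):
            counts[store] += 1
    return {s: (first[s], counts[s]) for s in first}

print(sold_first_hour(sales))
```

Store 1's third sale at 12:55:02 falls outside 11:49:33 + 1 hour, so it is excluded, matching the expected output.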

Aggregate/Windowed Function To Find Min and Max of Sequential Rows

I've got a SQL table where I want to find the first and last dates of a group of records, providing they're sequential.
Patient | TestType | Result | Date
------------------------------------------
1       | 1        | A      | 2012-03-04
1       | 1        | A      | 2012-08-19
1       | 1        | B      | 2013-05-27
1       | 1        | A      | 2013-06-20
1       | 2        | X      | 2012-08-19
1       | 2        | X      | 2013-06-20
2       | 1        | B      | 2014-09-09
2       | 1        | B      | 2015-04-19
Should be returned as
Patient | TestType | Result | StartDate  | EndDate
--------------------------------------------------------
1       | 1        | A      | 2012-03-04 | 2012-08-19
1       | 1        | B      | 2013-05-27 | 2013-05-27
1       | 1        | A      | 2013-06-20 | 2013-06-20
1       | 2        | X      | 2012-08-19 | 2013-06-20
2       | 1        | B      | 2014-09-09 | 2015-04-19
The problem is that if I just group by Patient, TestType, and Result,
then the first and third rows in the example above would become a single row.
Patient | TestType | Result | StartDate  | EndDate
--------------------------------------------------------
1       | 1        | A      | 2012-03-04 | 2013-06-20
1       | 1        | B      | 2013-05-27 | 2013-05-27
1       | 2        | X      | 2012-08-19 | 2013-06-20
2       | 1        | B      | 2014-09-09 | 2015-04-19
I feel like there's got to be something clever I can do with a partition, but I can't quite figure out what it is.
There are several ways to approach this. I like identifying the groups using the difference of row number values:
select patient, testtype, result,
       min(date) as startdate, max(date) as enddate
from (select t.*,
             (row_number() over (partition by patient, testtype order by date) -
              row_number() over (partition by patient, testtype, result order by date)
             ) as grp
      from yourtable t
     ) t
group by patient, testtype, result, grp
order by patient, startdate;
select patient, testtype, result, date as startdate,
       isnull(lead(date) over (partition by patient, testtype, result order by date),
              date) as enddate
from tablename;
You can use the lead function to pull the next row's date within each group as enddate.
SQL Fiddle with sample data.
See if this gives you what you need.
with T1 as (
    select *,
           case when lag(Patient, 1)
                     over (order by Patient, TestType, Result) = Patient
                 and lag(TestType, 1)
                     over (order by Patient, TestType, Result) = TestType
                 and lag(Result, 1)
                     over (order by Patient, TestType, Result) = Result
                then null else 1 end as Changes
    from t
), T2 as (
    select Patient,
           TestType,
           Result,
           dt,
           sum(Changes) over (
               order by Patient, TestType, Result, dt
           ) as seq
    from T1
)
select Patient,
       TestType,
       Result,
       min(dt) as dtFrom,
       max(dt) as dtTo
from T2
group by Patient, TestType, Result, seq
order by Patient, TestType, Result;
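Independent of dialect, the run-collapsing logic itself is a single pass over the ordered rows: start a new span whenever (patient, testtype, result) changes. A minimal Python sketch (`collapse_runs` is our name) over the question's sample data:

```python
from datetime import date

# (patient, testtype, result, date) rows, already ordered by
# patient, testtype, date as in the question.
rows = [
    (1, 1, 'A', date(2012, 3, 4)),
    (1, 1, 'A', date(2012, 8, 19)),
    (1, 1, 'B', date(2013, 5, 27)),
    (1, 1, 'A', date(2013, 6, 20)),
    (1, 2, 'X', date(2012, 8, 19)),
    (1, 2, 'X', date(2013, 6, 20)),
    (2, 1, 'B', date(2014, 9, 9)),
    (2, 1, 'B', date(2015, 4, 19)),
]

def collapse_runs(rows):
    """Collapse consecutive rows sharing (patient, testtype, result) into
    (patient, testtype, result, startdate, enddate) spans.

    This is exactly what the difference-of-row-numbers trick computes:
    a run is broken whenever the result changes within the ordering.
    """
    out = []
    run = None  # current [patient, testtype, result, start, end]
    for p, t, r, d in rows:
        if run and (p, t, r) == (run[0], run[1], run[2]):
            run[4] = d          # extend the current span
        else:
            if run:
                out.append(tuple(run))
            run = [p, t, r, d, d]  # open a new span
    if run:
        out.append(tuple(run))
    return out

for span in collapse_runs(rows):
    print(span)
```

The two separate (1, 1, 'A') spans stay separate because the 'B' row between them breaks the run.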

Order by Consecutive Numbers in SQL Server Select

I was wondering if there is a way to order by consecutive numbers in SQL Server 2008.
Currently I have
Select DISTINCT StoreNum, StoreName, Date, Time
From tbl_stores
ORDER BY StoreNum, Date
Which will give me
1 | Toronto Store | 2015-03-04 | 12:44:44 |
1 | Toronto Store | 2015-03-04 | 12:44:45 |
2 | Chatham Store | 2015-03-05 | 12:44:47 |
2 | Chatham Store | 2015-03-05 | 12:44:48 |
3 | London Store | 2015-03-06 | 12:44:51 |
3 | London Store | 2015-03-06 | 12:44:52 |
Is it possible to order by StoreNum consecutively then date? Like this
1 | Toronto Store | 2015-03-04 | 12:44:44 |
2 | Chatham Store | 2015-03-05 | 12:44:47 |
3 | London Store | 2015-03-06 | 12:44:51 |
1 | Toronto Store | 2015-03-04 | 12:44:45 |
2 | Chatham Store | 2015-03-05 | 12:44:48 |
3 | London Store | 2015-03-06 | 12:44:52 |
Latest Attempt:
SELECT DISTINCT StoreNum, StoreName, Date, Time,(
Select StoreNum, StoreName, Date, Time,
row_number() over (partition by StoreNum order by Date, Time) as seqnum
From tbl_stores AS q
order by seqnum, StoreNum, Date,Time
)
FROM q
Here is the idea (but without the distinct): use row_number() to enumerate the rows within each store, then order by that sequence number first:
Select StoreNum, StoreName, Date, Time,
row_number() over (partition by StoreNum order by Date, Time) as seqnum
From tbl_stores
order by seqnum, StoreNum, Date;
EDIT:
Something like:
Select StoreNum, StoreName, Date, Time,
row_number() over (partition by StoreNum order by Date, Time) as seqnum
From (select distinct StoreNum, StoreName, Date, Time
from tbl_stores s
) s
order by seqnum, StoreNum, Date;
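The seqnum ordering amounts to a round-robin interleave of each store's rows. A small Python sketch (`round_robin` is an illustrative name) using the question's sample rows:

```python
from itertools import zip_longest

# (StoreNum, StoreName, Date, Time) rows, as in the question
rows = [
    (1, 'Toronto Store', '2015-03-04', '12:44:44'),
    (1, 'Toronto Store', '2015-03-04', '12:44:45'),
    (2, 'Chatham Store', '2015-03-05', '12:44:47'),
    (2, 'Chatham Store', '2015-03-05', '12:44:48'),
    (3, 'London Store',  '2015-03-06', '12:44:51'),
    (3, 'London Store',  '2015-03-06', '12:44:52'),
]

def round_robin(rows):
    """Order rows round-robin by store: every store's 1st row, then every
    store's 2nd row, and so on.

    Equivalent to ORDER BY seqnum, StoreNum where seqnum is
    row_number() partitioned by store and ordered by date/time.
    """
    by_store = {}
    for r in sorted(rows):              # sort by store, then date, then time
        by_store.setdefault(r[0], []).append(r)
    out = []
    for batch in zip_longest(*by_store.values()):
        out.extend(r for r in batch if r is not None)
    return out

for r in round_robin(rows):
    print(r)
```

`zip_longest` handles stores with unequal row counts by padding with None, which is then skipped.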

Making Row Entries Pair Horizontally in SQL

So this question is similar to one I've asked before, but slightly different.
I'm looking at data for clients who are admitted to and discharged from a program. For each admit and discharge they have an assessment done and are scored on it and sometimes they are admitted and discharged multiple times during a time period.
I need to be able to pair each clients admit score with their following discharge date so I can look at all clients who improved a certain amount from admit to discharge for each of their admits and discharges.
This is a dummy sample of how my data results are formatted right now:
And this is how I'd ideally like it formatted:
But I'd take any point in the right direction or similar formatting help that would allow me to be able to compare all of the instances of admit and discharge scores for all the clients.
Thanks!
In order to get the result, you can apply both the UNPIVOT and the PIVOT functions. UNPIVOT converts your multiple date and score columns into rows; PIVOT then turns those rows back into the paired columns.
The unpivot syntax will be similar to this:
select person,
       casenumber,
       ScoreType + '_' + col col,
       value,
       rn
from
(
  select person,
         casenumber,
         convert(varchar(10), date, 101) date,
         cast(score as varchar(10)) score,
         scoreType,
         row_number() over (partition by casenumber, scoretype
                            order by case scoretype when 'Admit' then 1 end, date) rn
  from yourtable
) d
unpivot
(
  value
  for col in (date, score)
) unpiv
See SQL Fiddle with Demo. This gives a result:
| PERSON | CASENUMBER | COL             | VALUE      | RN |
-----------------------------------------------------------
| Jon    | 3412       | Discharge_date  | 01/03/2013 | 1  |
| Jon    | 3412       | Discharge_score | 12         | 1  |
| Al     | 3452       | Admit_date      | 05/16/2013 | 1  |
| Al     | 3452       | Admit_score     | 15         | 1  |
| Al     | 3452       | Discharge_date  | 08/01/2013 | 1  |
| Al     | 3452       | Discharge_score | 13         | 1  |
As you can see this query also creates the new columns to then pivot. So the final code will be:
select person, casenumber,
       Admit_Date, Admit_Score, Discharge_Date, Discharge_Score
from
(
  select person,
         casenumber,
         ScoreType + '_' + col col,
         value,
         rn
  from
  (
    select person,
           casenumber,
           convert(varchar(10), date, 101) date,
           cast(score as varchar(10)) score,
           scoreType,
           row_number() over (partition by casenumber, scoretype
                              order by case scoretype when 'Admit' then 1 end, date) rn
    from yourtable
  ) d
  unpivot
  (
    value
    for col in (date, score)
  ) unpiv
) src
pivot
(
  max(value)
  for col in (Admit_Date, Admit_Score, Discharge_Date, Discharge_Score)
) piv;
See SQL Fiddle with Demo. This gives a result:
| PERSON | CASENUMBER | ADMIT_DATE | ADMIT_SCORE | DISCHARGE_DATE | DISCHARGE_SCORE |
-------------------------------------------------------------------------------------
| Al     | 3452       | 05/16/2013 | 15          | 08/01/2013     | 13              |
| Cindy  | 6578       | 01/02/2013 | 17          | 03/04/2013     | 14              |
| Cindy  | 6578       | 03/04/2013 | 14          | 03/18/2013     | 12              |
| Jon    | 3412       | (null)     | (null)      | 01/03/2013     | 12              |
| Kevin  | 9868       | 01/18/2013 | 19          | 03/02/2013     | 15              |
| Kevin  | 9868       | 03/02/2013 | 15          | (null)         | (null)          |
| Pete   | 4765       | 02/06/2013 | 15          | (null)         | (null)          |
| Susan  | 5421       | 04/06/2013 | 19          | 05/07/2013     | 15              |
SELECT ad.person, ad.CaseNumber,
       ad.Date AS AdmitScoreDate, ad.Score AS AdmitScore,
       dis.Date AS DischargeScoreDate, dis.Score AS DischargeScore
FROM yourTable ad, yourTable dis
WHERE ad.person = dis.person
  AND ad.ScoreType = 'Admit'
  AND dis.ScoreType = 'Discharge';
If all the columns you mentioned are in the same table, you can self-join it:
SELECT t1.person,
t1.caseNumber,
t1.date adate,
t1.score ascore,
t1.scoreType ascoreType,
t2.date ddate,
t2.score dscore,
t2.scoreType dscoretype
FROM patient t1
join patient t2
on t1.casenumber=t2.casenumber
and t1.scoreType!=t2.scoreType
and t1.scoreType='Admit'
But this will not show you record of people who have been admitted and not discharged yet. I don't know if you were also looking for that information.
SQL Fiddle link
Hope this helps!
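For comparison, the pairing logic itself can be sketched outside SQL: number each case's admits and discharges separately (by date) and match the n-th admit with the n-th discharge, which is what the row_number()-then-pivot approach computes. This hypothetical Python helper assumes rows shaped like (person, casenumber, scoreType, date, score); the sample values below are taken from the pivoted result table above:

```python
from datetime import date

rows = [
    ('Al', 3452, 'Admit', date(2013, 5, 16), 15),
    ('Al', 3452, 'Discharge', date(2013, 8, 1), 13),
    ('Cindy', 6578, 'Admit', date(2013, 1, 2), 17),
    ('Cindy', 6578, 'Discharge', date(2013, 3, 4), 14),
    ('Cindy', 6578, 'Admit', date(2013, 3, 4), 14),
    ('Cindy', 6578, 'Discharge', date(2013, 3, 18), 12),
]

def pair_admits_discharges(rows):
    """Pair the n-th admit with the n-th discharge per (person, case).

    Unmatched entries (admitted but not yet discharged, or vice versa)
    get None on the missing side.
    """
    admits, discharges = {}, {}
    for person, case, kind, d, score in sorted(rows, key=lambda r: (r[1], r[3])):
        bucket = admits if kind == 'Admit' else discharges
        bucket.setdefault((person, case), []).append((d, score))
    out = []
    for key in sorted(set(admits) | set(discharges)):
        a, dis = admits.get(key, []), discharges.get(key, [])
        for i in range(max(len(a), len(dis))):
            out.append(key + (a[i] if i < len(a) else None,
                              dis[i] if i < len(dis) else None))
    return out

for row in pair_admits_discharges(rows):
    print(row)
```

Cindy's two admits pair off with her two discharges in date order, producing two rows just like the pivoted output.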