Distinct counts based on dates in SQL Server

I'm attempting to create a new table based on the following table:
SubjectNumber TestDates
001 11/12/12
001 01/10/15
001 04/03/13
002 05/21/14
003 08/06/15
002 09/12/18
002 03/30/12
003 09/07/18
004 10/14/11
005 02/05/14
005 02/06/14
I need a new table that will include the following:
1) Subject number
2) Their first test date
3) Their last test date
4) A count of the total number of tests
5) A column with 0's and 1's indicating whether or not the subject had any two test dates that were at least 30 days apart
The new table should look like the following:
SubjectNumber FirstTestDate LastTestDate TestCount ThirtyDaysApart
001 11/12/12 01/10/15 3 1
002 03/30/12 09/12/18 3 1
003 08/06/15 09/07/18 2 1
004 10/14/11 1 0
005 02/05/14 02/06/14 2 0
I'm using SQL Server 2017.
I have a temporary table called #Temp1 that I'd like to store the results in. The source table shown above is called #Temp. This is what I've tried so far:
Insert into #Temp1
SELECT SubjectNumber, WHERE
CASE MIN(TestDates) then FirstTestDate = TestDates
END
CASE MAX(TestDates) then LastTestDate = TestDates
END
FROM #Temp;

You can use lag() and conditional aggregation (here t stands for your source table):
select subjectnumber, min(testdates) as FirstTestDate, max(testdates) as LastTestDate,
count(*) as TestCount,
max(case when prev_testdate <= dateadd(day, -30, testdates) then 1 else 0 end) as ThirtyDaysApart
from (select t.*,
lag(testdates) over (partition by subjectnumber order by testdates) as prev_testdate
from t
) t
group by subjectnumber;

You can try using the lag() function:
select subjectnumber, min(TestDates) as FirstTestDate, max(TestDates) as LastTestDate, count(TestDates) as TestCount,
case when count(case when pdatediff >= 30 then 1 end) >= 1 then 1 else 0 end as ThirtyDaysApart
from
(
select subjectnumber, TestDates, COALESCE(DATEDIFF(DAY,
LAG(TestDates) OVER (PARTITION BY subjectnumber
ORDER BY TestDates), TestDates), 0) as pdatediff
from tablename
) X group by subjectnumber

The only tricky part is checking whether two dates within a group are at least 30 days apart. Note that the following query returns 1 if any two dates, not necessarily consecutive, are at least 30 days apart:
WITH cte AS (
SELECT SubjectNumber, MIN(TestDates) FirstTestDate, MAX(TestDates) LastTestDate, COUNT(TestDates) TestCount
FROM #yourdata
GROUP BY SubjectNumber
)
SELECT *
FROM cte AS t
CROSS APPLY (
SELECT CASE WHEN COUNT(*) = 0 THEN 0 ELSE 1 END AS ThirtyDaysApart
FROM #yourdata AS o
INNER JOIN #yourdata AS n ON o.SubjectNumber = n.SubjectNumber AND n.TestDates >= DATEADD(DAY, 30, o.TestDates)
WHERE o.SubjectNumber = t.SubjectNumber
) AS CA
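Because the requirement is about any two test dates, the check can arguably be reduced to comparing each subject's first and last date: if those two are at least 30 days apart, some pair is, and if they are not, no pair can be. The sketch below uses that idea to fill #Temp1 in a single pass. The column list of #Temp1 is an assumption (it is not shown in the question), as is TestDates being a date or datetime column rather than text.

```sql
-- Sketch only. Assumes #Temp(SubjectNumber, TestDates) with TestDates stored as date/datetime,
-- and that #Temp1 has the five columns named below (hypothetical layout).
INSERT INTO #Temp1 (SubjectNumber, FirstTestDate, LastTestDate, TestCount, ThirtyDaysApart)
SELECT SubjectNumber,
       MIN(TestDates) AS FirstTestDate,
       -- Leave LastTestDate NULL for subjects with a single test, as in the expected output
       CASE WHEN COUNT(*) > 1 THEN MAX(TestDates) END AS LastTestDate,
       COUNT(*) AS TestCount,
       -- Any two dates are >= 30 days apart exactly when the first and last dates are
       CASE WHEN DATEDIFF(DAY, MIN(TestDates), MAX(TestDates)) >= 30 THEN 1 ELSE 0 END AS ThirtyDaysApart
FROM #Temp
GROUP BY SubjectNumber;
```

With the sample data, subject 005 (02/05/14 and 02/06/14) gets 0 and subject 003 gets 1. If the flag should only consider consecutive tests, the lag() versions above are the ones to use instead.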

Related

How to merge two query results joining same date

Let's say there's a table with data like below:
id | status | date
1 | 4 | 2022-05
2 | 3 | 2022-06
I want to find the count of ids for each month by status, something like this:
date | count(status1) = 4 | count(status2) = 3
2022-05 | 1 | null
2022-06 | null | 1
I tried doing
-- select distinct (not working)
select date, status1, status2 from
(select date, count(id) as "status1" from myTable
where status = 4 group by date) as myTable1
join
(select date, count(id) as "status2" from myTable
where status = 3 group by date) as myTable2
on myTable1.date = myTable2.date;
-- group by (not working)
but it duplicates the data needed.
I am using SQL Server.

select d.date,
sum(case when d.status = 4 then 1 else 0 end) as count_status_4,
sum(case when d.status = 3 then 1 else 0 end) as count_status_3
from your_table as d
group by d.date
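The expected output in the question shows null rather than 0 for a month with no rows of a given status. A minimal variant of the conditional aggregation, assuming the table and column names from the question (myTable, id, status, date) and statuses 4 and 3, could wrap each count in NULLIF:

```sql
-- Sketch only: myTable(id, status, date) and the status1/status2 aliases come from the question.
SELECT [date],
       -- COUNT ignores the NULLs produced when the CASE does not match; NULLIF turns a 0 into NULL
       NULLIF(COUNT(CASE WHEN status = 4 THEN id END), 0) AS status1,
       NULLIF(COUNT(CASE WHEN status = 3 THEN id END), 0) AS status2
FROM myTable
GROUP BY [date]
ORDER BY [date];
```

Because everything is computed in one pass over the table, no join is involved, so a month that has only one of the two statuses still appears exactly once.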

Identify if a value is appearing for the first time in a column

I'm trying to identify whether the value of a column (let's call it status for now) is appearing for the first time for a given ID, or whether that ID has held the value before. If it's the first time that the status is equal to 1 or 2, I'd like to return Y (1), otherwise N (0). The results below are what I'm looking to recreate. Any help is greatly appreciated. Thanks
ID Date Status First Time
1 1/1/2017 1 Y
2 1/1/2017 0 N
3 1/1/2017 0 N
4 1/1/2017 2 Y
5 1/1/2017 0 N
1 2/1/2017 0 N
2 2/1/2017 0 N
3 2/1/2017 1 Y
4 2/1/2017 0 N
5 2/1/2017 1 Y
1 3/1/2017 2 N
2 3/1/2017 0 N
3 3/1/2017 0 N
4 3/1/2017 1 N
5 3/1/2017 1 N

SELECT t1.ID, t1.Date, t1.status
, CASE WHEN firsts.ID IS NULL THEN 'N' ELSE 'Y' END AS `First Time`
FROM theTable AS t1
LEFT JOIN
(SELECT ID, MIN(Date) AS fDate
FROM theTable AS t0
WHERE status IN (1, 2)
GROUP BY ID
) AS firsts
ON t1.ID = firsts.ID AND t1.status IN (1, 2) AND t1.Date = firsts.fDate
;
You can use a subquery to identify the firsts, and join to the original records to "add" that information.
Note: This could lead to some redundant firsts if an ID's first 1 and 2 status values occur on the same day, but the current definition of "first" doesn't really state which status would be "first".

Just check if Status is 1 or 2, and whether there exists a record with the same ID and at an earlier date with Status 1 or 2.
select ID, Date, Status,
case
when a.Status in (1,2) and not exists(select * from table b where a.ID = b.ID and b.Status in (1,2) and b.Date < a.Date) then 'Y'
else 'N'
end as First
from table a

Something like this may work for you. The ROW_NUMBER function, ordered by date and partitioned by status, adds a counter that, within each group of identical statuses, increments from 1 in ascending date order.
Combining this with a CASE expression, we can say that the status is a 'first time' when that counter is either 1 or 2.
SELECT
*
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY [Status] ORDER BY [date]) IN (1,2) THEN 'Y' ELSE 'N' END
FROM [MY_TABLE]
EDIT:
As noted below, this is a SQL Server solution. You would need to make some tweaks to get it working in MySQL.
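Partitioning by [Status] alone numbers rows across all IDs, so a variant that numbers rows within each ID may be closer to the expected output in the question. The following is a sketch under that assumption: it numbers only the rows whose status is 1 or 2, per ID, and flags the earliest one. The table name [MY_TABLE] and the columns ID, [Date] and [Status] are taken from the posts above.

```sql
-- Sketch only: flags the earliest row per ID whose Status is 1 or 2.
SELECT ID, [Date], [Status],
       CASE WHEN [Status] IN (1, 2)
             AND ROW_NUMBER() OVER (
                     -- Rows with Status 1/2 and rows with other statuses are numbered separately
                     PARTITION BY ID, CASE WHEN [Status] IN (1, 2) THEN 1 ELSE 0 END
                     ORDER BY [Date]) = 1
            THEN 'Y' ELSE 'N' END AS FirstTime
FROM [MY_TABLE]
ORDER BY [Date], ID;
```

For ID 1 in the sample data, the row on 1/1/2017 (status 1) is flagged Y and the row on 3/1/2017 (status 2) is flagged N, which matches the table in the question.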

Working solution, if I understand the problem correctly:
Test Table Creation:
WITH Test (ID, [Date], [Status]) AS (
SELECT * FROM (
VALUES
(1,Convert(datetime, '1/1/2017', 120),1),
(2,Convert(datetime, '1/1/2017', 120),0),
(3,Convert(datetime, '1/1/2017', 120),0),
(4,Convert(datetime, '1/1/2017', 120),2),
(5,Convert(datetime, '1/1/2017', 120),0),
(1,Convert(datetime, '2/1/2017', 120),0),
(2,Convert(datetime, '2/1/2017', 120),0),
(3,Convert(datetime, '2/1/2017', 120),1),
(4,Convert(datetime, '2/1/2017', 120),0),
(5,Convert(datetime, '2/1/2017', 120),1),
(1,Convert(datetime, '3/1/2017', 120),2),
(2,Convert(datetime, '3/1/2017', 120),0),
(3,Convert(datetime, '3/1/2017', 120),0),
(4,Convert(datetime, '3/1/2017', 120),1),
(5,Convert(datetime, '3/1/2017', 120),1)
) AS A (Column1, Column2, Column3)
)
Query:
SELECT id,
[Date],
(SELECT [Status] FROM Test b
WHERE a.ID = b.ID
AND a.[Date] = b.[Date]) AS [Status],
CASE WHEN (SELECT MIN([Date])
FROM Test
WHERE [Status] IN (1, 2)
AND ID = a.ID
GROUP BY ID
) = a.[Date] THEN 'Y'
ELSE 'N'
END AS FirstTime
FROM Test a
GROUP BY ID, [Date]

Count the cycle, and count that already has been counted

I have my query:
SELECT UserGroupCode,COUNT(UserGroupCode) AS [CountofCycle]
FROM Users.GroupCycles
GROUP BY UserGroupCode;
Which shows me:
UserGroupCode CountofCycles
1 1
4 1
5 1
6 2 (gone into 2nd cycle)
7 1
8 1
9 1
10 1
11 1
12 1
13 1
14 1
15 1
16 1
17 1
18 1
19 1
When I try to count the total UserGroups where CountofCycle = 1:
SELECT Count(t.CountOfCycle) AS 'totalgroups'
FROM
(SELECT CreateDate, COUNT(userGroupCode) AS [CountofCycle]
FROM Users.GroupCycles
GROUP BY CreateDate,UserGroupCode)t
WHERE CountofCycle=1
I get 18, when the result should be 16. If I delete CreateDate from both the SELECT and the GROUP BY, I get the correct number of CountofCycles,
and when I change the condition to CountofCycle=2 or >1 it returns 0.
Why can't I show the UserGroups with cycle > 1?
Here is my query to filter on CreateDate. In the 2nd query, which I UNION with the 1st one, I can't use CreateDate, as it disturbs my query results:
SELECT Count(t.CountOfCycle) AS 'total groups'
FROM
(SELECT COUNT(userGroupCode) AS [CountofCycle], CreateDate
FROM users.GroupCycles GROUP BY userGroupCode,CreateDate)t
WHERE t.CountOfCycle=1 AND t.CreateDate Between '03/16/2017' AND '04/25/2017'
UNION ALL
SELECT Count(t.CountOfCycle) AS 'group on date2'
FROM
(SELECT COUNT(userGroupCode) AS [CountofCycle] FROM users.GroupCycles GROUP BY userGroupCode)t
WHERE t.CountOfCycle=2

First, to address why you are not getting the results you expect: the simple reason is that you are comparing two different queries and expecting the results to be the same.
Consider this very simple example data
UserGroupCode | CreateDate
----------------+----------------
A | 2017-05-10
B | 2017-05-10
B | 2017-05-11
C | 2017-05-10
You have two records where the UserGroupCode is B, so if you run:
DECLARE @T TABLE (UserGroupCode CHAR(1), CreateDate DATE)
INSERT @T (userGroupCode, CreateDate)
VALUES ('A', '2017-05-10'), ('B', '2017-05-10'), ('B', '2017-05-11');
SELECT UserGroupCode, COUNT(*) AS [Count]
FROM @T
GROUP BY UserGroupCode
HAVING COUNT(*) = 2;
This returns:
UserGroupCode Count
-------------------------
B 2
However, if you were to add CreateDate to the grouping, "B" would be split into two groups, each with a count of 1:
DECLARE @T TABLE (UserGroupCode CHAR(1), CreateDate DATE)
INSERT @T (userGroupCode, CreateDate)
VALUES ('A', '2017-05-10'), ('B', '2017-05-10'), ('B', '2017-05-11');
SELECT UserGroupCode, CreateDate, COUNT(*) AS [Count]
FROM @T
WHERE UserGroupCode = 'B'
GROUP BY UserGroupCode, CreateDate;
This returns:
UserGroupCode CreateDate Count
---------------------------------------
B 2017-05-10 1
B 2017-05-11 1
Now, based on the queries you have posted, it looks like you want to know:
The number of groups that only have one record in the date range 16th March 2017 to 25th April 2017.
The number of groups that have two records in total.
For this, consider a slightly larger data set:
UserGroupCode | CreateDate
----------------+----------------
A | 2017-04-10
B | 2017-04-10
B | 2017-05-11
C | 2017-01-01
C | 2017-01-02
D | 2017-04-01
D | 2017-04-02
E | 2017-01-02
So here:
Group A has one record in total, and it falls within the date range
Group B has two records in total, one in the date range, one not
Group C has two records, neither in the date range
Group D has two records, both in the date range.
Group E has one record, not in the date range
So for your first requirement:
The number of groups that only have one record in the date range 16th March 2017 to 25th April 2017.
We would expect 2 groups, A and B, because C and E have no records in the date range, and D has two.
For the second requirement we would expect three groups, B, C and D, since A and E only have one record each.
You can do this with a single query by using a conditional aggregate.
DECLARE @T TABLE (UserGroupCode CHAR(1), CreateDate DATE)
INSERT @T (userGroupCode, CreateDate)
VALUES ('A', '2017-04-10'),
('B', '2017-04-10'), ('B', '2017-05-11'),
('C', '2017-01-01'), ('C', '2017-01-02'),
('D', '2017-04-01'), ('D', '2017-04-02'),
('E', '2017-01-02');
SELECT TotalGroups = COUNT(CASE WHEN RecordsInPeriod = 1 THEN 1 END),
GroupOnDate2 = COUNT(CASE WHEN TotalRecords = 2 THEN 1 END)
FROM ( SELECT UserGroupCode,
TotalRecords = COUNT(*),
RecordsInPeriod = COUNT(CASE WHEN CreateDate >= '20170316'
AND CreateDate <= '20170425' THEN 1 END)
FROM @T
GROUP BY UserGroupCode
) AS t;
Which gives:
TotalGroups GroupOnDate2
------------------------------
2 3

I'd expect to see a HAVING clause rather than a WHERE:
SELECT UserGroupCode, COUNT(UserGroupCode) [CountofCycle]
FROM [Users].[GroupCycles]
GROUP BY UserGroupCode
HAVING COUNT(UserGroupCode) > 1;

You could use HAVING; it should work (and be more efficient):
select count(*)
from
(
SELECT CreateDate, COUNT(userGroupCode) AS [CountofCycle]
FROM Users.GroupCycles
GROUP BY CreateDate,UserGroupCode
having count(userGroupCode) > 1 -- here is HAVING clause
) x1

How to count most consecutive occurrences of a value in a Column in SQL Server

I have a table Attendance in my database.
Date | Present
------------------------
20/11/2013 | Y
21/11/2013 | Y
22/11/2013 | N
23/11/2013 | Y
24/11/2013 | Y
25/11/2013 | Y
26/11/2013 | Y
27/11/2013 | N
28/11/2013 | Y
I want to find the longest run of consecutive occurrences of a value, Y or N.
For example in the above table Y occurs 2, 4 & 1 times. So I want 4 as my result.
How to achieve this in SQL Server?
Any help will be appreciated.

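Try this. Within a run of consecutive dates that share the same Present value, the date minus its row number (numbered within that value) stays constant, so it can be used as a group key: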
Select max(Sequence)
from
(
select present ,count(*) as Sequence,
min(date) as MinDt, max(date) as MaxDt
from (
select t.Present,t.Date,
dateadd(day,
-(row_number() over (partition by present order by date))
,date
) as grp
from Table1 t
) t
group by present, grp
)a
where Present ='Y'
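To see why the computed grp column works as a group key, it can help to look at the intermediate values for the sample data. The sketch below loads the question's rows into a table variable (the name @Attendance and the ISO date literals are assumptions made for illustration) and shows the row number and the resulting key side by side.

```sql
-- Sketch only: reproduces the question's sample rows in a table variable.
DECLARE @Attendance TABLE ([Date] date, Present char(1));
INSERT @Attendance ([Date], Present)
VALUES ('20131120', 'Y'), ('20131121', 'Y'), ('20131122', 'N'),
       ('20131123', 'Y'), ('20131124', 'Y'), ('20131125', 'Y'),
       ('20131126', 'Y'), ('20131127', 'N'), ('20131128', 'Y');

SELECT [Date], Present,
       ROW_NUMBER() OVER (PARTITION BY Present ORDER BY [Date]) AS rn,
       -- The date shifted back by rn days: constant within a run, different between runs
       DATEADD(DAY, -ROW_NUMBER() OVER (PARTITION BY Present ORDER BY [Date]), [Date]) AS grp
FROM @Attendance
ORDER BY [Date];
-- Y rows: 20th-21st   -> rn 1-2 -> grp 2013-11-19 (run of 2)
--         23rd-26th   -> rn 3-6 -> grp 2013-11-20 (run of 4)
--         28th        -> rn 7   -> grp 2013-11-21 (run of 1)
-- N rows: 22nd -> rn 1 -> grp 2013-11-21; 27th -> rn 2 -> grp 2013-11-25
```

Note that the N row on the 22nd and the Y row on the 28th end up with the same grp value, which is why the query groups by both present and grp rather than by grp alone.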

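You can do this with a recursive CTE. The recursion depth grows with the number of rows and the default limit is 100, so for larger tables append OPTION (MAXRECURSION 0) to the final SELECT: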
;WITH cte AS (SELECT Date,Present,ROW_NUMBER() OVER(ORDER BY Date) RN
FROM Table1)
,cte2 AS (SELECT Date,Present,RN,ct = 1
FROM cte
WHERE RN = 1
UNION ALL
SELECT a.Date,a.Present,a.RN,ct = CASE WHEN a.Present = b.Present THEN ct + 1 ELSE 1 END
FROM cte a
JOIN cte2 b
ON a.RN = b.RN+1)
SELECT TOP 1 *
FROM cte2
ORDER BY CT DESC

Find next date value in the column

I have a large table with the following columns and sample values:
ID Ser Reg Date
1 12345 001 1/3/2011
1 12345 001 2/2/2011
1 12345 002 1/3/2011
1 12345 002 2/2/2011
2 23456 001 1/3/2011
2 23456 001 2/7/2011
2 23456 001 3/5/2011
I tried this query from a previous post SQL - Select next date query - but did not get the desired results:
SELECT
mytable.id,
mytable.date,
(
SELECT
MIN(mytablemin.date)
FROM mytable AS mytablemin
WHERE mytablemin.date > mytable.date
) AS NextDate
FROM mytable
This is what I am trying to accomplish:
ID Ser Reg curr_Date prev_Date
1 12345 001 2/2/2011 1/3/2011
1 12345 002 2/2/2011 1/3/2011
2 23456 001 2/7/2011 1/5/2011
2 23456 001 3/5/2011 2/7/2011
I would appreciate any help with this task.

If you are using an Oracle database (you have not mentioned which one, so I am assuming), you can use the lead and lag functions for this:
select id, ser, reg, curr_date, prev_date
from
(
select id, ser, reg, date curr_date,
LEAD(date) OVER (PARTITION BY id, ser, reg ORDER BY date DESC NULLS LAST) prev_date
from mytable
)
where prev_date is not null;

The correlated subquery was missing a condition that joins the mytablemin copy of mytable back to mytable. You can also eliminate records that do not have a NextDate, but note that this removes a group (Id, Ser, Reg) from the result set entirely when it contains only one record.
select * from
(
SELECT
mytable.id,
mytable.date,
(
SELECT
MIN(mytablemin.date)
FROM mytable AS mytablemin
WHERE mytablemin.date > mytable.date
and mytablemin.id = mytable.id
and mytablemin.Ser = mytable.Ser
and mytablemin.Reg = mytable.Reg
) AS NextDate
FROM mytable
) a
where a.NextDate is not null
And here is a version using a derived table with aggregation:
SELECT
mytable.id,
mytable.date,
mytablemin.minDate
FROM mytable
inner join
(
SELECT mytablemin.id,
mytablemin.Ser,
mytablemin.Reg,
MIN(mytablemin.date) minDate
FROM mytable AS mytablemin
group by mytablemin.id,
mytablemin.Ser,
mytablemin.Reg
having MIN(mytablemin.date) is not null
) AS mytablemin
on mytablemin.id = mytable.id
and mytablemin.Ser = mytable.Ser
and mytablemin.Reg = mytable.Reg
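If the database turns out to be SQL Server 2012 or later (the question does not say, so this is an assumption), the curr_Date / prev_Date layout from the question can be sketched with LAG, using the table and column names from the posts above:

```sql
-- Sketch only: assumes SQL Server 2012+ and mytable(ID, Ser, Reg, [Date]) with [Date] as a date type.
SELECT ID, Ser, Reg, curr_Date, prev_Date
FROM (
    SELECT ID, Ser, Reg,
           [Date] AS curr_Date,
           -- Previous date within the same (ID, Ser, Reg) group; NULL for the group's first row
           LAG([Date]) OVER (PARTITION BY ID, Ser, Reg ORDER BY [Date]) AS prev_Date
    FROM mytable
) AS d
WHERE prev_Date IS NOT NULL
ORDER BY ID, Ser, Reg, curr_Date;
```

As with the correlated subquery, the final WHERE drops the first row of every group, so a group with a single record does not appear in the result.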
