How to find MAX of COUNT Result for Relation - sql

I have a table that consists of PatientId, which is an Int, and Date, which is a Date data type.
It looks like the following:
patientId Date
101 01/01/2001
102 01/02/2001
103 01/03/2002
104 01/03/2004
105 01/03/2004
106 01/04/2004
And my desired result would be:
Count Year
3 2004
since 2004 has the most patients. Also, if two years have the same number of patients, both years should be displayed with the number of patients they had.
Thank you.

Use the YEAR function to extract the year from your date column, then group by the extracted year to get the count per year. TOP 1 WITH TIES keeps every year that shares the highest count.
select TOP 1 WITH TIES year([Date]) as [Year],count(1) as [Count]
from Yourtable
Group by year([Date])
Order by [Count] desc
Another way would be to use DATEPART:
select TOP 1 WITH TIES Datepart(year,[Date]) as [Year],count(1) as [Count]
from Yourtable
Group by Datepart(year,[Date])
Order by [Count] desc

The DATEPART function is your friend in this case. However, to get all of the rows in case of a tie, a simple TOP will not work. In this case, a different coding method is needed.
You could use the RANK() function, but that is more complex than this calls for. Instead, use a Common Table Expression (CTE).
Here, I set up a table for testing. Since I need two years with the same count of rows, I extended your sample into 2005:
CREATE TABLE MyTable (
custID INT,
[Date] DATE
)
TRUNCATE TABLE MyTable;
INSERT INTO MyTable
VALUES
(101, '01/01/2001'),
(102, '01/02/2001'),
(103, '01/03/2002'),
(104, '01/03/2004'),
(105, '01/03/2004'),
(106, '01/04/2004'),
(107, '02/01/2005'),
(108, '02/02/2005'),
(109, '10/10/2005');
This is the CTE I created, which summarizes the data into its year counts, and the queries against the CTE.
WITH MyData AS (
SELECT
DATEPART(year, [Date]) AS [Year],
COUNT(*) AS ct
FROM MyTable
GROUP BY Datepart(year, [Date])
)
-- Now we issue the SELECT statement against the CTE itself
SELECT *
FROM MyData
WHERE ct = (SELECT MAX(ct) FROM MyData)
And here is the output:
Year ct
2004 3
2005 3
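For comparison, the RANK() alternative mentioned earlier could look roughly like the sketch below. It is not part of the original answer: it runs the SQL through Python's built-in sqlite3 module so it can be tried without a SQL Server instance, and it assumes SQLite 3.25 or later for window functions. The names my_table, visit_date, year_counts and ranked are made up for the example, and strftime('%Y', ...) stands in for T-SQL's DATEPART(year, ...).

```python
import sqlite3

# In-memory database with the extended sample data (dates stored as ISO text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (cust_id INTEGER, visit_date TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (101, "2001-01-01"), (102, "2001-01-02"), (103, "2002-01-03"),
        (104, "2004-01-03"), (105, "2004-01-03"), (106, "2004-01-04"),
        (107, "2005-02-01"), (108, "2005-02-02"), (109, "2005-10-10"),
    ],
)

top_years = conn.execute(
    """
    WITH year_counts AS (
        -- one row per year with its row count
        SELECT CAST(strftime('%Y', visit_date) AS INTEGER) AS yr,
               COUNT(*) AS ct
        FROM my_table
        GROUP BY strftime('%Y', visit_date)
    ),
    ranked AS (
        -- RANK() gives tied counts the same rank, so ties survive the filter
        SELECT yr, ct, RANK() OVER (ORDER BY ct DESC) AS rnk
        FROM year_counts
    )
    SELECT yr, ct FROM ranked WHERE rnk = 1 ORDER BY yr
    """
).fetchall()

print(top_years)  # [(2004, 3), (2005, 3)]
```

Because RANK() assigns the same rank to equal counts, filtering on rnk = 1 returns both tied years, just like the MAX(ct) subquery.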

Related

Using SQL Server 2012 how to iterate through an unknown number of rows and calculate date differences

I need to calculate the average number of days if there are two or more dates for each ID: the days between date1 and date2, date2 and date3 etc. The output needs to be the average number of days between each interval per ID. I am looking for a solution that iterates through each date for each ID and then averages the number of days
I could create a row number and partition by the id but in the actual data there can be up to 20 rows for each ID.
CREATE TABLE #ATABLE(
ID INTEGER NOT NULL
,DATE DATE NOT NULL
);
INSERT INTO #ATABLE(ID,DATE) VALUES (1,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/10/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/20/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/30/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (3,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (3,'1/10/2019');
--get avg days between orders
DROP TABLE #ATABLE
The output for the above would be:
ID AvgDatediff
1 Null
2 10
3 9
You can use lag to get the previous row (per row), and then find the diff between it and the current row. Then, you can average them out:
SELECT id, AVG(diff)
FROM (SELECT id,
DATEDIFF(DAY, date, LAG(date) OVER (PARTITION BY id
ORDER BY date DESC)) AS diff
FROM #atable) t
GROUP BY id;
The simplest way to get the average difference is:
SELECT id, DATEDIFF(DAY, MIN(date), MAX(date)) / NULLIF(COUNT(*) - 1, 0)
FROM #atable
GROUP BY id;
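Note: You may want to multiply by 1.0 if you don't want an integer average.
In other words, the average difference is the latest date minus the earliest date, divided by one less than the count. This works because the intermediate dates cancel out when the consecutive differences are added up. Try it. It works.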
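To check that the shortcut really matches the row-by-row average, here is a small sketch that computes both for the sample data. It is an illustration only: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for LAG), a regular table named atable with a column d instead of the #ATABLE temp table, and julianday() differences in place of T-SQL's DATEDIFF. Unlike the integer DATEDIFF division, these results are fractional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (id INTEGER NOT NULL, d TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO atable VALUES (?, ?)",
    [
        (1, "2019-01-01"),
        (2, "2019-01-01"), (2, "2019-01-10"), (2, "2019-01-20"), (2, "2019-01-30"),
        (3, "2019-01-01"), (3, "2019-01-10"),
    ],
)

# Average of the gaps between consecutive dates, computed row by row with LAG.
per_gap = dict(conn.execute(
    """
    SELECT id, AVG(diff)
    FROM (
        SELECT id,
               julianday(d) - julianday(LAG(d) OVER (PARTITION BY id ORDER BY d)) AS diff
        FROM atable
    ) AS gaps
    GROUP BY id
    """
).fetchall())

# Shortcut: (latest - earliest) / (count - 1); NULLIF avoids division by zero.
shortcut = dict(conn.execute(
    """
    SELECT id,
           (julianday(MAX(d)) - julianday(MIN(d))) / NULLIF(COUNT(*) - 1, 0)
    FROM atable
    GROUP BY id
    """
).fetchall())

for key in sorted(shortcut):
    print(key, per_gap[key], shortcut[key])
```

For ID 2 the gaps are 9, 10 and 10 days, which sum to 29, exactly the span from the first date to the last, so both approaches give 29 / 3. ID 1 has a single row and yields NULL (None in Python) either way.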
SELECT id, AVG(DayDiff)
FROM (
SELECT id,
DATEDIFF(dd, date, LEAD(date) OVER (PARTITION BY id ORDER BY date)) AS DayDiff
FROM #atable
) as AA
GROUP BY id;
LEAD(source_column) picks the value from the next row, based on the ORDER BY clause (here, date).

Aggregation in SQL Server 2014

I have data like this:
declare @table table
(
CUSTNO varchar(35),
RELATIONNO int,
Sales numeric(5,2),
RelationDate dateTIME
)
insert into @table
select 'B1024818', 120, 189.26, '2013-10-27' union all
select 'B1024818', 120, 131.76, '2016-10-28' union all
select 'C0002227', 124, 877.16, '2012-08-26' union all
select 'C0002227', 124, 802.65, '2015-06-15'
I am trying to get a result like
CUSTNO RELATIONNO Sales Till Last Relation Year
----------------------------------------------------------
B1024818 120 321.02 2016
C0002227 124 1679.81 2015
Here, Sales is summed for each customer from the first relation date to the last relation date.
The Till Last Relation Year column contains the latest year for each customer.
I am not sure whether this is possible in SQL.
Please share your suggestions.
Thanks
You could use:
SELECT CUSTNO, RELATIONNO, SUM(Sales) AS Sales, MAX(YEAR(RelationDate))
FROM @table
GROUP BY CUSTNO, RELATIONNO;
Rextester Demo
SELECT custno, RELATIONNO, sum(Sales), MAX(year(RelationDate ))
FROM @table
GROUP BY custno, RELATIONNO
You can use the query below:
select CUSTNO ,RELATIONNO ,SUM(Sales) as Sales , max(Year(RelationDate )) [Till Last Relation Year]
from @table
group by CUSTNO ,RELATIONNO

How To Select Records in a Status Between Timestamps? T-SQL

I have a T-SQL Quotes table and need to be able to count how many quotes were in an open status during past months.
The dates I have to work with are an 'Add_Date' timestamp and an 'Update_Date' timestamp. Once a quote is put into a 'Closed_Status' of '1' it can no longer be updated. Therefore, the 'Update_Date' effectively becomes the Closed_Status timestamp.
I'm stuck because I can't figure out how to select all open quotes that were open in a particular month.
Here's a few example records:
Quote_No  Add_Date    Update_Date  Open_Status  Closed_Status
001       01-01-2016  NULL         1            0
002       01-01-2016  3-1-2016     0            1
003       01-01-2016  4-1-2016     0            1
The desired result would be:
Year Month Open_Quote_Count
2016 01 3
2016 02 3
2016 03 2
2016 04 1
I've hit a mental wall on this one, I've tried to do some case when filtering but I just can't seem to figure this puzzle out. Ideally I wouldn't be hard-coding in dates because this spans years and I don't want to maintain this once written.
Thank you in advance for your help.
You are doing this by month. So, three options come to mind:
A list of all months using left join.
A recursive CTE.
A number table.
Let me show the last:
with n as (
select row_number() over (order by (select null)) - 1 as n
from master..spt_values
)
select format(dateadd(month, n.n, q.add_date), 'yyyy-MM') as yyyymm,
count(*) as Open_Quote_Count
from quotes q join
n
on (closed_status = 1 and dateadd(month, n.n, q.add_date) <= q.update_date) or
(closed_status = 0 and dateadd(month, n.n, q.add_date) <= getdate())
group by format(dateadd(month, n.n, q.add_date), 'yyyy-MM')
order by yyyymm;
This does assume that each month has at least one open record. That seems reasonable for this purpose.
You can use datepart to extract parts of a date, so something like:
select datepart(year, add_date) as 'year',
datepart(month, add_date) as 'month',
count(1)
from theTable
where open_status = 1
group by datepart(year, add_date), datepart(month, add_date)
Note: this counts by the starting month only and is primarily meant to show the use of datepart.
Updated, as I misunderstood the initial request.
Consider following test data:
DECLARE @test TABLE
(
Quote_No VARCHAR(3),
Add_Date DATE,
Update_Date DATE,
Open_Status INT,
Closed_Status INT
)
INSERT INTO @test (Quote_No, Add_Date, Update_Date, Open_Status, Closed_Status)
VALUES ('001', '20160101', NULL, 1, 0)
, ('002', '20160101', '20160301', 0, 1)
, ('003', '20160101', '20160401', 0, 1)
Here is a recursive solution that doesn't rely on system tables but performs worse. As we are dealing with year and month combinations, the number of recursions will not get out of hand.
;WITH YearMonths AS
(
SELECT YEAR(MIN(Add_Date)) AS [Year]
, MONTH(MIN(Add_Date)) AS [Month]
, MIN(Add_Date) AS YMDate
FROM @test
UNION ALL
SELECT YEAR(DATEADD(MONTH,1,YMDate))
, MONTH(DATEADD(MONTH,1,YMDate))
, DATEADD(MONTH,1,YMDate)
FROM YearMonths
WHERE YMDate <= SYSDATETIME()
)
SELECT [Year]
, [Month]
, COUNT(*) AS Open_Quote_Count
FROM YearMonths ym
INNER JOIN @test t
ON (
[Year] * 100 + [Month] <= CAST(FORMAT(t.Update_Date, 'yyyyMM') AS INT)
AND t.Closed_Status = 1
)
OR (
[Year] * 100 + [Month] <= CAST(FORMAT(SYSDATETIME(), 'yyyyMM') AS INT)
AND t.Closed_Status = 0
)
GROUP BY [Year], [Month]
ORDER BY [Year], [Month]
The statement is longer, but also more readable, and it lists all year/month combinations to date.
Take a look at Date and Time Data Types and Functions for SQL-Server 2008+
and Recursive Queries Using Common Table Expressions
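The month-series idea can also be tried outside SQL Server. The sketch below is an illustration under several assumptions: it uses Python's built-in sqlite3 module and SQLite date functions instead of T-SQL, a table named quotes with lower-case column names, and a fixed as_of date of 2016-04-15 (a made-up value) so the output is repeatable. It also assumes a quote no longer counts as open in the month it was closed, which is what the desired result in the question implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotes (quote_no TEXT, add_date TEXT, update_date TEXT, "
    "open_status INTEGER, closed_status INTEGER)"
)
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?, ?, ?, ?)",
    [
        ("001", "2016-01-01", None, 1, 0),
        ("002", "2016-01-01", "2016-03-01", 0, 1),
        ("003", "2016-01-01", "2016-04-01", 0, 1),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE months(m) AS (
        -- anchor: first day of the earliest month that has a quote
        SELECT date(MIN(add_date), 'start of month') FROM quotes
        UNION ALL
        -- step: add one month until the as-of month is reached
        SELECT date(m, '+1 month') FROM months
        WHERE m < date(:as_of, 'start of month')
    )
    SELECT strftime('%Y', m) AS year,
           strftime('%m', m) AS month,
           COUNT(*) AS open_quote_count
    FROM months
    JOIN quotes q
      ON date(q.add_date, 'start of month') <= m
     AND (q.closed_status = 0
          OR m < date(q.update_date, 'start of month'))
    GROUP BY m
    ORDER BY m
    """,
    {"as_of": "2016-04-15"},
).fetchall()

for row in rows:
    print(row)
```

With the three sample quotes this prints counts of 3, 3, 2 and 1 for January through April 2016. Months in which no quote is open produce no row, because the join is an inner join.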

Finding most recent date based on consecutive dates

I have a table that lists absences (holidays) of all employees, and what we would like to find out is who is away today and the date on which they will return.
Unfortunately, absences aren't given IDs, so you can't just retrieve the max date from an absence ID if one of those dates is today.
However, absences are given an incrementing ID per day as they are input, so I need a query that will find the employeeID if there is an entry with today's date, then increment the AbsenceID column to find the max date on that absence.
Table Example (assuming today's date is 11/11/2014, UK format):
AbsenceID EmployeeID AbsenceDate
100 10 11/11/2014
101 10 12/11/2014
102 10 13/11/2014
103 10 14/11/2014
104 10 15/11/2014
107 21 11/11/2014
108 21 12/11/2014
120 05 11/11/2014
130 15 20/11/2014
140 10 01/03/2015
141 10 02/03/2015
142 10 03/03/2015
143 10 04/03/2015
So, from the above, we'd want the return dates to be:
EmployeeID ReturnDate
10 15/11/2014
21 12/11/2014
05 11/11/2014
Edit: note that the 140-143 range shouldn't be included in the results, as it lies in the future and none of the dates in that absence are today.
Presumably I need an iterative sub-function running on each entry with today's date where the employeeID matches.
So based on what I believe you're asking, you want to return a list of the people that are off today and when they are expected back, based on the holidays that you have recorded in the system. This should only work on consecutive days.
SQL Fiddle Demo
Schema Setup:
CREATE TABLE EmployeeAbsence
([AbsenceID] int, [EmployeeID] int, [AbsenceDate] DATETIME)
;
INSERT INTO EmployeeAbsence
([AbsenceID], [EmployeeID], [AbsenceDate])
VALUES
(100, 10, '2014-11-11'),
(101, 10, '2014-11-12'),
(102, 10, '2014-11-13'),
(103, 10, '2014-11-14'),
(104, 10, '2014-11-15'),
(107, 21, '2014-11-11'),
(108, 21, '2014-11-12'),
(120, 05, '2014-11-11'),
(130, 15, '2014-11-20')
;
Recursive CTE to generate the output:
;WITH cte AS (
SELECT EmployeeID, AbsenceDate
FROM dbo.EmployeeAbsence
WHERE AbsenceDate = CAST(GETDATE() AS DATE)
UNION ALL
SELECT e.EmployeeID, e.AbsenceDate
FROM cte
INNER JOIN dbo.EmployeeAbsence e ON e.EmployeeID = cte.EmployeeID
AND e.AbsenceDate = DATEADD(d,1,cte.AbsenceDate)
)
SELECT cte.EmployeeID, MAX(cte.AbsenceDate)
FROM cte
GROUP BY cte.EmployeeID
Results:
| EMPLOYEEID | Return Date |
|------------|---------------------------------|
| 5 | November, 11 2014 00:00:00+0000 |
| 10 | November, 15 2014 00:00:00+0000 |
| 21 | November, 12 2014 00:00:00+0000 |
Explanation:
The first SELECT in the CTE gets employees that are off today with this filter:
WHERE AbsenceDate = CAST(GETDATE() AS DATE)
This result set is then UNIONED back to the EmployeeAbsence table with a join that matches EmployeeID as well as the AbsenceDate + 1 day to find the consecutive days recursively using:
-- add a day to the cte.AbsenceDate from the first SELECT
e.AbsenceDate = DATEADD(d,1,cte.AbsenceDate)
The final SELECT simply groups the cte results by employee with the MAX AbsenceDate that has been calculated per employee.
SELECT cte.EmployeeID, MAX(cte.AbsenceDate)
FROM cte
GROUP BY cte.EmployeeID
Excluding Weekends:
I've done a quick test based on your comment and the below modification to the INNER JOIN within the CTE should exclude weekends when adding the extra days if it detects that adding a day will result in a Saturday:
INNER JOIN dbo.EmployeeAbsence e ON e.EmployeeID = cte.EmployeeID
AND e.AbsenceDate = CASE WHEN datepart(dw,DATEADD(d,1,cte.AbsenceDate)) = 7
THEN DATEADD(d,3,cte.AbsenceDate)
ELSE DATEADD(d,1,cte.AbsenceDate) END
So when you add a day: datepart(dw,DATEADD(d,1,cte.AbsenceDate)) = 7, if it results in Saturday (7), then you add 3 days instead of 1 to get Monday: DATEADD(d,3,cte.AbsenceDate). This relies on the default DATEFIRST setting of 7 (US English), under which Saturday is weekday 7.
You'd need to do a few things to get this data into a usable format. You need to be able to work out where a group begins and ends. This is difficult with this example because there is no straight forward grouping column.
So that we can calculate when a group starts and ends, you need to create a CTE containing all the columns and also use LAG() to get the AbsenceID and EmployeeID from the previous row for each row. In this CTE you should also use ROW_NUMBER() at the same time so that we have a way to re-order the rows into the same order again.
Something like:
WITH
[AbsenceStage] AS (
SELECT [AbsenceID], [EmployeeID], [AbsenceDate]
,[RN] = ROW_NUMBER() OVER (ORDER BY [EmployeeID] ASC, [AbsenceDate] ASC, [AbsenceID] ASC)
,[AbsenceID_Prev] = LAG([AbsenceID]) OVER (ORDER BY [EmployeeID] ASC, [AbsenceDate] ASC, [AbsenceID] ASC)
,[EmployeeID_Prev] = LAG([EmployeeID]) OVER (ORDER BY [EmployeeID] ASC, [AbsenceDate] ASC, [AbsenceID] ASC)
FROM [HR_Absence]
)
Now that we have this we can compare each row to the previous to see if the current row is in a different "group" to the previous row.
The condition would be something like:
[EmployeeID_Prev] IS NULL -- We have a new group if the previous row is null
OR [EmployeeID_Prev] <> [EmployeeID] -- Or if the previous row is for a different employee
OR [AbsenceID_Prev] <> ([AbsenceID]-1) -- Or if the AbsenceID is not sequential
You can then use this to join the CTE to itself to find the first row in each group with something like:
....
FROM [AbsenceStage] AS [Row]
INNER JOIN [AbsenceStage] AS [First]
ON ([First].[RN] = (
-- Get the closest row at or before this one ([RN] less than or equal to) that starts a grouping
SELECT MAX([RN]) FROM [AbsenceStage]
WHERE [RN] <= [Row].[RN] AND (
[EmployeeID_Prev] IS NULL
OR [EmployeeID_Prev] <> [EmployeeID]
OR [AbsenceID_Prev] <> ([AbsenceID]-1)
)
))
...
You can then GROUP BY the [First].[RN] which will now act like a group id and allow you to get the start and end date of each absence group.
SELECT
[Row].[EmployeeID]
,MIN([Row].[AbsenceDate]) AS [Absence_Begin]
,MAX([Row].[AbsenceDate]) AS [Absence_End]
...
-- FROM and INNER JOIN from above
...
GROUP BY [First].[RN], [Row].[EmployeeID];
You could then put all that into a view giving you the EmployeeID with the start and end date of each absence. You can then easily pull out the employees currently off with a:
WHERE CAST(CURRENT_TIMESTAMP AS date) BETWEEN [Absence_Begin] AND [Absence_End]
SQL Fiddle
Like another answer here, I'm going to create the leave intervals, but via a different method. First the code:
declare @today date = getdate(); --use whatever date here
with g as (
select *, dateadd(day, -1 * row_number() over (partition by employeeid order by absencedate), AbsenceDate) as group_number
from employeeabsence
) , leave_intervals as (
select employeeid, min(absencedate) as [start], max(absencedate) as [end]
from g
group by EmployeeID, group_number
)
select employeeid, [start], [end]
from leave_intervals
where @today between [start] and [end]
By way of explanation, we first put a date value into a variable. I chose today, but this code will work for any date passed in. Next, we create a common table expression (CTE) that will add on a grouping column to your table. This is the meat of the solution, so it bears some treatment. Within a given interval, the AbsenceDate increases at a rate of one day per row. row_number() also increases at a rate of one per row. So, if we subtract a row_number() number of days from the AbsenceDate, we'll get another (arbitrary) date. The key here is to realize that that arbitrary date will be the same for every row in the interval, so we can use it to group by. From there, it's just a matter of doing just that; get the min and max per interval. Lastly, we find what intervals contain @today.
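The date-minus-row-number trick described above can be sketched as follows. This is an illustration rather than the answer's own code: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for ROW_NUMBER), snake_case table and column names made up for the example, ISO date strings, and a fixed "today" of 2014-11-11 taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_absence "
    "(absence_id INTEGER, employee_id INTEGER, absence_date TEXT)"
)
conn.executemany(
    "INSERT INTO employee_absence VALUES (?, ?, ?)",
    [
        (100, 10, "2014-11-11"), (101, 10, "2014-11-12"), (102, 10, "2014-11-13"),
        (103, 10, "2014-11-14"), (104, 10, "2014-11-15"),
        (107, 21, "2014-11-11"), (108, 21, "2014-11-12"),
        (120, 5, "2014-11-11"), (130, 15, "2014-11-20"),
        (140, 10, "2015-03-01"), (141, 10, "2015-03-02"),
        (142, 10, "2015-03-03"), (143, 10, "2015-03-04"),
    ],
)

away_today = conn.execute(
    """
    WITH g AS (
        -- Subtracting the row number (in days) from each date gives the same
        -- arbitrary date for every row of one run of consecutive days.
        SELECT employee_id, absence_date,
               date(absence_date,
                    '-' || ROW_NUMBER() OVER (PARTITION BY employee_id
                                              ORDER BY absence_date) || ' days'
               ) AS group_key
        FROM employee_absence
    ),
    leave_intervals AS (
        SELECT employee_id,
               MIN(absence_date) AS start_date,
               MAX(absence_date) AS end_date
        FROM g
        GROUP BY employee_id, group_key
    )
    SELECT employee_id, start_date, end_date
    FROM leave_intervals
    WHERE :today BETWEEN start_date AND end_date
    ORDER BY employee_id
    """,
    {"today": "2014-11-11"},
).fetchall()

for row in away_today:
    print(row)
```

Employee 10's March 2015 absence lands in a separate group from the November one, so it is excluded when "today" is 2014-11-11, and employee 15 is not listed because that absence has not started yet.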

SQL Count for a Date Column

I have a table that contains a set of columns, one of which is a Date column.
I need to count how many values in that column fall in the same month, and return a result if the count for any one month is more than 3.
For example:
____________________
| DATE | .... |
---------------------
1998-09-02
1998-09-03
1998-10-03
1998-10-04
This must return no value, because no month has the necessary number of repetitions.
But this one does:
____________________
| DATE | .... |
---------------------
1998-09-02
1998-09-03
1998-09-12
1998-09-14
1998-10-02
1998-11-21
For the month of September.
This is for an Oracle DB.
SELECT
COUNT(date)
, TRUNC(DATE,'MON')
FROM TABLE
GROUP BY TRUNC(DATE,'MON')
HAVING COUNT(DATE) > 3
create table x (date_col date);
insert into x values (date '1998-09-02');
insert into x values (date '1998-09-03');
insert into x values (date '1998-09-12');
insert into x values (date '1998-09-14');
insert into x values (date '1998-10-02');
insert into x values (date '1998-11-21');
SELECT TRUNC(date_col,'MM'), count(*)
FROM x
GROUP BY TRUNC(date_col,'MM')
HAVING count(*) > 3;
So if more than 3 rows contain 1999-01-xx, you want that month returned?
SELECT YEAR(date), MONTH(date)
FROM table GROUP BY YEAR(date), MONTH(date)
HAVING COUNT(*) > 3
If you need all the rows that belong to the months found above, it should look something like this:
SELECT * FROM table
INNER JOIN (
SELECT YEAR(date) as y, MONTH(date) as m
FROM table GROUP BY YEAR(date), MONTH(date)
HAVING COUNT(*) > 3
) as virtualTable
ON virtualTable.y = YEAR(date) AND virtualTable.m = MONTH(date)
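A rough way to try the join idea above without a database server is sketched below. It is an assumption-laden illustration: it uses Python's built-in sqlite3 module, a table named events with a column d (both made up), and strftime('%Y-%m', ...) in place of the YEAR() and MONTH() pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (d TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [
        ("1998-09-02",), ("1998-09-03",), ("1998-09-12",),
        ("1998-09-14",), ("1998-10-02",), ("1998-11-21",),
    ],
)

rows_in_busy_months = conn.execute(
    """
    SELECT t.d
    FROM events t
    JOIN (
        -- months (year and month together) that occur more than 3 times
        SELECT strftime('%Y-%m', d) AS ym
        FROM events
        GROUP BY strftime('%Y-%m', d)
        HAVING COUNT(*) > 3
    ) AS busy
      ON busy.ym = strftime('%Y-%m', t.d)
    ORDER BY t.d
    """
).fetchall()

print([r[0] for r in rows_in_busy_months])
```

With the second sample from the question, only the four September 1998 rows come back, since no other month has more than 3 entries.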
This example will help :
create table d1
( event_date date, event_description varchar2(100));
insert into d1 values (sysdate,'Phone Call');
insert into d1 values (sysdate,'Letter');
insert into d1 values (sysdate-50,'Interview');
insert into d1 values (sysdate-50,'Dinner with parents');
insert into d1 values (sysdate-100,'Birthday');
insert into d1 values (sysdate-100,'Holiday');
insert into d1 values (sysdate-100,'Interview');
insert into d1 values (sysdate-100,'Phone Call');
commit;
select * from d1;
EVENT_DATE EVENT_DESCRIPTION
------------------------- -----------------------------------------------
04-MAR-10 14.47.58 Phone Call
04-MAR-10 14.47.58 Letter
13-JAN-10 14.47.58 Interview
13-JAN-10 14.47.58 Dinner with parents
24-NOV-09 14.47.58 Birthday
24-NOV-09 14.47.58 Holiday
24-NOV-09 14.47.58 Interview
24-NOV-09 14.47.58 Phone Call
8 rows selected
You can see that Nov-09 is the only month with more than 3 events.
Referring back to your original question, which was "And return if for one month, that count sums more than 3", the following SQL aggregate will work.
select trunc(event_date,'MONTH'),count('x') from d1
having count('x') > 3 group by trunc(event_date,'MONTH')
Alternatively, use to_char to convert the Date type to a Char with a MON-YYYY picture as follows:
select to_char(trunc(event_date,'MONTH'),'MON-YYYY') month,
count('x') no_of_occurances from d1 having count('x') > 3 group by trunc(event_date,'MONTH')
Ideally you should create a stored procedure that accepts the two criteria you need, Month (integer) and limit (integer),
as a parameterized procedure that executes the following:
SELECT MONTH(Date) AS TheMonth, COUNT(MONTH(Date)) AS TheMonthCount
FROM MyTable
GROUP BY MONTH(Date)
HAVING (COUNT(MONTH(Date)) > @limit) AND (MONTH(Date) = @month)
To also output the relevant month you could use the following:
SELECT CAST(YEAR(Date) AS NVARCHAR) + '.' +
CAST(MONTH(Date) AS NVARCHAR) AS 'The ',
MONTH(Date ) AS TheMonth, COUNT(MONTH(Date)) AS TheMonthCount
FROM Audit_Entry
GROUP BY MONTH(Date),
CAST(YEAR(Date) AS NVARCHAR) + '.' +
CAST(MONTH(Date) AS NVARCHAR)
HAVING (COUNT(MONTH(Date)) > @limit) AND (MONTH(Date) = @month)
This should work for MySQL and MSSQL:
SELECT MONTH(date), COUNT(*)
FROM table
GROUP BY MONTH(date)
HAVING COUNT(*) > 3
I am not sure which database you are using.
In MySQL, the query will be similar to the method proposed by @THEn.
On SQL Server you have other interesting possibilities.
Read this article for more details.
You could use Oracle's EXTRACT function:
select theMonth, sum(monthCount)
from (
select
extract(MONTH FROM t.theDateColumn) as theMonth,
1 as monthCount
from theTable t -- theTable is a placeholder for your table name
)
group by theMonth
having sum(monthCount) > 3
I don't have an Oracle database at hand at the moment, so this code may not work as is. I apologize for that.
Could be wrong, but a guess:
SELECT TRUNC(date, 'MM'), COUNT(*) FROM table
GROUP BY TRUNC(date, 'MM') HAVING COUNT(*) > 3