How to count consecutive days and group by year - sql

I want to count consecutive days (rows) and that is fairly easy (given all the answers to similar questions). But in my data set I have groups of consecutive rows with dates such as:
1. 30/12/2010
2. 31/12/2010
3. 01/01/2011
4. 02/01/2011
This looks like one group (4 consecutive days), but I would like to split it into two groups at the year boundary. So given:
1. 30/12/2010
2. 31/12/2010
3. 01/01/2011
4. 02/01/2011
5. 05/01/2011
6. 06/02/2011
7. 07/02/2011
I would like to see this grouped into four groups (not three):
Group 1:
1. 30/12/2010
2. 31/12/2010
Group 2:
3. 01/01/2011
4. 02/01/2011
Group 3:
5. 05/01/2011
Group 4:
6. 06/02/2011
7. 07/02/2011
I'm using SQL Server 2014

You can number your rows like this:
DECLARE @T TABLE(id INT, dt DATE);
INSERT INTO @T VALUES
(1, '2010-12-30'),
(2, '2010-12-31'),
(3, '2011-01-01'),
(4, '2011-01-02'),
(5, '2011-01-05'),
(6, '2011-02-06'),
(7, '2011-02-07');
WITH CTE1 AS (
    -- number the rows by date and extract the year
    SELECT *, YEAR(dt) AS temp_year, ROW_NUMBER() OVER (ORDER BY dt) AS temp_rownum
    FROM @T
), CTE2 AS (
    -- date minus row number stays constant within a run of consecutive days
    SELECT CTE1.*, DATEDIFF(DAY, temp_rownum, dt) AS temp_dategroup
    FROM CTE1
)
-- rows that share both year and date group receive the same rank
SELECT *, RANK() OVER (ORDER BY temp_year, temp_dategroup) AS final_rank
FROM CTE2
ORDER BY final_rank, dt
Result:
id  dt          temp_year  temp_rownum  temp_dategroup  final_rank
1   2010-12-30  2010       1            40539           1
2   2010-12-31  2010       2            40539           1
3   2011-01-01  2011       3            40539           3
4   2011-01-02  2011       4            40539           3
5   2011-01-05  2011       5            40541           5
6   2011-02-06  2011       6            40572           6
7   2011-02-07  2011       7            40572           6
It is possible to simplify the query, but I chose to display all columns so that it is easier to understand. The DATEDIFF trick was copied from this answer.
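To check the grouping logic outside the database, here is a minimal Python sketch of the same idea: the key (year, date minus row number) is constant within a run of consecutive days in one year. The function name `group_consecutive_by_year` is made up for this illustration, and the sketch assumes the dates are distinct.

```python
from datetime import date
from itertools import groupby


def group_consecutive_by_year(dates):
    """Split dates into runs of consecutive days that never cross a year boundary."""
    ordered = sorted(dates)
    # Same key as the query: (year, day number - row number).
    # Assumes distinct dates; duplicates would break the constant-difference property.
    keyed = [((d.year, d.toordinal() - i), d) for i, d in enumerate(ordered)]
    return [[d for _, d in run] for _, run in groupby(keyed, key=lambda kd: kd[0])]


dates = [
    date(2010, 12, 30), date(2010, 12, 31),
    date(2011, 1, 1), date(2011, 1, 2),
    date(2011, 1, 5),
    date(2011, 2, 6), date(2011, 2, 7),
]
groups = group_consecutive_by_year(dates)
for number, run in enumerate(groups, start=1):
    print(number, [d.isoformat() for d in run])
```

With the sample dates this yields four groups of sizes 2, 2, 1 and 2, matching the grouping asked for in the question.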

Related

How to select rows where logged in last month and logged min 1 time in one of month preceding August in Oracle SQL?

I have a table in Oracle SQL that holds client IDs and the date and time of each login to the application:
ID | LOGGED
----------------
11 | 2021-07-10 12:55:13.278
11 | 2021-08-10 13:58:13.211
11 | 2021-02-11 12:22:13.364
22 | 2021-01-10 08:34:13.211
33 | 2021-04-02 14:21:13.272
I need to select only those clients (ID) who have logged in minimum 1 time in the last month (August) and minimum 1 time in one month preceding August (June or July).
It is currently September, so...
I need clients who have logged in min 1 time in August
and min 1 time in July or June,
if logged in June -> not logged in July
if logged in July -> not logged in June
The result I need looks like this:
ID
----
11
How can I do that in Oracle SQL? Be aware that the column "LOGGED" holds a timestamp like: 2021-01-10 08:34:13.211
Maybe you could consider this:
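select id
from yourtable
group by id
-- at least one login in the previous month
having count(case months_between(trunc(sysdate,'MM'), trunc(logged,'MM'))
             when 1 then 1 end) >= 1
-- logins in exactly one of the two months before that (distinct months, not rows)
and count(distinct case
          when months_between(trunc(sysdate,'MM'), trunc(logged,'MM')) in (2,3)
          then trunc(logged,'MM') end) = 1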
I don't understand one thing.
You wrote:
minimum 1 time in one month preceding August (June or July)
and then:
if logged in June -> not logged in July
if logged in July -> not logged in June
If you need logins in EXACTLY one of those months (June or July, but not both), just use my query above.
If you need at least one login in June or July (or both), then:
select id
from yourtable
group by id
having count(case
months_between(trunc(sysdate,'MM'),
trunc(logged,'MM')
) when 1 then 1 end
) >= 1
and count
(case when
months_between(trunc(sysdate,'MM') ,
trunc(logged,'MM')
) in (2,3) then 1 end
) >= 1
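The two readings of the requirement can be compared on the sample data with a small Python sketch. It is only an illustration of the rule, not a replacement for the SQL: `months_back` and `qualifying_ids` are hypothetical helper names, "today" is fixed to a day in September 2021, and "exactly one month" is interpreted as logins in June or July but not both.

```python
from datetime import datetime


def months_back(ref, ts):
    """Whole calendar months between ts and ref (1 = previous month)."""
    return (ref.year - ts.year) * 12 + (ref.month - ts.month)


def qualifying_ids(logins, today, exactly_one_month=True):
    """IDs with a login last month and a login 2 or 3 months back."""
    seen = {}  # id -> set of month offsets with at least one login
    for client_id, ts in logins:
        seen.setdefault(client_id, set()).add(months_back(today, ts))
    result = []
    for client_id, offsets in seen.items():
        earlier = offsets & {2, 3}
        earlier_ok = len(earlier) == 1 if exactly_one_month else len(earlier) >= 1
        if 1 in offsets and earlier_ok:
            result.append(client_id)
    return sorted(result)


logins = [
    (11, datetime(2021, 7, 10, 12, 55, 13)),
    (11, datetime(2021, 8, 10, 13, 58, 13)),
    (11, datetime(2021, 2, 11, 12, 22, 13)),
    (22, datetime(2021, 1, 10, 8, 34, 13)),
    (33, datetime(2021, 4, 2, 14, 21, 13)),
]
today = datetime(2021, 9, 15)  # assumed "current" date for the example
print(qualifying_ids(logins, today))
print(qualifying_ids(logins, today, exactly_one_month=False))
```

On the question's data both readings select only ID 11; they differ only for clients who logged in during both June and July.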
Your question needs some clarification, but based on what you were describing I am seeing a couple of options.
The simplest one is probably a combination of data densification (to generate a row for every month for each id) and an analytic function (to enable inter-row calculations). Here's a simple example of this:
rem create a dummy table with some more data (you do not seem to worry about the exact timestamp)
drop table logs purge;
create table logs (ID number, LOGGED timestamp);
insert into logs values (11, to_timestamp('2021-07-10 12:55:13.278','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-07-11 12:55:13.278','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-08-10 13:58:13.211','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-02-11 12:22:13.364','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-04-11 12:22:13.364','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (22, to_timestamp('2021-01-10 08:34:13.211','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (33, to_timestamp('2021-04-02 14:21:13.272','yyyy-mm-dd HH24:MI:SS.FF'));
commit;
The following SQL densifies your data and lists the total count of logins for a month and for the previous month on the same row, so that you can do a comparative calculation. I have not done that part, but I am hoping you get the idea.
with t as
(-- dummy artificial table just to create a time dimension for densification
select distinct to_char(sysdate - rownum,'yyyy-mm') mon
from dual connect by level < 300),
l_sparse as
(-- aggregating your login info per month
select id, to_char(logged,'yyyy-mm') mon, count(*) cnt
from logs group by id, to_char(logged,'yyyy-mm') ),
l_dense as
(-- densification with partition outer join
select t.mon, l.id, cnt from l_sparse l partition by (id)
right outer join t on (l.mon = t.mon)
)
-- final analytical function to list current and previous row info in same record
select mon, id
, cnt
, lag(cnt) over (partition by id order by mon asc) prev_cnt
from l_dense
order by id, mon;
Parts of the result:
MON             ID        CNT   PREV_CNT
------- ---------- ---------- ----------
2020-12         11
2021-01         11
2021-02         11          2
2021-03         11                     2
2021-04         11          1
2021-05         11                     1
2021-06         11
2021-07         11          3
2021-08         11          2          3
2021-09         11                     2
2020-12         22
2021-01         22          2
2021-02         22                     2
2021-03         22
2021-04         22
...
You can see that for ID 11 the 2021-08 row has login counts for both the current and the previous month, so you can do the math on them. (That would require another subselect/WITH branch.)
Alternatives to this would be:
inter-row calculation plus time math between two logged timestamps
pattern matching
I did not drill into those, since there is not enough information about your real requirement.
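As a rough illustration of the densification-plus-LAG idea described above, the following Python sketch fills in every month for every id and keeps the previous month's count next to the current one. The month range (January to September 2021) and the variable names are assumptions made for this example; the login rows are the ones inserted into the dummy table.

```python
from collections import Counter

# (id, login date) pairs taken from the dummy table above; time of day is ignored
logins = [
    (11, "2021-07-10"), (11, "2021-07-11"), (11, "2021-08-10"),
    (11, "2021-02-11"), (11, "2021-04-11"),
    (22, "2021-01-10"), (33, "2021-04-02"),
]

# Time dimension used for densification (assumed range for the example)
months = [f"2021-{m:02d}" for m in range(1, 10)]

# Sparse aggregate: logins per id and month
sparse = Counter((client_id, day[:7]) for client_id, day in logins)

# Dense result: one row per id and month, with the previous month's count (like LAG)
dense = {}
for client_id in sorted({client_id for client_id, _ in logins}):
    prev = None
    for mon in months:
        cnt = sparse.get((client_id, mon))  # None when there was no login
        dense[(client_id, mon)] = (cnt, prev)
        prev = cnt

# Example comparison: ids with logins in both August and July
both = [cid for (cid, mon), (cnt, prev) in dense.items()
        if mon == "2021-08" and cnt and prev]
print(both)
```

The final filter is just one possible comparison; any rule that looks at a month and its predecessor can be written against the dense rows in the same way.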

SQL Server multiple partitions by month, day and hour

In SQL Server I have a table like below:
processName initDateTime
processA 2020-06-15 13:31:15.330
processB 2020-06-20 10:00:30.000
processA 2020-06-20 13:31:15.330
...
and so on
I need to group by processName and for each processName I need to get the number of records by month (#byMonth), day (#byDay) and hour (#byHour).
What is the best way to do it? Something as below? What would be the SQL query?
Possible results:
processName  Month    Day  Hour  #byMonth  #byDay  #byHour  #total(by process)
processA     January  15   17    4         3       2        7
processA     January  15   20    4         3       1        7
processA     January  20   05    4         2       3        7
processA     January  20   13    4         2       1        7
processA     March    04   05    3         2       3        7
processA     March    04   17    3         2       2        7
processA     March    15   05    3         3       3        7
...and so on for the rest of processes name
I think that you want aggregation and window functions:
select
processName,
month(initDateTime),
day(initDateTime),
datepart(hour, initDateTime),
sum(count(*)) over(partition by processName, year(initDateTime), month(initDateTime)) byMonth,
sum(count(*)) over(partition by processName, year(initDateTime), month(initDateTime), day(initDateTime)) byDay,
count(*) byHour
from mytable
group by
processName,
year(initDateTime),
month(initDateTime),
day(initDateTime),
datepart(hour, initDateTime)
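The counting logic of the query above (one row per process and hour, with the day and month totals repeated on each row) can be sketched in Python as follows. This is only a way to sanity-check expected numbers on the three sample rows from the question; the variable names are made up, and fractional seconds are dropped for brevity.

```python
from collections import Counter
from datetime import datetime

rows = [
    ("processA", datetime(2020, 6, 15, 13, 31, 15)),
    ("processB", datetime(2020, 6, 20, 10, 0, 30)),
    ("processA", datetime(2020, 6, 20, 13, 31, 15)),
]

# One counter per granularity, always keyed by process first
by_month = Counter((p, t.year, t.month) for p, t in rows)
by_day = Counter((p, t.date()) for p, t in rows)
by_hour = Counter((p, t.date(), t.hour) for p, t in rows)

# One output row per (process, day, hour), with the coarser totals attached
report = [
    (p, d.isoformat(), h, by_month[(p, d.year, d.month)], by_day[(p, d)], n)
    for (p, d, h), n in sorted(by_hour.items())
]
for line in report:
    print(line)
```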
Wherever possible, I like to return dates as dates to the caller, so that they can also treat them as dates for things such as sorting, converting to local time, or even making sure that the language shown is relevant. So if it were me, I would do the following:
-- sample data
CREATE TABLE #T (processName VARCHAR(50), initDateTime DATETIME)
INSERT #T (processName, initDateTime)
VALUES
('processA', '2020-06-15 13:31:15.330'),
('processB', '2020-06-20 10:00:30.000'),
('processA', '2020-06-20 13:31:15.330')
SELECT t.processName,
    i.InitHour,
    -- partition by process as well, so the counts are per processName
    ByMonth = SUM(COUNT(*)) OVER(PARTITION BY t.processName, i.InitMonth),
    ByDay = SUM(COUNT(*)) OVER(PARTITION BY t.processName, i.InitDay),
    ByHour = COUNT(*)
FROM #T AS t
CROSS APPLY
(   -- truncate the timestamp to the start of its hour, day and month
    SELECT InitHour = DATEADD(HOUR, DATEDIFF(HOUR, 0, initDateTime), 0),
        InitDay = DATEADD(DAY, DATEDIFF(DAY, 0, initDateTime), 0),
        InitMonth = DATEADD(MONTH, DATEDIFF(MONTH, 0, initDateTime), 0)
) AS i
GROUP BY t.processName, i.InitHour, i.InitDay, i.InitMonth;
Which returns:
processName  InitHour             ByMonth  ByDay  ByHour
--------------------------------------------------------------
processA     2020-06-15 13:00:00  2        1      1
processA     2020-06-20 13:00:00  2        1      1
processB     2020-06-20 10:00:00  1        1      1
If you need the day number, month name etc in SQL, you can get these using DATEPART or DATENAME, but as above, this is really better handled in the presentation layer, so you can deal with locales, or specific user settings.

How to aggregate 7 days in SQL

I was trying to aggregate 7 days of data at a time for FY13 (which starts on 10/1/2012 and ends on 9/30/2013) in SQL Server, but with no luck so far. Could someone please take a look? Below is my example data.
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
So, my desired output would be like:
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
Total 11 15
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
Total 13 16
--------through 9/30/2013
Please note, since FY13 starts on 10/1/2012 and ends on 9/30/2013, the first week of FY13 is 6 days instead of 7 days.
I am using SQL server 2008.
You could add a new computed column for the date values to group them by week and sum the other columns, something like this:
SELECT DATEPART(ww, DATEADD(d,-2,[DATE])) AS WEEK_NO,
SUM(Bread) AS Bread_Total, SUM(Milk) as Milk_Total
FROM YOUR_TABLE
GROUP BY DATEPART(ww, DATEADD(d,-2,[DATE]))
Note: I used DATEADD and subtracted 2 days to set the first day of the week to Monday based on your dates. You can modify this if required.
Use option with GROUP BY ROLLUP operator
SELECT CASE WHEN DATE IS NULL THEN 'Total' ELSE CONVERT(nvarchar(10), DATE, 101) END AS DATE,
SUM(BREAD) AS BREAD, SUM(MILK) AS MILK
FROM dbo.test54
GROUP BY ROLLUP(DATE),(DATENAME(week, DATE))
Demo on SQLFiddle
Result:
DATE BREAD MILK
10/01/2012 1 3
10/02/2012 2 4
10/03/2012 2 3
10/04/2012 0 4
10/05/2012 4 0
10/06/2012 2 1
Total 11 15
10/07/2012 1 3
10/08/2012 2 4
10/09/2012 2 3
10/10/2012 0 4
10/11/2012 4 0
10/12/2012 2 1
10/13/2012 2 1
Total 13 16
You are looking for a rollup. In this case, you will need at least one more column to group by for the rollup. The easiest way to get one is to add a computed column that groups the dates into weeks.
Take a look at: Summarizing Data Using ROLLUP
Here is the general idea of how it could be done:
You need a derived column for each row to determine which fiscal week that record belongs to. In general you could subtract that record's date from 10/1, get the number of days that have elapsed, divide by 7, and floor the result.
Then you can GROUP BY that derived column and use the SUM aggregate function.
The biggest wrinkle is that 6 day week you start with. You may have to add some logic to make sure that the weeks start on Sunday or whatever day you use but this should get you started.
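The derived fiscal-week column described above can be sketched in Python to check the arithmetic before writing the T-SQL. The sketch assumes that weeks start on Sunday (inferred from the desired output in the question, where the first week ends on Saturday 10/6) and counts days from the Sunday on or before the fiscal-year start, which is what makes the first week short. `fiscal_week` is a hypothetical helper name.

```python
from datetime import date, timedelta


def fiscal_week(d, fy_start):
    """1-based fiscal week number of d, with weeks starting on Sunday."""
    # Python's weekday(): Monday=0 ... Sunday=6, so this is the number of days since Sunday
    days_since_sunday = (fy_start.weekday() + 1) % 7
    first_sunday = fy_start - timedelta(days=days_since_sunday)
    return (d - first_sunday).days // 7 + 1


fy_start = date(2012, 10, 1)
# (day of October 2012, bread, milk) from the question's sample data
sample = [
    (1, 1, 3), (2, 2, 4), (3, 2, 3), (4, 0, 4), (5, 4, 0), (6, 2, 1),
    (7, 1, 3), (8, 2, 4), (9, 2, 3), (10, 0, 4), (11, 4, 0), (12, 2, 1), (13, 2, 1),
]

totals = {}
for day, bread, milk in sample:
    week = fiscal_week(date(2012, 10, day), fy_start)
    b, m = totals.get(week, (0, 0))
    totals[week] = (b + bread, m + milk)

print(totals)
```

For the sample rows this gives totals of 11 and 15 for the first (6-day) week and 13 and 16 for the second, the same totals as in the desired output.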
The WITH ROLLUP suggestions above can help; you'll need to save the data and transform it as you need.
The biggest thing you'll need to be able to do is identify your weeks properly. If you don't have those loaded into tables already so you can identify them, you can build them on the fly. Here's one way to do that:
CREATE TABLE #fy (fyear int, fstart datetime, fend datetime);
CREATE TABLE #fylist(fyyear int, fydate DATETIME, fyweek int);
INSERT INTO #fy
SELECT 2012, '2011-10-01', '2012-09-30'
UNION ALL
SELECT 2013, '2012-10-01', '2013-09-30';
INSERT INTO #fylist
( fyyear, fydate )
SELECT fyear, DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart)) AS fydate
FROM Common.NUMBERS
CROSS APPLY (SELECT * FROM #fy WHERE fyear = 2013) fy
WHERE fy.fend >= DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart));
WITH weekcalc AS
(
SELECT DISTINCT DATEPART(YEAR, fydate) yr, DATEPART(week, fydate) dt
FROM #fylist
),
ridcalc AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY yr, dt) AS rid, yr, dt
FROM weekcalc
)
UPDATE #fylist
SET fyweek = rid
FROM #fylist
JOIN ridcalc
ON DATEPART(YEAR, fydate) = yr
AND DATEPART(week, fydate) = dt;
SELECT list.fyyear, list.fyweek, p.[date], SUM(bread) AS Bread, SUM(Milk) AS Milk -- SUM, not COUNT, to total the quantities
FROM products p
JOIN #fylist list
ON p.[date] = list.fydate
GROUP BY list.fyyear, list.fyweek, p.[date] WITH ROLLUP;
The Common.Numbers reference above is a simple numbers table that I use for this sort of thing (goes from 1 to 1M). You could also build that on the fly as needed.

Fetch monthly records by total and by detail from single query in SQL Server 2005

I am working with SQL Server 2005.
I want to fetch monthlyTotalAppoinment and monthlyEmployeewiseTotal from the appointment table in a single result set.
Appointment Table
appoinmentId
appoinmentDate
employeeId
I can successfully fetch monthlyTotalAppoinment with the following query, and I also get the employee-wise total number of appointments, but what I want is the employee-wise number of appointments per month.
SELECT *
FROM (SELECT Datename(M, Dateadd(M, NUMBER - 1, 0)) AS month
FROM MASTER..SPT_VALUES
WHERE TYPE = 'p'
AND NUMBER BETWEEN 1 AND 12) months
LEFT JOIN (SELECT Datename(MM, APPOINMENTDATE) month,
Count(APPOINMENTID) AS TotalAppointment
FROM APPOINTMENT
GROUP BY Datename(MM, APPOINMENTDATE)) appoinment
ON months.MONTH = appoinment.MONTH
I am getting following output.
but I want following output
appoinementId employeeId appoinemntDate
------------- ----------- ---------------
1 4 8/25/2012 12:00:00 AM
2 4 8/25/2012 12:00:00 AM
3 4 8/25/2012 12:00:00 AM
4 4 8/25/2012 12:00:00 AM
5 4 8/25/2012 12:00:00 AM
6 4 9/25/2012 12:00:00 AM
7 2 9/25/2012 12:00:00 AM
8 2 9/25/2012 12:00:00 AM
9 2 9/25/2012 12:00:00 AM
10 4 9/25/2012 12:00:00 AM
11 4 10/25/2012 12:00:00 AM
12 2 10/25/2012 12:00:00 AM
13 4 10/25/2012 12:00:00 AM
For the above data, the output I am currently getting (for employeeId 4) is:
Month      MonthData  Totalappoinemnt  TotalEmployeewiseAppointmemt
---------  ---------  ---------------  ----------------------------
January..  NULL..     NULL..           NULL..
August     August     5                9
September  September  5                9
October    October    3                9
But I want the following:
Month      MonthData  Totalappoinemnt  TotalEmployeewiseAppointmemt
---------  ---------  ---------------  ----------------------------
January..  NULL..     NULL..           NULL..
August     August     5                5
September  September  5                2
October    October    3                2
I'm missing some minor points in your question, but the big issues are dealt with in this query:
SELECT t1.*,
t2.EMP_COUNT
FROM (SELECT Datename(MONTH, APPOINEMNTDATE) Month_Name,
Count(*) app_count
FROM APPOINTMENTTABLE
GROUP BY Datename(MONTH, APPOINEMNTDATE))T1
LEFT JOIN (SELECT Count(*) emp_count,
Datename(MONTH, APPOINEMNTDATE) Month_Name
FROM APPOINTMENTTABLE
WHERE EMPLOYEEID = 4
GROUP BY Datename(MONTH, APPOINEMNTDATE))T2
ON t1.MONTH_NAME = t2.MONTH_NAME
A working example can be found here.
What is missing?
Couldn't figure out why you had 2 columns for months. If there is a reason for this I'll revise the code.
I only listed months with details available. I saw that January was also in the example. If you want all 12 months to show even if no data is available, let me know and it will be added.
Didn't use the exact same column names. I'm sure you can change them if you need to :-)

Summarize days per month based on date ranges

Being primarily a C# developer, I'm scratching my head when trying to create a pure T-SQL based solution to a problem involving summarizing days/month given a set of date ranges.
I have a set of data looking something like this:
UserID Department StartDate EndDate
====== ========== ========== ==========
1 A 2011-01-02 2011-01-05
1 A 2011-01-20 2011-01-25
1 A 2011-02-25 2011-03-05
1 B 2011-01-21 2011-01-22
2 A 2011-01-01 2011-01-20
3 C 2011-01-01 2011-02-03
The date ranges are non-overlapping, may span several months, there may exist several ranges for a specific user and department within a single month.
What I would like to do is to summarize the number of days (inclusive) per user, department, year and month, like this (with reservations for any math errors in my example...):
UserID Department Year Month Days
====== ========== ==== ===== ====
1 A 2011 01 10
1 A 2011 02 4
1 A 2011 03 5
1 B 2011 01 2
2 A 2011 01 20
3 C 2011 01 31
3 C 2011 02 3
This data is going into a new table used by reporting tools.
I hope the problem description is clear enough, this is my first posting here, be gentle :-)
Thanks in advance!
Working sample
-- sample data in a table variable
declare @t table (UserID int, Department char(1), StartDate datetime, EndDate datetime)
insert @t select
1, 'A', '2011-01-02', '2011-01-05' union all select
1, 'A', '2011-01-20', '2011-01-25' union all select
1, 'A', '2011-02-25', '2011-03-05' union all select
1, 'B', '2011-01-21', '2011-01-22' union all select
2, 'A', '2011-01-01', '2011-01-20' union all select
3, 'C', '2011-01-01', '2011-02-03'
-- the query you need is below this line
-- joining to the numbers 0..N expands each range into one row per day (inclusive)
select UserID, Department,
    YEAR(StartDate+v.number) Year,
    MONTH(StartDate+v.number) Month, COUNT(*) Days
from @t t
inner join master..spt_values v
    on v.type='P' and v.number <= DATEDIFF(d, startdate, enddate)
group by UserID, Department, YEAR(StartDate+v.number), MONTH(StartDate+v.number)
order by UserID, Department, Year, Month
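The same expand-then-count technique can be checked against the expected numbers from the question with a short Python sketch: each range is expanded into its individual days (inclusive), and the days are counted per user, department, year and month. The variable names are chosen for this illustration only.

```python
from collections import Counter
from datetime import date, timedelta

# (UserID, Department, StartDate, EndDate) rows from the question
ranges = [
    (1, "A", date(2011, 1, 2), date(2011, 1, 5)),
    (1, "A", date(2011, 1, 20), date(2011, 1, 25)),
    (1, "A", date(2011, 2, 25), date(2011, 3, 5)),
    (1, "B", date(2011, 1, 21), date(2011, 1, 22)),
    (2, "A", date(2011, 1, 1), date(2011, 1, 20)),
    (3, "C", date(2011, 1, 1), date(2011, 2, 3)),
]

days = Counter()
for user, dept, start, end in ranges:
    # offsets 0..N play the role of the numbers table in the query
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        days[(user, dept, day.year, day.month)] += 1

for key in sorted(days):
    print(*key, days[key])
```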