Group Dates By Week: Open, Closed, Active - sql

Here is my scenario. I have a table like this:
------------------------------------------------------------------
| ticket | start date | finish date |
------------------------------------------------------------------
| 123 | 1 apr 12 | 20 apr 12 |
| 124 | 4 apr 12 | 28 apr 12 |
| 125 | 16 apr 12 | NULL |
| 126 | 28 apr 12 | 4 may 12 |
| 127 | 2 may 12 | NULL |
------------------------------------------------------------------
And I need to get a result set like this:
------------------------------------------------------------------
| week | opened | closed | active |
------------------------------------------------------------------
| 5 | 3 | 2 | 50 |
| 6 | 4 | 5 | 49 |
| 7 | 2 | 6 | 45 |
| 8 | 5 | 4 | 46 |
------------------------------------------------------------------
Basically, I want to see how many tickets were opened in a given week, how many were closed, and how many were simply active during that week (opened previously, and not closed yet).
I think I may have figured out how to derive the opened and closed, but I am really having trouble querying the active column out of this. Any ideas?
UPDATE: Adding info as requested.
This is for SQL.
The query I have thus far is like this, it doesn't include active yet:
SELECT
a.week,
a.opened, -- the subquery aliases this column as "opened"
b.closed
FROM
(
SELECT
DATEPART(WEEK, t.[start date]) AS week,
COUNT(t.[ticket]) AS opened
FROM sqltable AS t
WHERE [start date] > GETDATE() - 69 -- last 10 weeks
GROUP BY
DATEPART(WEEK, t.[start date])
) AS a
LEFT JOIN
(
SELECT
DATEPART(WEEK, t.[finish date]) AS week,
COUNT(t.[ticket]) AS closed
FROM sqltable AS t
WHERE [finish date] > GETDATE() - 69 -- last 10 weeks
GROUP BY
DATEPART(WEEK, t.[finish date])
) AS b
ON a.week = b.week
ORDER BY a.week;

In any database, you can do this with a correlated subquery. However, date functions are not the same across databases, so let me assume you know how to do this.
select weeknum, sum(startweek) as starts, sum(endweek) as ends,
(select count(*)
from t ts
where DATEPART(WEEK, ts.[start date]) <= weeknum and
(ts.[finish date] is null or -- still open, so active in every later week
DATEPART(WEEK, ts.[finish date]) >= weeknum)
) as actives
from ((select DATEPART(WEEK, t.[start date]) AS weeknum, 1 as startweek, 0 as endweek,
t.[start date] as startdate, NULL as enddate
from t
) union all
(select DATEPART(WEEK, t.[finish date]) AS weeknum, 0 as startweek, 1 as endweek,
NULL, t.[finish date] as enddate
from t
where t.[finish date] is not null -- open tickets have no closing week
)
) t
group by weeknum
order by 1
If you are using SQL Server 2012 (or Oracle), then you can also do this with cumulative sums.
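To make the cumulative-sum idea concrete, here is a minimal sketch rather than a definitive implementation. It runs from Python against an in-memory SQLite database as a stand-in for SQL Server, so `strftime('%W', ...)` replaces `DATEPART(WEEK, ...)`, and the table and column names (`tickets`, `start_date`, `finish_date`) are assumptions. It needs SQLite 3.25 or later for window functions. The "active" figure here means tickets still open at the end of each week, which is one possible reading of the question, and weeks with no events produce no row.

```python
import sqlite3

# Sample tickets from the question, with dates rewritten as ISO strings.
tickets = [
    (123, "2012-04-01", "2012-04-20"),
    (124, "2012-04-04", "2012-04-28"),
    (125, "2012-04-16", None),
    (126, "2012-04-28", "2012-05-04"),
    (127, "2012-05-02", None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (ticket INTEGER, start_date TEXT, finish_date TEXT)"
)
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", tickets)

query = """
WITH events AS (
    -- one row per opening and one row per closing, tagged with its week
    SELECT CAST(strftime('%W', start_date) AS INTEGER) AS week,
           1 AS opened, 0 AS closed
    FROM tickets
    UNION ALL
    SELECT CAST(strftime('%W', finish_date) AS INTEGER), 0, 1
    FROM tickets
    WHERE finish_date IS NOT NULL
),
weekly AS (
    SELECT week, SUM(opened) AS opened, SUM(closed) AS closed
    FROM events
    GROUP BY week
)
SELECT week, opened, closed,
       -- running total of (opened - closed): tickets still open at week end
       SUM(opened - closed) OVER (ORDER BY week) AS open_at_week_end
FROM weekly
ORDER BY week
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Grouping by week number alone mixes weeks from different years, so real data would also need the year in the grouping key.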

There is an easy way to do this using the IF function, adding 1 to the sum for each row where the condition is true:
SELECT
DATE_FORMAT(startdate, "%U") as week,
SUM(IF(startdate<>'', 1, 0)) as opened,
SUM(IF(enddate<>'', 1, 0)) as closed,
SUM(IF(startdate<>'' AND enddate<>'', 0, 1)) as active
FROM ticket
GROUP BY week;
The query is much simpler; I hope it helps.
I tried this in MySQL 5.6 on SQLFiddle.

Related

How to calculate occurrence depending on months/years

My table looks like that:
ID | Start | End
1 | 2010-01-02 | 2010-01-04
1 | 2010-01-22 | 2010-01-24
1 | 2011-01-31 | 2011-02-02
2 | 2012-05-02 | 2012-05-08
3 | 2013-01-02 | 2013-01-03
4 | 2010-09-15 | 2010-09-20
4 | 2010-09-30 | 2010-10-05
I'm looking for a way to count the number of occurrences for each ID in a Year per Month.
Importantly, if a record has an End date in the month following its Start date (within the same year), the occurrence should be counted for both months. For example, ID 1 in the 3rd row is such a case, so its occurrence should be +1 for January and +1 for February.
So I'd like to have it in this way:
Year | Month | Id | Occurrence
2010 | 01 | 1 | 2
2010 | 09 | 4 | 2
2010 | 10 | 4 | 1
2011 | 01 | 1 | 1
2011 | 02 | 1 | 1
2012 | 05 | 2 | 1
2013 | 01 | 3 | 1
I created only this for now...
CREATE TABLE IF NOT EXISTS counts AS
(SELECT
id,
YEAR (CAST(Start AS DATE)) AS Year_St,
MONTH (CAST(Start AS DATE)) AS Month_St,
YEAR (CAST(End AS DATE)) AS Year_End,
MONTH (CAST(End AS DATE)) AS Month_End
FROM source)
And I don't know how to move with that further. I'd appreciate your help.
I'm using Spark SQL.
Try the following strategy to achieve this:
Note:
I have created a few intermediate tables. If you wish, you can use subqueries or CTEs instead, depending on your permissions.
I have handled the two scenarios you described (whether to count a record as one occurrence or as two).
Query:
First, create a table with a flag that records whether the start and end dates fall in the same year and month (1 means yes, 2 means no):
/* Creating a table with flags whether to count the occurrences once or twice */
CREATE TABLE flagged as
(
SELECT *,
CASE
WHEN Year_st = Year_end and Month_st = Month_end then 1
WHEN Year_st = Year_end and Month_st <> Month_end then 2
Else 0
end as flag
FROM
(
SELECT
id,
YEAR (CAST(Start AS DATE)) AS Year_St,
MONTH (CAST(Start AS DATE)) AS Month_St,
YEAR (CAST(End AS DATE)) AS Year_End,
MONTH (CAST(End AS DATE)) AS Month_End
FROM source
) as calc
)
Now the flag in the above table is 1 if the year and month are the same for start and end, and 2 if the month differs. You can add more flag categories if you have more scenarios.
Secondly, counting the occurrences for flag 1. As we know year and month are same for flag 1, we can take either of it. I have taken start:
/* Counting occurrences only for flag 1 */
CREATE TABLE flg1 as (
SELECT distinct id, year_st, month_st, count(*) as occurrence
FROM flagged
where flag=1
GROUP BY id, year_st, month_st
)
Similarly, counting the occurrences for flag 2. Since month differs for both the dates, we can UNION them before counting to get both the dates in same column:
/* Counting occurrences only for flag 2 */
CREATE TABLE flg2 as
(
SELECT distinct id, year_dt, month_dt, count(*) as occurrence
FROM
(
select ID, year_st as year_dt, month_st as month_dt FROM flagged where flag=2
UNION ALL -- keep duplicates, so two ranges of one ID spanning the same months are both counted
SELECT ID, year_end as year_dt, month_end as month_dt FROM flagged where flag=2
) as unioned
GROUP BY id, year_dt, month_dt
)
Finally, we just have to SUM the occurrences from both the flags. Note that we use UNION ALL here to combine both the tables. This is very important because we need to count duplicates as well:
/* UNIONING both the final tables and summing the occurrences */
SELECT distinct year, month, id, SUM(occurrence) as occurrence
FROM
(
SELECT distinct id, year_st as year, month_st as month, occurrence
FROM flg1
UNION ALL
SELECT distinct id, year_dt as year, month_dt as month, occurrence
FROM flg2
) as fin_unioned
GROUP BY id, year, month
ORDER BY year, month, id, occurrence desc
The output of the above query is your expected output. I know this is not optimized, but it works. I will update the answer if I come across a better strategy. Comment if you have questions.
Not sure if this works in Spark SQL.
But if the ranges aren't bigger than 1 month, then just add the extra to the count via a UNION ALL.
And the extra are those with the end in a higher month than the start.
SELECT YearOcc, MonthOcc, Id
, COUNT(*) as Occurrence
FROM
(
SELECT Id
, YEAR(CAST(Start AS DATE)) as YearOcc
, MONTH(CAST(Start AS DATE)) as MonthOcc
FROM source
UNION ALL
SELECT Id
, YEAR(CAST(End AS DATE)) as YearOcc
, MONTH(CAST(End AS DATE)) as MonthOcc
FROM source
WHERE MONTH(CAST(Start AS DATE)) < MONTH(CAST(End AS DATE))
) q
GROUP BY YearOcc, MonthOcc, Id
ORDER BY YearOcc, MonthOcc, Id
YearOcc | MonthOcc | Id | Occurrence
------: | -------: | -: | ---------:
2010 | 1 | 1 | 2
2010 | 9 | 4 | 2
2010 | 10 | 4 | 1
2011 | 1 | 1 | 1
2011 | 2 | 1 | 1
2012 | 5 | 2 | 1
2013 | 1 | 3 | 1
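If the ranges can span more than two months, or cross a year boundary, one option is to join each range against a generated list of months. The following is only a sketch under stated assumptions: it uses Python with an in-memory SQLite database in place of Spark SQL, the columns are renamed to `start_date` and `end_date` (hypothetical names, chosen because `End` is a keyword), and the month list is built with a recursive CTE.

```python
import sqlite3

# Sample rows from the question: (id, start, end).
source_rows = [
    (1, "2010-01-02", "2010-01-04"),
    (1, "2010-01-22", "2010-01-24"),
    (1, "2011-01-31", "2011-02-02"),
    (2, "2012-05-02", "2012-05-08"),
    (3, "2013-01-02", "2013-01-03"),
    (4, "2010-09-15", "2010-09-20"),
    (4, "2010-09-30", "2010-10-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER, start_date TEXT, end_date TEXT)")
conn.executemany("INSERT INTO source VALUES (?, ?, ?)", source_rows)

query = """
WITH RECURSIVE months(month_start) AS (
    -- first day of every month between the earliest start and the latest end
    SELECT date(MIN(start_date), 'start of month') FROM source
    UNION ALL
    SELECT date(month_start, '+1 month')
    FROM months
    WHERE month_start < (SELECT date(MAX(end_date), 'start of month') FROM source)
)
SELECT strftime('%Y', m.month_start) AS year,
       strftime('%m', m.month_start) AS month,
       s.id,
       COUNT(*) AS occurrence
FROM months m
JOIN source s
  -- a range counts once for every month it touches
  ON m.month_start BETWEEN date(s.start_date, 'start of month')
                       AND date(s.end_date, 'start of month')
GROUP BY m.month_start, s.id
ORDER BY m.month_start, s.id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this gives the same seven rows as the expected output in the question, with year and month returned as text.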

How to perform group by in SQL Server for specific output

I have a table with a few records, and I want to get month-wise data along with a count on one of the columns. The output should contain the month and the counts of the IsRegistered flag.
Table structure
| Inserted On | IsRegistered |
+-------------+--------------+
| 10-01-2020 | 1 |
| 15-01-2020 | 1 |
| 17-01-2020 | null |
| 17-02-2020 | 1 |
| 21-02-2020 | null |
| 04-04-2020 | null |
| 18-04-2020 | null |
| 19-04-2020 | 1 |
Expected output
| Inserted On | Registered | Not Registered
+-------------+------------+---------------
| Jan | 2 | 1
| Feb | 1 | 1
| Apr | 1 | 2
I tried a normal GROUP BY but didn't get the desired output:
SELECT
DATENAME(MONTH, dateinserted) AS [MonthName], COUNT(ISRegistered)
FROM
tablename
GROUP BY
(DATENAME(MONTH, dateinserted))
Note: here null is treated as not registered
You can use aggregation. I would include the year and use the month number rather than name, so:
select year(inserted_on), month(inserted_on),
coalesce(sum(is_registered), 0) as num_registered,
sum(case when is_registered is null then 1 else 0 end) as num_not_registered
from tablename
group by year(inserted_on), month(inserted_on)
order by year(inserted_on), month(inserted_on);
Note: If you really want the monthname and want to combine data from different years (which seems unlikely, but . . . ), then you can use:
select datename(month, inserted_on),
coalesce(sum(is_registered), 0) as num_registered,
sum(case when is_registered is null then 1 else 0 end) as num_not_registered
from tablename
group by datename(month, inserted_on)
order by month(min(inserted_on));
The GROUP BY should include both the year and month (so there's no overlapping) as well as the DATENAME (for display). Something like this
drop table if exists #tablename;
go
create table #tablename(dateinserted date, ISRegistered int);
insert #tablename values
('2020-12-01', 0),
('2020-11-02', 1),
('2020-11-03', 1),
('2020-12-01', 1),
('2020-12-03', 1),
('2020-11-02', 0);
select year(dateinserted) yr,
datename(month, dateinserted) AS [MonthName],
sum(ISRegistered) Registered ,
sum(1-ISRegistered) [Not Registered]
from #tablename
group by year(dateinserted), month(dateinserted), datename(month, dateinserted)
order by year(dateinserted), month(dateinserted);
yr MonthName Registered Not Registered
2020 November 2 1
2020 December 2 1
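For reference, here is a small sketch of the conditional-aggregation approach run on the data from the question, where the flag is NULL rather than 0 for unregistered rows. It uses Python with an in-memory SQLite database as a stand-in for SQL Server; the table name `registrations`, the column names, and the ISO date format are assumptions. Anything other than 1 (NULL or 0) is treated as not registered.

```python
import sqlite3

# Rows from the question, with dd-mm-yyyy rewritten as ISO dates.
data = [
    ("2020-01-10", 1),
    ("2020-01-15", 1),
    ("2020-01-17", None),
    ("2020-02-17", 1),
    ("2020-02-21", None),
    ("2020-04-04", None),
    ("2020-04-18", None),
    ("2020-04-19", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registrations (inserted_on TEXT, is_registered INTEGER)")
conn.executemany("INSERT INTO registrations VALUES (?, ?)", data)

query = """
SELECT strftime('%Y-%m', inserted_on) AS year_month,
       -- a comparison with NULL is not true, so NULL falls into the ELSE branch
       SUM(CASE WHEN is_registered = 1 THEN 1 ELSE 0 END) AS registered,
       SUM(CASE WHEN is_registered = 1 THEN 0 ELSE 1 END) AS not_registered
FROM registrations
GROUP BY strftime('%Y-%m', inserted_on)
ORDER BY strftime('%Y-%m', inserted_on)
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Grouping on the year and month together keeps January 2020 separate from January of any other year.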

How to calculate number of Orders for given date range

I need some help writing a SQL query to count the number of active orders within a date range, grouped by month. For example, if the user selects 2018-01-01 to 2019-12-31, I have to show the number of active orders on a monthly basis, i.e. total 12 records.
I'm querying against Order Table whose schema looks like below
OrderID CustomerFirstName PurchaseDate OrderEndDate
1 XYZ 2018-01-01 9999-12-31
2 ABC 2018-02-02 2018-06-30
3 PQR 2018-06-01 2018-06-30
4 GHI 2018-01-01 2018-02-28
An OrderEndDate of 9999-12-31 marks a never-ending order. It is considered an existing order in all date ranges.
From my UX, if I select Jan to Dec, the results should be:
JAN ==> 2 orders
Feb ==> 3 orders => Order IDs are 1, 2, 4.
For February, Order IDs 1, 2 and 4 are considered active orders because
their end dates fall in or after February.
For example: Order ID 1 has an end date of 9999-12-31, which is never ending, so it is always active in every date range.
Order ID 2 has an end date of 2018-06-30, so it should be considered an active order for every month until June.
Order ID 4 has an end date of 2018-02-28, so it is an active order for February.
Expected Output
Month NoOfOrders
Jan 2
Feb 3
Create a year-month table (inspired from this answer) and join the Order table against it
DECLARE @DateFrom datetime, @DateTo datetime
SET @DateFrom = '2018-01-01'
SET @DateTo = '2018-12-31'
SELECT YearMonth, COUNT(o.OrderID) AS NoOfOrders -- count matched orders only, so an empty month shows 0
FROM (SELECT CONVERT(CHAR(4),DATEADD(MONTH, x.number, @DateFrom),120) + '-' + CONVERT(CHAR(2),DATEADD(MONTH, x.number, @DateFrom),110) As YearMonth,
CONVERT(DATE, CONVERT(CHAR(4),DATEADD(MONTH, x.number, @DateFrom),120) + '-' + Convert(CHAR(2),DATEADD(MONTH, x.number, @DateFrom),110) + '-01', 23) fulldate
FROM master.dbo.spt_values x
WHERE x.type = 'P'
AND x.number <= DATEDIFF(MONTH, @DateFrom, @DateTo)) YearMonthTbl
-- active in a month: purchased in or before that month and not ended before its first day
LEFT JOIN Orders o ON DATEDIFF(MONTH, o.PurchaseDate, fulldate) >= 0 AND fulldate <= o.OrderEndDate
GROUP BY YearMonth
ORDER BY YearMonth
I also included the year in the output, in case the input range crosses into a new year.
Here is the output for completeness
2018-01 2
2018-02 3
2018-03 2
2018-04 2
2018-05 2
2018-06 3
2018-07 1
2018-08 1
2018-09 1
2018-10 1
2018-11 1
2018-12 1
First Part - Handling records with orderenddate= '9999-12-31'
You can try the following. Adding an OR condition for orderenddate = '9999-12-31' makes sure that never-ending records appear in all searches, as long as the start date is within the boundary.
SELECT *
FROM [order]
WHERE purchasedate >= @startdate
AND ( orderenddate <= @enddate
OR orderenddate = '9999-12-31' )
Second Part :
sql query to capture number of active Orders between date range on
month wise grouping.
For month-wise grouping, you can try the following.
;WITH numbersequence( number )
AS (SELECT 1 AS Number
UNION ALL
SELECT number + 1
FROM numbersequence
WHERE number < 12)
SELECT Sum(ct) ActiveOrderCount,
number AS [month]
FROM (SELECT number,
CASE
WHEN c.number >= Month(purchasedate)
AND c.number <= Month(orderenddate) THEN 1
ELSE 0
END ct
FROM #order
CROSS JOIN numbersequence c
WHERE purchasedate >= @startdate
AND ( orderenddate <= @enddate
OR orderenddate = '9999-12-31' )) t
GROUP BY number
Output
+------------------+-------+
| ActiveOrderCount | Month |
+------------------+-------+
| 2 | 1 |
+------------------+-------+
| 3 | 2 |
+------------------+-------+
| 2 | 3 |
+------------------+-------+
| 2 | 4 |
+------------------+-------+
| 2 | 5 |
+------------------+-------+
| 3 | 6 |
+------------------+-------+
| 1 | 7 |
+------------------+-------+
| 1 | 8 |
+------------------+-------+
| 1 | 9 |
+------------------+-------+
| 1 | 10 |
+------------------+-------+
| 1 | 11 |
+------------------+-------+
| 1 | 12 |
+------------------+-------+
Assumption: the start date and end date fall in the same year. Otherwise you also need to add a condition on the year.
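When the selected range does cross a year boundary, comparing whole month-start dates instead of month numbers avoids the extra year condition. The sketch below is one way to do that, not the only one. It uses Python with an in-memory SQLite database in place of SQL Server, builds the month list with a recursive CTE, and assumes hypothetical snake_case names (`orders`, `purchase_date`, `order_end_date`) and a sample range of December 2017 to March 2018.

```python
import sqlite3

# Orders from the question: (id, customer, purchase date, end date).
orders = [
    (1, "XYZ", "2018-01-01", "9999-12-31"),
    (2, "ABC", "2018-02-02", "2018-06-30"),
    (3, "PQR", "2018-06-01", "2018-06-30"),
    (4, "GHI", "2018-01-01", "2018-02-28"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer TEXT, "
    "purchase_date TEXT, order_end_date TEXT)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)

query = """
WITH RECURSIVE months(month_start) AS (
    -- first day of each month in the selected range
    SELECT date(:date_from, 'start of month')
    UNION ALL
    SELECT date(month_start, '+1 month')
    FROM months
    WHERE month_start < date(:date_to, 'start of month')
)
SELECT strftime('%Y-%m', m.month_start) AS year_month,
       COUNT(o.order_id) AS active_orders  -- 0 for months without orders
FROM months m
LEFT JOIN orders o
  -- active: purchased in or before this month, not ended before its first day
  ON date(o.purchase_date, 'start of month') <= m.month_start
 AND o.order_end_date >= m.month_start
GROUP BY m.month_start
ORDER BY m.month_start
"""

params = {"date_from": "2017-12-01", "date_to": "2018-03-31"}
rows = conn.execute(query, params).fetchall()
for row in rows:
    print(row)
```

The never-ending order needs no special case here, because 9999-12-31 is simply later than every month in the range.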

How to add rolling 7 and 30 day columns to my daily count of distinct logins in SQL Server

I have half of my query, which outputs the total number of distinct users logging in to my website each day. But I need a third and fourth column to show the rolling week and month activity for my users.
DECLARE @StartDate AS Date = DATEADD(dd,-31,GETDATE())
SELECT CAST(ml.login AS Date) AS Date_Login
,COUNT(DISTINCT ml.email) AS Total
FROM database.members_log AS ml
WHERE 1=1
AND ml.login > @StartDate
GROUP BY CAST(ml.login AS Date)
ORDER BY CAST(ml.login AS Date) DESC
How could I extend my code to include 7-day and 30-day rolling counts of distinct users?
In other words: the number of unique users who logged in within a given amount of time (Daily, Last 7 days, Last 30 days)
Not sure if this is what you're going for, but you can use window functions for rolling totals/counts. For example, if you wanted to keep your report of count by day, but also count by rolling week and month, you could do something like the following (using an intermediate CTE):
declare @StartDate AS Date = DATEADD(day, -31, getdate());
WITH
-- this is your original query, with the ISO week and month number added.
members_log_aggr(login_date, year_nbr, iso_week_nbr, month_nbr, email_count) AS
(
SELECT
CAST(ml.login AS Date),
DATEPART(YEAR, ml.login),
DATEPART(ISO_WEEK, ml.login),
DATEPART(MONTH, ml.login),
COUNT(DISTINCT ml.email) AS Total
FROM members_log AS ml
WHERE
ml.login > @StartDate
GROUP BY
CAST(ml.login AS Date),
DATEPART(YEAR, ml.login),
DATEPART(ISO_WEEK, ml.login),
DATEPART(MONTH, ml.login)
)
-- here, we use window functions for a rolling total of email count.
SELECT *,
SUM(email_count) OVER
(
PARTITION BY year_nbr, iso_week_nbr
ORDER BY login_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS count_by_week,
SUM(email_count) OVER
(
PARTITION BY year_nbr, month_nbr
ORDER BY login_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as count_by_month
FROM members_log_aggr
giving you this data:
+------------+----------+--------------+-----------+-------------+---------------+----------------+
| login_date | year_nbr | iso_week_nbr | month_nbr | email_count | count_by_week | count_by_month |
+------------+----------+--------------+-----------+-------------+---------------+----------------+
| 2018-12-12 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-13 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-14 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-15 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-16 | 2018 | 50 | 12 | 2 | 6 | 7 |
| 2018-12-19 | 2018 | 51 | 12 | 1 | 1 | 7 |
| 2019-01-13 | 2019 | 2 | 1 | 2 | 2 | 3 |
| 2019-01-21 | 2019 | 4 | 1 | 1 | 1 | 3 |
+------------+----------+--------------+-----------+-------------+---------------+----------------+
A couple of additional notes:
Your original query has 1=1 in your WHERE clause. You don't need that.
There's no need to use abbreviations in your DATEADD function (or other date functions). For example, DATEADD(DAY, -31, GETDATE()) is clearer than, and just as performant as, DATEADD(DD, -31, GETDATE()).
It might be a good idea to replace GETDATE() with CURRENT_TIMESTAMP. They're the same function, but CURRENT_TIMESTAMP is a SQL standard.
Perhaps "Conditional aggregates" can be used for this (basically just put a case expression inside an aggregate function) e.g.
DECLARE @StartDate AS date = DATEADD( dd, -31, GETDATE() )
SELECT
CAST( ml.login AS date ) AS Date_Login
, COUNT( DISTINCT CASE
WHEN CAST( ml.login AS date ) >= DATEADD( dd, -7, CAST( GETDATE() AS date ) ) THEN ml.email
END ) AS in_week
, COUNT( DISTINCT ml.email ) AS Total
FROM dbo.members_log AS ml
WHERE 1 = 1
AND ml.login > @StartDate
GROUP BY
CAST( ml.login AS date )
ORDER BY
CAST( ml.login AS date ) DESC
But as you are already filtering for just the past 31 days, I'm not sure what you mean by "rolling" week or "rolling" month.
count(distinct) is quite tricky -- particularly for rolling averages. If you are really looking for the unique users over a time span (rather than just the average of the daily unique visitors), then I think apply may be the simplest approach:
with d as (
select cast(ml.login AS Date) AS Date_Login,
count(distinct ml.email) AS Total
from database.members_log ml
where ml.login > @StartDate
group by CAST(ml.login AS Date)
)
select t.date_login, t.total, t7.total_7d, t30.total_30d
from d t outer apply -- the CTE is named d; alias it as t for the references below
(select count(distinct ml2.email) as total_7d
from database.members_log ml2
where ml2.login < dateadd(day, 1, t.date_login) and
ml2.login >= dateadd(day, -6, t.date_login)
) t7 outer apply
(select count(distinct ml2.email) as total_30d
from database.members_log ml2
where ml2.login < dateadd(day, 1, t.date_login) and
ml2.login >= dateadd(day, -29, t.date_login)
) t30
order by date_login desc;
The date arithmetic is my best understanding of what you mean by the rolling windows. Each window includes the current day and the 6 (or 29) days before it.
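Here is a small runnable sketch of the same idea, so the difference between summing daily counts and counting distinct users over a window is visible. It is not the asker's schema: it uses Python with an in-memory SQLite database, correlated subqueries in place of OUTER APPLY, and made-up login rows. The window is assumed to be the current day plus the 6 (or 29) days before it.

```python
import sqlite3

# Hypothetical login rows: (email, login timestamp).
logins = [
    ("a@x.test", "2019-01-01 08:00:00"),
    ("b@x.test", "2019-01-01 09:00:00"),
    ("a@x.test", "2019-01-03 10:00:00"),
    ("c@x.test", "2019-01-08 11:00:00"),
    ("a@x.test", "2019-01-08 12:00:00"),
    ("b@x.test", "2019-02-05 07:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members_log (email TEXT, login TEXT)")
conn.executemany("INSERT INTO members_log VALUES (?, ?)", logins)

query = """
WITH days AS (
    SELECT DISTINCT date(login) AS login_date FROM members_log
)
SELECT d.login_date,
       -- distinct users on the day itself
       (SELECT COUNT(DISTINCT m.email) FROM members_log m
         WHERE date(m.login) = d.login_date) AS total,
       -- distinct users over the day and the 6 days before it
       (SELECT COUNT(DISTINCT m.email) FROM members_log m
         WHERE date(m.login) BETWEEN date(d.login_date, '-6 days')
                                 AND d.login_date) AS total_7d,
       -- distinct users over the day and the 29 days before it
       (SELECT COUNT(DISTINCT m.email) FROM members_log m
         WHERE date(m.login) BETWEEN date(d.login_date, '-29 days')
                                 AND d.login_date) AS total_30d
FROM days d
ORDER BY d.login_date DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A user who logs in on several days inside a window is counted once for that window, which a SUM over the daily totals cannot guarantee.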

Get Monthly Totals from Running Totals

I have a table in a SQL Server 2008 database with two columns that hold running totals called Hours and Starts. Another column, Date, holds the date of a record. The dates are sporadic throughout any given month, but there's always a record for the last hour of the month.
For example:
ContainerID | Date | Hours | Starts
1 | 2010-12-31 23:59 | 20 | 6
1 | 2011-01-15 00:59 | 23 | 6
1 | 2011-01-31 23:59 | 30 | 8
2 | 2010-12-31 23:59 | 14 | 2
2 | 2011-01-18 12:59 | 14 | 2
2 | 2011-01-31 23:59 | 19 | 3
How can I query the table to get the total number of hours and starts for each month between two specified years? (In this case 2011 and 2013.) I know that I need to take the values from the last record of one month and subtract the values from the last record of the previous month. I'm having a hard time coming up with a good way to do this in SQL, however.
As requested, here are the expected results:
ContainerID | Date | MonthlyHours | MonthlyStarts
1 | 2011-01-31 23:59 | 10 | 2
2 | 2011-01-31 23:59 | 5 | 1
Try this:
SELECT c1.ContainerID,
c1.Date,
c1.Hours-c3.Hours AS "MonthlyHours",
c1.Starts - c3.Starts AS "MonthlyStarts"
FROM Containers c1
LEFT OUTER JOIN Containers c2 ON
c1.ContainerID = c2.ContainerID
AND datediff(MONTH, c1.Date, c2.Date)=0
AND c2.Date > c1.Date
LEFT OUTER JOIN Containers c3 ON
c1.ContainerID = c3.ContainerID
AND datediff(MONTH, c1.Date, c3.Date)=-1
LEFT OUTER JOIN Containers c4 ON
c3.ContainerID = c4.ContainerID
AND datediff(MONTH, c3.Date, c4.Date)=0
AND c4.Date > c3.Date
WHERE
c2.ContainerID is null
AND c4.ContainerID is null
AND c3.ContainerID is not null
ORDER BY c1.ContainerID, c1.Date
Using a recursive CTE and a somewhat 'creative' JOIN condition, you can fetch the next month's value for each ContainerID:
WITH CTE_PREP AS
(
--RN will be 1 for last row in each month for each container
--MonthRank will be sequential number for each subsequent month (to increment easier)
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY ContainerID, YEAR(Date), MONTH(DATE) ORDER BY Date DESC) RN
,DENSE_RANK() OVER (ORDER BY YEAR(Date),MONTH(Date)) MonthRank
FROM Table1
)
, RCTE AS
(
--"Zero row", last row in December 2010 for each container
SELECT *, Hours AS MonthlyHours, Starts AS MonthlyStarts
FROM CTE_Prep
WHERE YEAR(date) = 2010 AND MONTH(date) = 12 AND RN = 1
UNION ALL
--for each next row just join on MonthRank + 1
SELECT t.*, t.Hours - r.Hours, t.Starts - r.Starts
FROM RCTE r
INNER JOIN CTE_Prep t ON r.ContainerID = t.ContainerID AND r.MonthRank + 1 = t.MonthRank AND t.Rn = 1
)
SELECT ContainerID, Date, MonthlyHours, MonthlyStarts
FROM RCTE
WHERE Date >= '2011-01-01' --to eliminate "zero row"
ORDER BY ContainerID
SQLFiddle DEMO (I have added some data for February and March in order to test on different lengths of months)
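A non-recursive variant of the same subtraction can be sketched with a plain self-join: pick the last reading per container per month, then join each month-end row to the previous month's month-end row. This is only an illustration under assumptions. It runs from Python against an in-memory SQLite database rather than SQL Server 2008, and the names `containers`, `container_id` and `reading_date` are hypothetical.

```python
import sqlite3

# Readings from the question: (container, timestamp, hours, starts).
readings = [
    (1, "2010-12-31 23:59", 20, 6),
    (1, "2011-01-15 00:59", 23, 6),
    (1, "2011-01-31 23:59", 30, 8),
    (2, "2010-12-31 23:59", 14, 2),
    (2, "2011-01-18 12:59", 14, 2),
    (2, "2011-01-31 23:59", 19, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE containers (container_id INTEGER, reading_date TEXT, "
    "hours INTEGER, starts INTEGER)"
)
conn.executemany("INSERT INTO containers VALUES (?, ?, ?, ?)", readings)

query = """
WITH month_end AS (
    -- keep only the latest reading of each container in each month
    SELECT c.container_id,
           strftime('%Y-%m', c.reading_date) AS ym,
           c.reading_date, c.hours, c.starts
    FROM containers c
    WHERE c.reading_date = (
        SELECT MAX(x.reading_date)
        FROM containers x
        WHERE x.container_id = c.container_id
          AND strftime('%Y-%m', x.reading_date) = strftime('%Y-%m', c.reading_date)
    )
)
SELECT cur.container_id,
       cur.reading_date,
       cur.hours - prev.hours   AS monthly_hours,
       cur.starts - prev.starts AS monthly_starts
FROM month_end cur
JOIN month_end prev
  ON prev.container_id = cur.container_id
  -- previous month = the month containing the day before this month starts
 AND prev.ym = strftime('%Y-%m', date(cur.reading_date, 'start of month', '-1 day'))
ORDER BY cur.container_id, cur.reading_date
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The inner join drops the December 2010 rows automatically, since they have no earlier month to subtract from.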