I need help writing a SQL query that counts the number of active orders within a date range, grouped by month. For example, if the user selects 2018-01-01 to 2018-12-31, I have to show the number of active orders for each month, i.e. 12 records in total.
I'm querying an Order table whose schema looks like this:
OrderID | CustomerFirstName | PurchaseDate | OrderEndDate
1 | XYZ | 2018-01-01 | 9999-12-31
2 | ABC | 2018-02-02 | 2018-06-30
3 | PQR | 2018-06-01 | 2018-06-30
4 | GHI | 2018-01-01 | 2018-02-28
An OrderEndDate of 9999-12-31 marks a never-ending order. It should be counted as an existing order in all date ranges.
In my UX, if I select Jan to Dec, the results should be:
JAN ==> 2 orders
Feb ==> 3 orders => Order IDs 1, 2, 4.
For February, Order IDs 1, 2 and 4 are considered active because they were purchased in or before February and their end dates fall in February or later.
For example: Order ID 1 has an end date of 9999-12-31, which is never ending, so it is active throughout the date range.
Order ID 2 has an end date of 2018-06-30, so it should be counted as active for every month through June.
Order ID 4 has an end date of 2018-02-28, so it is still an active order in February.
Expected Output
Month NoOfOrders
Jan 2
Feb 3
Create a year-month table (inspired by this answer) and join the Order table against it
DECLARE @DateFrom datetime, @DateTo datetime
SET @DateFrom = '2018-01-01'
SET @DateTo = '2018-12-31'
SELECT YearMonth, COUNT(o.OrderID) AS NoOfOrders  -- count matched orders only, so a month without orders shows 0
FROM (-- one row per month in the range: a 'yyyy-mm' label plus the first day of that month
SELECT CONVERT(CHAR(4),DATEADD(MONTH, x.number, @DateFrom),120) + '-' + CONVERT(CHAR(2),DATEADD(MONTH, x.number, @DateFrom),110) AS YearMonth,
CONVERT(DATE, CONVERT(CHAR(4),DATEADD(MONTH, x.number, @DateFrom),120) + '-' + CONVERT(CHAR(2),DATEADD(MONTH, x.number, @DateFrom),110) + '-01', 23) AS fulldate
FROM master.dbo.spt_values x
WHERE x.type = 'P'
AND x.number <= DATEDIFF(MONTH, @DateFrom, @DateTo)) YearMonthTbl
-- active in a month = purchased before the next month starts and not ended before this month starts
-- (comparing YEAR() and MONTH() separately would break once the range crosses a year boundary)
LEFT JOIN Orders o ON o.PurchaseDate < DATEADD(MONTH, 1, fulldate) AND fulldate <= o.OrderEndDate
GROUP BY YearMonth
ORDER BY YearMonth
I decided to also include the year in the output in case the input range crosses into a new year.
Here is the output for completeness:
2018-01 2
2018-02 3
2018-03 2
2018-04 2
2018-05 2
2018-06 3
2018-07 1
2018-08 1
2018-09 1
2018-10 1
2018-11 1
2018-12 1
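For anyone who wants to check the overlap rule without a SQL Server instance, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. It is an illustration under assumptions, not the answer's exact query: SQLite stands in for SQL Server, a recursive CTE replaces master.dbo.spt_values, and the function name active_orders_per_month is made up for this example.

```python
import sqlite3

# In-memory stand-in for the Orders table from the question (dates as ISO text).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER, PurchaseDate TEXT, OrderEndDate TEXT);
INSERT INTO Orders VALUES
  (1, '2018-01-01', '9999-12-31'),
  (2, '2018-02-02', '2018-06-30'),
  (3, '2018-06-01', '2018-06-30'),
  (4, '2018-01-01', '2018-02-28');
""")

def active_orders_per_month(conn, date_from, date_to):
    """Return (year_month, count) rows, one per month between the two dates."""
    sql = """
    WITH RECURSIVE months(month_start) AS (
        SELECT date(:date_from, 'start of month')
        UNION ALL
        SELECT date(month_start, '+1 month') FROM months
        WHERE date(month_start, '+1 month') <= :date_to
    )
    SELECT strftime('%Y-%m', m.month_start) AS year_month,
           COUNT(o.OrderID) AS no_of_orders
    FROM months m
    LEFT JOIN Orders o
           -- purchased before the next month starts ...
           ON o.PurchaseDate < date(m.month_start, '+1 month')
           -- ... and not ended before this month starts
          AND o.OrderEndDate >= m.month_start
    GROUP BY m.month_start
    ORDER BY m.month_start
    """
    return conn.execute(sql, {"date_from": date_from, "date_to": date_to}).fetchall()

result = active_orders_per_month(conn, "2018-01-01", "2018-12-31")
for row in result:
    print(row)  # first rows: ('2018-01', 2), ('2018-02', 3), ...
```

Because the join compares whole dates rather than separate year and month numbers, the same function also handles a range such as 2018-11-01 to 2019-02-28.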
First part: handling records with orderenddate = '9999-12-31'
You can try something like the following. Adding an OR condition for orderenddate = '9999-12-31' makes sure that never-ending records appear in every search, as long as the start date is within the boundary.
SELECT *
FROM [order]
WHERE purchasedate >= @startdate
AND ( orderenddate <= @enddate
OR orderenddate = '9999-12-31' )
Second part:
"sql query to capture number of active Orders between date range on month wise grouping."
For month-wise grouping, you can try something like the following.
;WITH numbersequence( number )
AS (SELECT 1 AS Number
UNION ALL
SELECT number + 1
FROM numbersequence
WHERE number < 12) -- recursive CTE producing the month numbers 1..12
SELECT Sum(ct) ActiveOrderCount,
number AS [month]
FROM (SELECT number,
CASE
WHEN c.number >= Month(purchasedate)
AND c.number <= Month(orderenddate) THEN 1
ELSE 0
END ct
FROM #order
CROSS JOIN numbersequence c
WHERE purchasedate >= @startdate
AND ( orderenddate <= @enddate
OR orderenddate = '9999-12-31' )) t
GROUP BY number
Online Demo
Output
+------------------+-------+
| ActiveOrderCount | Month |
+------------------+-------+
| 2 | 1 |
+------------------+-------+
| 3 | 2 |
+------------------+-------+
| 2 | 3 |
+------------------+-------+
| 2 | 4 |
+------------------+-------+
| 2 | 5 |
+------------------+-------+
| 3 | 6 |
+------------------+-------+
| 1 | 7 |
+------------------+-------+
| 1 | 8 |
+------------------+-------+
| 1 | 9 |
+------------------+-------+
| 1 | 10 |
+------------------+-------+
| 1 | 11 |
+------------------+-------+
| 1 | 12 |
+------------------+-------+
Assumption: the start date and end date fall within the same year. Otherwise you also need to add a year condition.
Related
My table looks like that:
ID | Start | End
1 | 2010-01-02 | 2010-01-04
1 | 2010-01-22 | 2010-01-24
1 | 2011-01-31 | 2011-02-02
2 | 2012-05-02 | 2012-05-08
3 | 2013-01-02 | 2013-01-03
4 | 2010-09-15 | 2010-09-20
4 | 2010-09-30 | 2010-10-05
I'm looking for a way to count the number of occurrences for each ID in a Year per Month.
Importantly, if a record's End date falls in the month after its Start date (within the same year, of course), the occurrence should be counted for both months. For example, ID 1 in the 3rd row is such a case, so this ID should get +1 for January and +1 for February.
So I'd like to have it in this way:
Year | Month | Id | Occurrence
2010 | 01 | 1 | 2
2010 | 09 | 4 | 2
2010 | 10 | 4 | 1
2011 | 01 | 1 | 1
2011 | 02 | 1 | 1
2012 | 05 | 2 | 1
2013 | 01 | 3 | 1
I created only this for now...
CREATE TABLE IF NOT EXISTS counts AS
(SELECT
id,
YEAR (CAST(Start AS DATE)) AS Year_St,
MONTH (CAST(Start AS DATE)) AS Month_St,
YEAR (CAST(End AS DATE)) AS Year_End,
MONTH (CAST(End AS DATE)) AS Month_End
FROM source)
I don't know how to proceed from here. I'd appreciate your help.
I'm using Spark SQL.
Try the following strategy to achieve this:
Note:
I have created a few intermediate tables. If you prefer, you can use subqueries or CTEs instead, depending on your permissions.
I have handled the two scenarios you described (whether to count a record as 1 occurrence or 2 occurrences).
Query:
First, create a table with a flag that records whether the start and end dates fall in the same year and month (1 means same year and month, 2 means same year but different months, 0 otherwise):
/* Creating a table with flags whether to count the occurrences once or twice */
CREATE TABLE flagged as
(
SELECT *,
CASE
WHEN Year_st = Year_end and Month_st = Month_end then 1
WHEN Year_st = Year_end and Month_st <> Month_end then 2
Else 0
end as flag
FROM
(
SELECT
id,
YEAR (CAST(Start AS DATE)) AS Year_St,
MONTH (CAST(Start AS DATE)) AS Month_St,
YEAR (CAST(End AS DATE)) AS Year_End,
MONTH (CAST(End AS DATE)) AS Month_End
FROM source
) as calc
)
Now the flag in the above table is 1 if the year and month are the same for start and end, and 2 if the month differs. You can add more flag categories if you have more scenarios.
Second, count the occurrences for flag 1. Since the year and month are the same for flag 1, we can take either of them. I have taken the start:
/* Counting occurrences only for flag 1 */
CREATE TABLE flg1 as (
SELECT distinct id, year_st, month_st, count(*) as occurrence
FROM flagged
where flag=1
GROUP BY id, year_st, month_st
)
Similarly, count the occurrences for flag 2. Since the month differs between the two dates, we can combine them with UNION ALL before counting to get both dates in the same column. UNION ALL matters here, because a plain UNION would collapse two records of the same ID that span the same pair of months into one:
/* Counting occurrences only for flag 2 */
CREATE TABLE flg2 as
(
SELECT distinct id, year_dt, month_dt, count(*) as occurrence
FROM
(
select ID, year_st as year_dt, month_st as month_dt FROM flagged where flag=2
UNION ALL
SELECT ID, year_end as year_dt, month_end as month_dt FROM flagged where flag=2
) as unioned
GROUP BY id, year_dt, month_dt
)
Finally, we just have to SUM the occurrences from both the flags. Note that we use UNION ALL here to combine both the tables. This is very important because we need to count duplicates as well:
/* UNIONING both the final tables and summing the occurrences */
SELECT distinct year, month, id, SUM(occurrence) as occurrence
FROM
(
SELECT distinct id, year_st as year, month_st as month, occurrence
FROM flg1
UNION ALL
SELECT distinct id, year_dt as year, month_dt as month, occurrence
FROM flg2
) as fin_unioned
GROUP BY id, year, month
ORDER BY year, month, id, occurrence desc
The output of the above query will be your expected output. I know this is not optimized, but it works. I will update this if I come across a better strategy. Comment if you have questions.
db<>fiddle link here
I'm not sure whether this works in Spark SQL.
But if no range crosses more than one month boundary, you can just add the extra occurrences to the count via a UNION ALL.
The extra rows are those whose End falls in a later month than their Start.
SELECT YearOcc, MonthOcc, Id
, COUNT(*) as Occurrence
FROM
(
SELECT Id
, YEAR(CAST(Start AS DATE)) as YearOcc
, MONTH(CAST(Start AS DATE)) as MonthOcc
FROM source
UNION ALL
SELECT Id
, YEAR(CAST(End AS DATE)) as YearOcc
, MONTH(CAST(End AS DATE)) as MonthOcc
FROM source
WHERE MONTH(CAST(Start AS DATE)) < MONTH(CAST(End AS DATE))
) q
GROUP BY YearOcc, MonthOcc, Id
ORDER BY YearOcc, MonthOcc, Id
YearOcc | MonthOcc | Id | Occurrence
------: | -------: | -: | ---------:
2010 | 1 | 1 | 2
2010 | 9 | 4 | 2
2010 | 10 | 4 | 1
2011 | 1 | 1 | 1
2011 | 2 | 1 | 1
2012 | 5 | 2 | 1
2013 | 1 | 3 | 1
db<>fiddle here
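If ranges can span more than two months, or cross a year boundary, one way to think about the problem is to expand every record into each calendar month it touches and then count. Here is a rough sketch of that idea in plain Python, using the sample rows from the question. The helper names months_touched and occurrences are made up for illustration. In Spark SQL the same expansion could presumably be expressed with sequence and explode, but that is untested here.

```python
from collections import Counter
from datetime import date

# (ID, Start, End) rows copied from the question.
rows = [
    (1, date(2010, 1, 2), date(2010, 1, 4)),
    (1, date(2010, 1, 22), date(2010, 1, 24)),
    (1, date(2011, 1, 31), date(2011, 2, 2)),
    (2, date(2012, 5, 2), date(2012, 5, 8)),
    (3, date(2013, 1, 2), date(2013, 1, 3)),
    (4, date(2010, 9, 15), date(2010, 9, 20)),
    (4, date(2010, 9, 30), date(2010, 10, 5)),
]

def months_touched(start, end):
    """Yield (year, month) for every calendar month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1

def occurrences(rows):
    """Count, per (year, month, id), how many records touch that month."""
    counts = Counter()
    for id_, start, end in rows:
        for year, month in months_touched(start, end):
            counts[(year, month, id_)] += 1
    return sorted(counts.items())

for (year, month, id_), n in occurrences(rows):
    print(year, month, id_, n)
```

For the sample data this gives the same seven rows as the expected output in the question.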
I have half of my query, which outputs the total distinct users logging in to my website for each day. But I need a third and fourth column showing the rolling weekly and monthly activity of my users.
DECLARE @StartDate AS Date = DATEADD(dd,-31,GETDATE())
SELECT CAST(ml.login AS Date) AS Date_Login
,COUNT(DISTINCT ml.email) AS Total
FROM database.members_log AS ml
WHERE 1=1
AND ml.login > @StartDate
GROUP BY CAST(ml.login AS Date)
ORDER BY CAST(ml.login AS Date) DESC
How could I extend my code to include 7-day and 30-day rolling counts of distinct users?
In other words: the number of unique users who logged in within a given period (daily, last 7 days, last 30 days).
Not sure if this is what you're going for, but you can use window functions for rolling totals/counts. For example, if you wanted to keep your report of count by day, but also count by rolling week and month, you could do something like the following (using an intermediate CTE):
declare @StartDate AS Date = DATEADD(day, -31, getdate());
WITH
-- this is your original query, with the ISO week and month number added.
members_log_aggr(login_date, year_nbr, iso_week_nbr, month_nbr, email_count) AS
(
SELECT
CAST(ml.login AS Date),
DATEPART(YEAR, ml.login),
DATEPART(ISO_WEEK, ml.login),
DATEPART(MONTH, ml.login),
COUNT(DISTINCT ml.email) AS Total
FROM members_log AS ml
WHERE
ml.login > @StartDate
GROUP BY
CAST(ml.login AS Date),
DATEPART(YEAR, ml.login),
DATEPART(ISO_WEEK, ml.login),
DATEPART(MONTH, ml.login)
)
-- here, we use window functions for a rolling total of email count.
SELECT *,
SUM(email_count) OVER
(
PARTITION BY year_nbr, iso_week_nbr
ORDER BY login_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS count_by_week,
SUM(email_count) OVER
(
PARTITION BY year_nbr, month_nbr
ORDER BY login_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as count_by_month
FROM members_log_aggr
giving you this data:
+------------+----------+--------------+-----------+-------------+---------------+----------------+
| login_date | year_nbr | iso_week_nbr | month_nbr | email_count | count_by_week | count_by_month |
+------------+----------+--------------+-----------+-------------+---------------+----------------+
| 2018-12-12 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-13 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-14 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-15 | 2018 | 50 | 12 | 1 | 6 | 7 |
| 2018-12-16 | 2018 | 50 | 12 | 2 | 6 | 7 |
| 2018-12-19 | 2018 | 51 | 12 | 1 | 1 | 7 |
| 2019-01-13 | 2019 | 2 | 1 | 2 | 2 | 3 |
| 2019-01-21 | 2019 | 4 | 1 | 1 | 1 | 3 |
+------------+----------+--------------+-----------+-------------+---------------+----------------+
A couple of additional notes:
Your original query has 1=1 in your WHERE clause. You don't need that.
There's no need to use abbreviations in your DATEADD function (or other date functions). For example, DATEADD(DAY, -31, GETDATE()) is clearer and just as performant as DATEADD(DD, -31, GETDATE()).
It might be a good idea to replace GETDATE() with CURRENT_TIMESTAMP. They return the same value, but CURRENT_TIMESTAMP is standard SQL.
Perhaps "Conditional aggregates" can be used for this (basically just put a case expression inside an aggregate function) e.g.
DECLARE @StartDate AS date = DATEADD( dd, -31, GETDATE() )
SELECT
CAST( ml.login AS date ) AS Date_Login
, COUNT( DISTINCT CASE
WHEN CAST( ml.login AS date ) >= DATEADD( dd, -7, CAST( GETDATE() AS date ) ) THEN ml.email
END ) AS in_week
, COUNT( DISTINCT ml.email ) AS Total
FROM dbo.members_log AS ml
WHERE 1 = 1
AND ml.login > @StartDate
GROUP BY
CAST( ml.login AS date )
ORDER BY
CAST( ml.login AS date ) DESC
But as you are already filtering for just the past 31 days, I'm not sure what you mean by "rolling" week or "rolling" month.
count(distinct) is quite tricky -- particularly for rolling averages. If you are really looking for the unique users over a time span (rather than just the average of the daily unique visitors), then I think apply may be the simplest approach:
with d as (
select cast(ml.login AS Date) AS Date_Login,
count(distinct ml.email) AS Total
from database.members_log ml
where ml.login > @StartDate
group by CAST(ml.login AS Date)
)
select t.date_login, t.total, t7.total_7d, t30.total_30d
from d t outer apply -- alias the CTE d as t so that the t.* references resolve
(select count(distinct ml2.email) as total_7d
from database.members_log ml2
where ml2.login <= dateadd(day, 1, t.date_login) and
ml2.login > dateadd(day, -7, t.date_login)
) t7 outer apply
(select count(distinct ml2.email) as total_30d
from database.members_log ml2
where ml2.login <= dateadd(day, 1, t.date_login) and
ml2.login > dateadd(day, -30, t.date_login)
) t30
order by date_login desc;
The date arithmetic is my best understanding of what you mean by the rolling averages. It includes the current day, but not the day n days ago.
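To see why a rolling distinct count is not the same as summing the daily distinct counts, here is a small sketch in plain Python. The login rows are made up, the function name rolling_distinct is hypothetical, and the window is assumed to be the window_days calendar days ending on (and including) the current day, which may differ slightly from the date arithmetic in the query above.

```python
from datetime import date, timedelta

# Made-up sample data: (email, login day).
logins = [
    ("a@x.test", date(2019, 1, 1)),
    ("b@x.test", date(2019, 1, 1)),
    ("a@x.test", date(2019, 1, 5)),
    ("c@x.test", date(2019, 1, 8)),
    ("a@x.test", date(2019, 1, 8)),
]

def rolling_distinct(logins, window_days):
    """For each login day D, count distinct emails seen in the
    window_days days ending on D (D itself included)."""
    days = sorted({day for _, day in logins})
    result = {}
    for day in days:
        first = day - timedelta(days=window_days - 1)
        result[day] = len({email for email, d in logins if first <= d <= day})
    return result

daily = rolling_distinct(logins, 1)
weekly = rolling_distinct(logins, 7)
monthly = rolling_distinct(logins, 30)
for day in daily:
    print(day, daily[day], weekly[day], monthly[day])
```

On 2019-01-08 the 7-day window holds the daily counts 1 (Jan 5) and 2 (Jan 8), which sum to 3, but only two distinct users logged in during that window. That is the gap between summing daily totals and counting distinct users over the span.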
I'll list only the one table that I need to query:
lodgings_Contract:
id_contract identity primary,
id_person int,
id_room varchar(4),
day_begin datetime,
day_end datetime,
day_register datetime,
money_per_month money
And these are the values in table lodgings_Contract (this data is for example only):
id_contract | id_person | id_room | day_begin | day_end | day_register | money_per_month
3 | 2 | 101 | 1/12/2014 | 27/2/2015 | 1/12/2015 | 100
2 | 1 | 102 | 1/1/2014 | 27/4/2014 | 1/1/2014 | 200
1 | 3 | 103 | 1/1/2014 | 27/3/2014 | 1/1/2014 | 300
Person 1 rents room 102 for 4 months in 2014 at 200/month. Person 2 rents room 101 for 3 months (1 month in 2014 and 2 months in 2015) at 100/month. Person 3 rents room 103 for 3 months in 2014 at 300/month.
I want my result to display 3 fields: Month | Year | Incomes
Result :
Month | Year | Incomes
1 |2014| 500
2 |2014| 500
3 |2014| 500
4 |2014| 200
12 |2014| 100
1 |2015| 100
2 |2015| 100
Can I do that? Please help!
I posted another question before this one, but it was complicated and required 3 tables, so I made this post with only 1 table.
This is my code:
select month(day_begin) as 'Month', year(day_begin) as 'Year', money_per_month as 'Incomes'
from lodgings_Contract
group by day_begin, money_per_month
It only lists the first month, taken from day_begin. I have no idea how to do it right.
To get the results you first need a calendar table; in the following query it is created on the fly with a CTE.
That said, what is the purpose of the column day_register? It seems to be a copy of day_begin, probably with a typo for the contract with ID 3.
WITH Months(N) AS (
SELECT 1 UNION ALL Select 2 UNION ALL Select 3 UNION ALL Select 4
UNION ALL Select 5 UNION ALL Select 6 UNION ALL Select 7 UNION ALL Select 8
UNION ALL Select 9 UNION ALL Select 10 UNION ALL Select 11 UNION ALL Select 12
), Calendar(N) As (
SELECT CAST(2010 + y.N AS VARCHAR) + RIGHT('00' + Cast(m.N AS VARCHAR), 2)
FROM Months m
CROSS JOIN Months y
)
SELECT RIGHT(c.N, 2) [Month]
, LEFT(c.N, 4) [Year]
, SUM(money_per_month) Incomes
FROM lodgings_Contract lc
INNER JOIN Calendar c
ON c.N BETWEEN CONVERT(VARCHAR(6), lc.day_begin, 112)
AND CONVERT(VARCHAR(6), lc.day_end, 112)
GROUP BY c.N
The calendar CTE is small because I don't know how many years the real data covers. If there are many years, it is better to create a calendar table in your DB and use it instead of calculating it every time.
The calendar CTE returns a list of months in the format yyyyMM.
In the main query, CONVERT(VARCHAR(6), lc.day_begin, 112) converts day_begin to the ISO format yyyyMMdd and keeps only the first six characters, so again yyyyMM. For example, for id_contract 3 we get 201412. The same goes for day_end.
If the contract actually begins on day_register, change lc.day_begin to lc.day_register.
SQLFiddle demo
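As a quick way to check the expected numbers, the calendar join can be imitated in plain Python: walk every month from day_begin to day_end for each contract and add money_per_month to that month's total. This is only a sketch of the same idea, with the sample rows typed in by hand and the function name monthly_income made up for the example.

```python
from collections import defaultdict
from datetime import date

# (id_contract, day_begin, day_end, money_per_month) from the question.
contracts = [
    (3, date(2014, 12, 1), date(2015, 2, 27), 100),
    (2, date(2014, 1, 1), date(2014, 4, 27), 200),
    (1, date(2014, 1, 1), date(2014, 3, 27), 300),
]

def monthly_income(contracts):
    """Return (month, year, income) rows, one per month covered by any contract."""
    income = defaultdict(int)
    for _id, begin, end, money in contracts:
        # Number the months consecutively so that December -> January is just +1.
        key = begin.year * 12 + (begin.month - 1)
        last = end.year * 12 + (end.month - 1)
        while key <= last:
            income[divmod(key, 12)] += money  # key becomes (year, zero-based month)
            key += 1
    return [(m + 1, y, total) for (y, m), total in sorted(income.items())]

for month, year, total in monthly_income(contracts):
    print(month, year, total)
```

A contract counts for a month as soon as it touches it, which matches the BETWEEN comparison on yyyyMM values in the query above.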
I have a table in a SQL Server 2008 database with two columns that hold running totals called Hours and Starts. Another column, Date, holds the date of a record. The dates are sporadic throughout any given month, but there's always a record for the last hour of the month.
For example:
ContainerID | Date | Hours | Starts
1 | 2010-12-31 23:59 | 20 | 6
1 | 2011-01-15 00:59 | 23 | 6
1 | 2011-01-31 23:59 | 30 | 8
2 | 2010-12-31 23:59 | 14 | 2
2 | 2011-01-18 12:59 | 14 | 2
2 | 2011-01-31 23:59 | 19 | 3
How can I query the table to get the total number of hours and starts for each month between two specified years? (In this case 2011 and 2013.) I know that I need to take the values from the last record of one month and subtract the values from the last record of the previous month. I'm having a hard time coming up with a good way to do this in SQL, however.
As requested, here are the expected results:
ContainerID | Date | MonthlyHours | MonthlyStarts
1 | 2011-01-31 23:59 | 10 | 2
2 | 2011-01-31 23:59 | 5 | 1
Try this:
SELECT c1.ContainerID,
c1.Date,
c1.Hours-c3.Hours AS "MonthlyHours",
c1.Starts - c3.Starts AS "MonthlyStarts"
FROM Containers c1
LEFT OUTER JOIN Containers c2 ON
c1.ContainerID = c2.ContainerID
AND datediff(MONTH, c1.Date, c2.Date)=0
AND c2.Date > c1.Date
LEFT OUTER JOIN Containers c3 ON
c1.ContainerID = c3.ContainerID
AND datediff(MONTH, c1.Date, c3.Date)=-1
LEFT OUTER JOIN Containers c4 ON
c3.ContainerID = c4.ContainerID
AND datediff(MONTH, c3.Date, c4.Date)=0
AND c4.Date > c3.Date
WHERE
c2.ContainerID is null
AND c4.ContainerID is null
AND c3.ContainerID is not null
ORDER BY c1.ContainerID, c1.Date
Using a recursive CTE and a somewhat 'creative' JOIN condition, you can fetch the next month's value for each ContainerID:
WITH CTE_PREP AS
(
--RN will be 1 for last row in each month for each container
--MonthRank will be sequential number for each subsequent month (to increment easier)
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY ContainerID, YEAR(Date), MONTH(DATE) ORDER BY Date DESC) RN
,DENSE_RANK() OVER (ORDER BY YEAR(Date),MONTH(Date)) MonthRank
FROM Table1
)
, RCTE AS
(
--"Zero row": last row in December 2010 for each container
SELECT *, Hours AS MonthlyHours, Starts AS MonthlyStarts
FROM CTE_Prep
WHERE YEAR(date) = 2010 AND MONTH(date) = 12 AND RN = 1
UNION ALL
--for each next row just join on MonthRank + 1
SELECT t.*, t.Hours - r.Hours, t.Starts - r.Starts
FROM RCTE r
INNER JOIN CTE_Prep t ON r.ContainerID = t.ContainerID AND r.MonthRank + 1 = t.MonthRank AND t.Rn = 1
)
SELECT ContainerID, Date, MonthlyHours, MonthlyStarts
FROM RCTE
WHERE Date >= '2011-01-01' --to eliminate "zero row"
ORDER BY ContainerID
SQLFiddle DEMO (I have added some data for February and March in order to test on different lengths of months)
Old version fiddle
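The underlying arithmetic is simple enough to check outside the database: keep the last reading of each month per container, then subtract the previous month's last reading. Here is a rough Python sketch of that rule using the sample rows from the question. The name monthly_deltas is made up, and months without a previous month's reading are simply skipped, which is an assumption.

```python
from datetime import datetime

# (ContainerID, Date, Hours, Starts) from the question.
readings = [
    (1, datetime(2010, 12, 31, 23, 59), 20, 6),
    (1, datetime(2011, 1, 15, 0, 59), 23, 6),
    (1, datetime(2011, 1, 31, 23, 59), 30, 8),
    (2, datetime(2010, 12, 31, 23, 59), 14, 2),
    (2, datetime(2011, 1, 18, 12, 59), 14, 2),
    (2, datetime(2011, 1, 31, 23, 59), 19, 3),
]

def monthly_deltas(readings):
    """Return (ContainerID, Date, MonthlyHours, MonthlyStarts) per container and month."""
    # Processing in date order means the last write per key is the month's final reading.
    last = {}
    for cid, ts, hours, starts in sorted(readings, key=lambda r: r[1]):
        last[(cid, ts.year, ts.month)] = (ts, hours, starts)
    out = []
    for (cid, year, month), (ts, hours, starts) in sorted(last.items()):
        prev_key = (cid, year, month - 1) if month > 1 else (cid, year - 1, 12)
        if prev_key in last:  # skip months with no previous month to subtract
            _, prev_hours, prev_starts = last[prev_key]
            out.append((cid, ts, hours - prev_hours, starts - prev_starts))
    return out

for row in monthly_deltas(readings):
    print(row)
```

For the sample data this yields 10 hours and 2 starts for container 1, and 5 hours and 1 start for container 2, as in the expected results.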
Here is my scenario. I have a table like this:
------------------------------------------------------------------
| ticket | start date | finish date |
------------------------------------------------------------------
| 123 | 1 apr 12 | 20 apr 12 |
| 124 | 4 apr 12 | 28 apr 12 |
| 125 | 16 apr 12 | NULL |
| 126 | 28 apr 12 | 4 may 12 |
| 127 | 2 may 12 | NULL |
------------------------------------------------------------------
And I need to get a result set like this:
------------------------------------------------------------------
| week | opened | closed | active |
------------------------------------------------------------------
| 5 | 3 | 2 | 50 |
| 6 | 4 | 5 | 49 |
| 7 | 2 | 6 | 45 |
| 8 | 5 | 4 | 46 |
------------------------------------------------------------------
Basically, I want to see how many tickets were opened in a given week, how many were closed, and how many were simply active during that week (opened previously, and not closed yet).
I think I may have figured out how to derive the opened and closed, but I am really having trouble querying the active column out of this. Any ideas?
UPDATE: Adding info as requested.
This is for SQL.
The query I have thus far is like this, it doesn't include active yet:
SELECT
a.week,
a.opened,
b.closed
FROM
(
SELECT
DATEPART(WEEK, t.[start date]) AS week,
COUNT(t.[ticket]) AS opened
FROM sqltable AS t
WHERE [start date] > GETDATE() - 69 -- last 10 weeks
GROUP BY
DATEPART(WEEK, t.[start date])
) AS a
LEFT JOIN
(
SELECT
DATEPART(WEEK, t.[finish date]) AS week,
COUNT(t.[ticket]) AS closed
FROM sqltable AS t
WHERE [finish date] > GETDATE() - 69 -- last 10 weeks
GROUP BY
DATEPART(WEEK, t.[finish date])
) AS b
ON a.week = b.week
ORDER BY a.week;
In any database, you can do this with a correlated subquery. However, date functions are not the same across databases, so let me assume you know how to do this.
select weeknum, sum(startweek) as starts, sum(endweek) as ends,
(select count(*) as numactive
from t ts
where DATEPART(WEEK, ts.[start date]) <= w.weeknum and
-- a NULL finish date means the ticket is still open
(ts.[finish date] is null or datepart(week, ts.[finish date]) >= w.weeknum)
) as actives
from ((select DATEPART(WEEK, t.[start date]) AS weeknum, 1 as startweek, 0 as endweek,
t.[start date] as startdate, NULL as enddate
from t
) union all
(select DATEPART(WEEK, t.[finish date]) AS weeknum, 0 as startweek, 1 as endweek,
NULL, t.[finish date] as enddate
from t
where t.[finish date] is not null -- open tickets have no closing week
)
) w
group by weeknum
order by 1
If you are using SQL Server 2012 (or Oracle), then you can also do this with cumulative sums.
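The counting rule used by the correlated subquery can be sanity-checked in a few lines of Python. This is a sketch under assumptions: weeks are identified by their Monday rather than by a week number (so the year boundary is not a problem), a missing finish date means the ticket is still open, the dates are taken to be in 2012, and the names week_start and weekly_report are made up.

```python
from datetime import date, timedelta

# (ticket, start date, finish date or None) from the question.
tickets = [
    (123, date(2012, 4, 1), date(2012, 4, 20)),
    (124, date(2012, 4, 4), date(2012, 4, 28)),
    (125, date(2012, 4, 16), None),
    (126, date(2012, 4, 28), date(2012, 5, 4)),
    (127, date(2012, 5, 2), None),
]

def week_start(d):
    """Monday of the week containing d."""
    return d - timedelta(days=d.weekday())

def weekly_report(tickets):
    """Return (week_start, opened, closed, active) for every week in the data."""
    spans = [(week_start(s), week_start(f) if f else None) for _, s, f in tickets]
    first = min(s for s, _ in spans)
    last = max(max(s, f or s) for s, f in spans)
    report = []
    week = first
    while week <= last:
        opened = sum(1 for s, _ in spans if s == week)
        closed = sum(1 for _, f in spans if f == week)
        # Active: opened in this week or earlier, and not closed before this week.
        active = sum(1 for s, f in spans if s <= week and (f is None or f >= week))
        report.append((week, opened, closed, active))
        week += timedelta(days=7)
    return report

for row in weekly_report(tickets):
    print(row)
```

Note that under this rule a ticket still counts as active in the week it is closed. If "active" should exclude tickets closed during the week, change the comparison f >= week to f > week.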
There is an easy way to do this with IF expressions, adding 1 to each sum whenever the condition is true:
SELECT
DATE_FORMAT(startdate, "%U") as week,
SUM(IF(startdate<>'', 1, 0)) as opened,
SUM(IF(enddate<>'', 1, 0)) as closed,
SUM(IF(startdate<>'' AND enddate<>'', 0, 1)) as active
FROM ticket
GROUP BY week;
The query is much simpler; I hope it helps.
I tried this in MySQL 5.6 on SQLFiddle.