Unique Count of YTD per month - sql

I'm trying to get, for each month, a YTD count of the unique employees who have had any revenue in the current or preceding months.
Table1
Month Employee Revenue
01-04-18 A 867
01-04-18 B
01-04-18 C
01-04-18 D
01-05-18 A 881
01-05-18 B
01-05-18 C 712
01-05-18 D
01-06-18 A 529
01-06-18 B 456
01-06-18 C
01-06-18 D 878
Expected Output
Month Count
01-04-18 1
01-05-18 2
01-06-18 4
In the 1st month only A had any revenue so the count is 1, in the 2nd month A & C had revenue till date so the count is 2 and finally in the 3rd month A, B, C & D have had revenue in the current or preceding months (C had revenue in month 2 but not month 3) so the count is 4.
Is there any way to get this result?
Thank you for your help

This is tricky, because you need both an aggregation and a window function. I would mark the first month in which each employee has revenue and then use that information:
select month,
sum(sum(case when seqnum = 1 and revenue is not null then 1 else 0 end)) over (order by month)
from (select t.*,
row_number() over (partition by employee order by (case when revenue is not null then month end) nulls last) as seqnum
from t
) t
group by month;
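The row_number() enumerates the months for each employee, putting the ones with revenue first. So, if an employee has any month with revenue, the earliest such month gets seqnum = 1.
The inner sum() counts, for each month, the employees whose first revenue month it is (it checks both that seqnum = 1 and that revenue is not null). The outer sum() over (order by month) then turns those per-month counts into a cumulative total.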
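The same idea can be sketched outside SQL. This is only an illustration of the logic, not a replacement for the query: the `rows` list and the function name `cumulative_first_revenue` are made up for the example, months are written as 'YYYY-MM' strings so that they sort correctly, and `None` stands in for a null revenue.

```python
from collections import defaultdict

# Sample rows from the question: (month, employee, revenue); None means no revenue.
rows = [
    ("2018-04", "A", 867), ("2018-04", "B", None), ("2018-04", "C", None), ("2018-04", "D", None),
    ("2018-05", "A", 881), ("2018-05", "B", None), ("2018-05", "C", 712), ("2018-05", "D", None),
    ("2018-06", "A", 529), ("2018-06", "B", 456), ("2018-06", "C", None), ("2018-06", "D", 878),
]

def cumulative_first_revenue(rows):
    # Step 1: earliest month with revenue per employee (the seqnum = 1 row).
    first_month = {}
    for month, employee, revenue in rows:
        if revenue is not None:
            if employee not in first_month or month < first_month[employee]:
                first_month[employee] = month
    # Step 2: per month, count the employees whose first revenue month it is.
    new_per_month = defaultdict(int)
    for month in first_month.values():
        new_per_month[month] += 1
    # Step 3: running total over the months in order (the windowed sum).
    result, running = [], 0
    for month in sorted({m for m, _, _ in rows}):
        running += new_per_month[month]
        result.append((month, running))
    return result

ytd_counts = cumulative_first_revenue(rows)
print(ytd_counts)
```

Each employee contributes exactly once, in the first month they have revenue, which is why the running total never double counts.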

I'd take a slightly different approach, still using an aggregate of an analytic function inside an inline view, but sticking to count() as I think the intent is slightly clearer:
select month,
count(has_revenue) as result
from (
select month, employee,
case when count(revenue)
over (partition by employee order by month) > 0
then employee end as has_revenue
from table1
)
group by month
For the inline view, the analytic count for each month/employee uses the default window of unbounded preceding to current row, so it ignores any rows in future months; the case expression only gives a not-null result if that count is non-zero. The outer count ignores nulls in that generated column expression.
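As a rough sketch of what that running count does, here is the same logic in plain Python. It assumes one row per employee per month, as in the sample data; the names `sample_rows` and `count_with_running_revenue` are invented for the illustration.

```python
from collections import defaultdict

# (month, employee, revenue); None means no revenue that month.
sample_rows = [
    ("2018-04", "A", 867), ("2018-04", "B", None), ("2018-04", "C", None), ("2018-04", "D", None),
    ("2018-05", "A", 881), ("2018-05", "B", None), ("2018-05", "C", 712), ("2018-05", "D", None),
    ("2018-06", "A", 529), ("2018-06", "B", 456), ("2018-06", "C", None), ("2018-06", "D", 878),
]

def count_with_running_revenue(rows):
    revenue_rows_so_far = defaultdict(int)  # employee -> non-null revenue rows seen so far
    per_month = defaultdict(int)            # month -> employees with has_revenue not null
    # Walk the rows in month order, like "order by month" with the default window.
    for month, employee, revenue in sorted(rows, key=lambda r: r[0]):
        if revenue is not None:
            revenue_rows_so_far[employee] += 1
        # has_revenue is only non-null once the running count is above zero.
        per_month[month] += 1 if revenue_rows_so_far[employee] > 0 else 0
    return dict(per_month)

running_result = count_with_running_revenue(sample_rows)
print(running_result)
```

Employee C has no revenue in the third month but is still counted there, because the running count already reached 1 in the second month.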
Demo with your sample data in a CTE:
with table1 (month, employee, revenue) as (
select date '2018-04-01', 'A', 867 from dual
union all select date '2018-04-01', 'B', null from dual
union all select date '2018-04-01', 'C', null from dual
union all select date '2018-04-01', 'D', null from dual
union all select date '2018-05-01', 'A', 881 from dual
union all select date '2018-05-01', 'B', null from dual
union all select date '2018-05-01', 'C', 712 from dual
union all select date '2018-05-01', 'D', null from dual
union all select date '2018-06-01', 'A', 529 from dual
union all select date '2018-06-01', 'B', 456 from dual
union all select date '2018-06-01', 'C', null from dual
union all select date '2018-06-01', 'D', 878 from dual
)
select month,
count(has_revenue) as result
from (
select month, employee,
case when count(revenue)
over (partition by employee order by month) > 0
then employee end as has_revenue
from table1
)
group by month
order by month;
MONTH RESULT
---------- ----------
2018-04-01 1
2018-05-01 2
2018-06-01 4
This is cumulative over all rows in your data set, but you only showed data from one year. If your data has multiple years, and you aren't filtering to a single year already, then add the year into the partitioning:
select month, employee,
case when count(revenue)
over (partition by employee, trunc(month, 'YYYY') order by month) > 0
then employee end as has_revenue
from table1

In this case I'd use a common table expression to pull the distinct months from your table, then use COUNT(DISTINCT ...) to count the distinct employees, using the appropriate join criteria. Or, in other words:
WITH cteMonths AS (SELECT DISTINCT MONTH
FROM TABLE1)
SELECT m.MONTH, COUNT(DISTINCT t1.EMPLOYEE)
FROM cteMonths m
INNER JOIN TABLE1 t1
ON t1.MONTH <= m.MONTH AND
t1.REVENUE IS NOT NULL
GROUP BY m.MONTH
ORDER BY m.MONTH;
SQLFiddle here
Best of luck.
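For readers who want to see what the join is doing, here is a small Python sketch of the same "all rows up to and including this month" logic. The data layout and the name `distinct_employees_to_date` are assumptions made for the example. One difference from the SQL: a month with no matching revenue rows would be dropped by the INNER JOIN, while this sketch reports it with a count of 0.

```python
# (month, employee, revenue); None means no revenue that month.
table1 = [
    ("2018-04", "A", 867), ("2018-04", "B", None), ("2018-04", "C", None), ("2018-04", "D", None),
    ("2018-05", "A", 881), ("2018-05", "B", None), ("2018-05", "C", 712), ("2018-05", "D", None),
    ("2018-06", "A", 529), ("2018-06", "B", 456), ("2018-06", "C", None), ("2018-06", "D", 878),
]

def distinct_employees_to_date(rows):
    # The distinct months play the role of the cteMonths CTE.
    months = sorted({month for month, _, _ in rows})
    result = {}
    for m in months:
        # Join condition: row month <= m and revenue is not null.
        employees = {emp for month, emp, revenue in rows
                     if month <= m and revenue is not None}
        result[m] = len(employees)  # COUNT(DISTINCT employee)
    return result

distinct_counts = distinct_employees_to_date(table1)
print(distinct_counts)
```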

Related

Aggregate values if value wasn't seen before in group - SQL / ORACLE

I'm trying to do this in Oracle, but generic SQL works too. I'm wondering if there are any easy functions or ways to do this; in theory I know how to do it in Python (see my example below).
Basically I'm trying to run a total distinct count, let's say monthly, for a unique identifier (let's use "customer_id"), but only add a customer to the total if they were not seen in prior months.
If customer 1 was seen in Jan and then again in March, they would only be in the Jan total and counted as 1.
The grand total would be the total number of unique customers.
In Python you would keep a list and check whether the customer is in the list; if they are, do nothing. If they are not, they get appended to the list and added to the total. This is just the overall total of unique values, though, and it would have to be done as a monthly total, but in theory this is what I would want:
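def count_unique(customers):
    seen = []   # customers already counted
    total = 0
    for i in customers:
        if i in seen:
            pass
        else:
            seen.append(i)
            total += 1
    return total

customers = [12, 123, 1234, 12345, 123455]
print(count_unique(customers))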
Now that I'm typing this out and thinking about it more, though, I would do a subquery of each unique customer and their min(date) of sale. This query
select count(distinct customer_id), month
from sales
group by month
doesn't work because each unique customer is counted in every month they appear... but if I did
select count(customer_id), month
from
(select customer_id, min(month) as month
from sales
group by customer_id)
group by month
that would work, as it only uses each customer's first sale month for the total? Is there an easier way to do this, or does this make sense?
You appear to want to find the first occurrence of each customer_id; you can use an analytic function for that and then filter on the first occurrence:
SELECT customer_id,
month
FROM (
SELECT customer_id,
month,
ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY month ) AS rn
FROM sales
)
WHERE rn = 1;
Which, for the sample data:
CREATE TABLE sales ( customer_id, month ) AS
SELECT 1, DATE '2021-01-01' FROM DUAL UNION ALL
SELECT 1, DATE '2021-02-01' FROM DUAL UNION ALL
SELECT 1, DATE '2021-03-01' FROM DUAL UNION ALL
SELECT 1, DATE '2021-04-01' FROM DUAL UNION ALL
SELECT 1, DATE '2021-05-01' FROM DUAL UNION ALL
SELECT 2, DATE '2021-01-01' FROM DUAL UNION ALL
SELECT 3, DATE '2021-03-01' FROM DUAL UNION ALL
SELECT 3, DATE '2021-04-01' FROM DUAL UNION ALL
SELECT 3, DATE '2021-05-01' FROM DUAL UNION ALL
SELECT 4, DATE '2021-04-01' FROM DUAL UNION ALL
SELECT 4, DATE '2021-05-01' FROM DUAL;
Outputs:
CUSTOMER_ID | MONTH
----------: | :--------
1 | 01-JAN-21
2 | 01-JAN-21
3 | 01-MAR-21
4 | 01-APR-21
If you want to count, for each month, the users who have not been seen before then just take the previous query and aggregate:
SELECT COUNT(customer_id) AS number_of_new_customers,
month
FROM (
SELECT customer_id,
month,
ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY month ) AS rn
FROM sales
)
WHERE rn = 1
GROUP BY month
ORDER BY month;
Which, for the same sample data, outputs:
NUMBER_OF_NEW_CUSTOMERS | MONTH
----------------------: | :--------
2 | 01-JAN-21
1 | 01-MAR-21
1 | 01-APR-21
db<>fiddle here
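To connect this back to the Python idea in the question, here is a minimal sketch of the same "first occurrence, then count per month" logic. The `sales` list mirrors the sample data above with months written as 'YYYY-MM' strings, and the function name `new_customers_per_month` is made up for the illustration.

```python
from collections import Counter

# (customer_id, month) pairs mirroring the sample data.
sales = [
    (1, "2021-01"), (1, "2021-02"), (1, "2021-03"), (1, "2021-04"), (1, "2021-05"),
    (2, "2021-01"),
    (3, "2021-03"), (3, "2021-04"), (3, "2021-05"),
    (4, "2021-04"), (4, "2021-05"),
]

def new_customers_per_month(sales):
    # Earliest month per customer: the row that would get rn = 1.
    first_month = {}
    for customer_id, month in sales:
        if customer_id not in first_month or month < first_month[customer_id]:
            first_month[customer_id] = month
    # Count how many customers have each month as their first month.
    return sorted(Counter(first_month.values()).items())

new_counts = new_customers_per_month(sales)
print(new_counts)
```

As in the SQL output, a month in which no customer appears for the first time (February here) is simply absent from the result.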

Distinct count in a column based on another column in sql

I have an employee table with two columns: emp_id and month_of_project. I want to find out the distinct count of employees involved in a project for a particular month. Meaning if the same person is involved in 3 months we will only count that person for the first month. I have mentioned sample below
emp_id month_of_project
101 Jan
102 Jan
103 Jan
101 Feb
104 Mar
102 Mar
105 Apr
103 Apr
The result should be
month count
Jan 3
Feb 0
Mar 1
Apr 1
Is there any way to achieve this in sql?
I think you only need a GROUP BY clause:
SELECT month_of_project, COUNT(emp_id)
FROM employees
GROUP BY month_of_project
You could use NOT EXISTS to only fetch records where no record in a previous month exists. Too bad you chose a textual representation for the month, not a numerical one. So you first have to translate it into numbers. You can use a CASE expression here.
SELECT t1.month_of_project,
count(*)
FROM elbat t1
WHERE NOT EXISTS (SELECT *
FROM elbat t2
WHERE CASE t2.month_of_project
WHEN 'Jan' THEN
1
...
WHEN 'Dec' THEN
12
END
<
CASE t1.month_of_project
WHEN 'Jan' THEN
1
...
WHEN 'Dec' THEN
12
END
AND t2.emp_id = t1.emp_id)
GROUP BY t1.month_of_project;
The SQL is for Oracle. And please don't store the month like this in your real system; use a date column. If your use case is just the month, then the year can be arbitrary (like 2000) and the day can be 01, but for sorting and so on, having a true date is always the right idea.
The first part of the with clause just simulates the data. The second part, mons, gets the list of possible months, since you want 0 for Feb. If you already have such a table, you don't need it.
Then the inner query finds the first month for each employee and uses that as the month to count in, and the outer query counts the distinct employees per month:
with emp_mon as
(
select '101' emp_id, to_date('20190101','YYYYMMDD') month_of_project from dual
union all select '102', to_date('20190101','YYYYMMDD') from dual
union all select '103', to_date('20190101','YYYYMMDD') from dual
union all select '101', to_date('20190201','YYYYMMDD') from dual
union all select '104', to_date('20190301','YYYYMMDD') from dual
union all select '102', to_date('20190301','YYYYMMDD') from dual
union all select '105', to_date('20190401','YYYYMMDD') from dual
union all select '103', to_date('20190401','YYYYMMDD') from dual
),
mons as
(
select distinct month_of_project
from emp_mon
)
select mons.month_of_project, count(distinct emp_first_mon.emp_id) cnt_emp_id
from mons
left outer join
(
select emp_id, min(month_of_project) month_of_project
from emp_mon
group by emp_id
) emp_first_mon on emp_first_mon.month_of_project = mons.month_of_project
group by mons.month_of_project
order by 1
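The effect of the left outer join (months with no new employees still show up with 0) can be illustrated with a short Python sketch. The `assignments` list and the name `first_month_counts` are assumptions for the example; months are 'YYYY-MM' strings.

```python
# (emp_id, month_of_project) pairs mirroring the simulated data.
assignments = [
    (101, "2019-01"), (102, "2019-01"), (103, "2019-01"),
    (101, "2019-02"),
    (104, "2019-03"), (102, "2019-03"),
    (105, "2019-04"), (103, "2019-04"),
]

def first_month_counts(assignments):
    # Inner query: first month per employee.
    first_month = {}
    for emp_id, month in assignments:
        if emp_id not in first_month or month < first_month[emp_id]:
            first_month[emp_id] = month
    # "mons": every month starts at 0, so months without new employees are kept.
    counts = {month: 0 for _, month in assignments}
    for month in first_month.values():
        counts[month] += 1
    return sorted(counts.items())

month_counts = first_month_counts(assignments)
print(month_counts)
```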
SQL Server's COUNT(DISTINCT ...) will do exactly what you want. It's important to group on the correct column, however.
WITH TEMP AS
(
SELECT 1 AS EMP, 1 AS MONTH_D
UNION ALL
SELECT 2 AS EMP, 1 AS MONTH_D
UNION ALL
SELECT 2 AS EMP, 1 AS MONTH_D
UNION ALL
SELECT 3 AS EMP, 1 AS MONTH_D
UNION ALL
SELECT 1 AS EMP, 2 AS MONTH_D
)
SELECT MONTH_D, COUNT(DISTINCT EMP) FROM TEMP
GROUP BY MONTH_D

Not able to group data according to month given in a date using SQL

I have the following set of sample data from a database
Period Company Metric Values
01/01/18 A Vol 2
02/01/18 A Vol 4
04/02/18 A Vol 5
05/02/18 B Vol 6
06/03/18 B Vol 4
07/04/18 C Vol 1
08/05/18 C Vol 6
I wish to display a total of "Values" for each company according to month.
For Example 'company A' has a total value of 6 for first month and value of 5 for second month
As a first step, I tried working as
SELECT COUNT(*) FROM `TABLE 2` WHERE DATEPART(MONTH, `Period`) = 01;
But it is throwing an error and it also does not have a group by function
Can anyone please tell how it can be done
You're using count() function instead of sum() function.
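As per your requirement, the query should be something similar to this (Values is a reserved word, so it is bracketed, and every non-aggregated column has to appear in the GROUP BY):
SELECT COMPANY, DATEPART(MONTH, Period) AS PeriodMonth, SUM([Values]) AS Total FROM TABLE2
GROUP BY COMPANY, DATEPART(MONTH, Period)
If you want the query without group by, use direct where clause.
SELECT SUM([Values]) FROM TABLE2 WHERE DATEPART(MONTH, Period) = 01;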
Try this (MySQL only; DATEPART is not available in MySQL):
SELECT MONTH(Period) AS gmonth, Company, SUM(`Values`) AS total FROM `TABLE 2`
GROUP BY MONTH(Period), Company;
As you tagged the question with the Oracle tag as well, here you go: TO_CHAR function with the 'mm' format mask fetches month from the date (01 for January, 02 for February, etc.). As there's an aggregate function (count) involved, you have to GROUP BY column that isn't aggregated. (As far as I can tell, that is valid for any other database.)
with test (period, company) as
(select date '2018-01-01', 'A' from dual union all
select date '2018-01-02', 'A' from dual union all
select date '2018-02-04', 'A' from dual union all
select date '2018-02-05', 'B' from dual union all
select date '2018-03-06', 'B' from dual union all
select date '2018-04-07', 'C' from dual union all
select date '2018-05-08', 'A' from dual
)
select to_char(period, 'mm') month, count(*)
from test
group by to_char(period, 'mm')
order by 1;
MO COUNT(*)
-- ----------
01 2
02 2
03 1
04 1
05 1
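Since the question asks for the total of "Values" per company per month, here is a small Python sketch of that grouping, just to show the expected numbers. It assumes the Period strings are day/month/year (which matches the statement that company A totals 6 in the first month); the names `records` and `totals_by_company_and_month` are invented for the example.

```python
from collections import defaultdict
from datetime import datetime

# (period, company, value) rows from the question; Metric is always 'Vol' and is omitted.
records = [
    ("01/01/18", "A", 2), ("02/01/18", "A", 4), ("04/02/18", "A", 5),
    ("05/02/18", "B", 6), ("06/03/18", "B", 4),
    ("07/04/18", "C", 1), ("08/05/18", "C", 6),
]

def totals_by_company_and_month(records, date_format="%d/%m/%y"):
    totals = defaultdict(int)
    for period, company, value in records:
        # Assumption: dates are dd/mm/yy; change date_format if they are not.
        month = datetime.strptime(period, date_format).month
        totals[(company, month)] += value  # SUM(...) GROUP BY company, month
    return dict(totals)

company_totals = totals_by_company_and_month(records)
print(company_totals)
```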

How to calculate MTD and QTD by YTD value in Oracle

There are some data in my table t1 looks like below:
date dealer YTD_Value
2018-01 A 1100
2018-02 A 2000
2018-03 A 3000
2018-04 A 4200
2018-05 A 5000
2018-06 A 5500
2017-01 B 100
2017-02 B 200
2017-03 B 550
... ... ...
then I want to write a SQL to query this table and get below result:
date dealer YTD_Value MTD_Value QTD_Value
2018-01 A 1100 1100 1100
2018-02 A 2000 900 2000
2018-03 A 3000 1000 3000
2018-04 A 4200 1200 1200
2018-05 A 5000 800 2000
2018-06 A 5500 500 2500
2017-01 B 100 100 100
2017-02 B 200 100 200
2017-03 B 550 350 550
... ... ... ... ...
'YTD' means Year to date
'MTD' means Month to date
'QTD' means Quarter to date
So if I want to calculate MTD and QTD value for dealer 'A' in '2018-01', it should be the same as YTD.
If I want to calculate the MTD value for dealer 'A' in '2018-06', it should equal the YTD value in '2018-06' minus the YTD value in '2018-05'. And the QTD value in '2018-06' should equal the YTD value in '2018-06' minus the YTD value in '2018-03', or equivalently the sum of the MTD values for 2018-04, 2018-05 and 2018-06.
The same rule for other dealers such as B.
How can I write the SQL to achieve this purpose?
The QTD calculation is tricky, but you can do this query without subqueries. The basic idea is to do a lag() for the monthly value. Then use a max() analytic function to get the YTD value at the beginning of the quarter.
Of course, the first quarter of the year has no such value, so a coalesce() is needed.
Try this:
with t(dte, dealer, YTD_Value) as (
select '2018-01', 'A', 1100 from dual union all
select '2018-02', 'A', 2000 from dual union all
select '2018-03', 'A', 3000 from dual union all
select '2018-04', 'A', 4200 from dual union all
select '2018-05', 'A', 5000 from dual union all
select '2018-06', 'A', 5500 from dual union all
select '2017-01', 'B', 100 from dual union all
select '2017-02', 'B', 200 from dual union all
select '2017-03', 'B', 550 from dual
)
select t.*,
(YTD_Value - lag(YTD_Value, 1, 0) over (partition by dealer, substr(dte, 1, 4) order by dte)) as MTD_Value,
(YTD_Value -
coalesce(max(case when substr(dte, -2) in ('03', '06', '09') then YTD_VALUE end) over
(partition by dealer, substr(dte, 1, 4) order by dte rows between unbounded preceding and 1 preceding
), 0
)
) as QTD_Value
from t
order by 1
Here is a db<>fiddle.
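The arithmetic behind MTD and QTD can be checked with a short Python sketch. This is only an illustration under stated assumptions: periods are 'YYYY-MM' strings, a missing previous month or quarter-end row is treated as 0, and the names `ytd_rows` and `add_mtd_qtd` are hypothetical.

```python
# (period, dealer, ytd_value) rows from the sample data.
ytd_rows = [
    ("2018-01", "A", 1100), ("2018-02", "A", 2000), ("2018-03", "A", 3000),
    ("2018-04", "A", 4200), ("2018-05", "A", 5000), ("2018-06", "A", 5500),
    ("2017-01", "B", 100), ("2017-02", "B", 200), ("2017-03", "B", 550),
]

def add_mtd_qtd(ytd_rows):
    ytd = {(dealer, period): value for period, dealer, value in ytd_rows}
    out = []
    for period, dealer, value in sorted(ytd_rows, key=lambda r: (r[1], r[0])):
        year, month = int(period[:4]), int(period[5:7])
        # MTD: subtract the previous month's YTD within the same year (0 for January).
        prev_month = ytd.get((dealer, f"{year}-{month - 1:02d}"), 0) if month > 1 else 0
        # QTD: subtract the YTD at the end of the previous quarter (month 3, 6 or 9).
        quarter_end = 3 * ((month - 1) // 3)
        prev_quarter = ytd.get((dealer, f"{year}-{quarter_end:02d}"), 0) if quarter_end else 0
        out.append((period, dealer, value, value - prev_month, value - prev_quarter))
    return out

mtd_qtd = add_mtd_qtd(ytd_rows)
for row in mtd_qtd:
    print(row)
```

The lookups are keyed by dealer as well as period, so one dealer's figures never leak into another's.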
The following query should do the job. It uses a CTE that translates the varchar date column to dates, and then a few joins to recover the value to compare.
I tested it in this db fiddle and the output matches your expected results.
WITH cte AS (
SELECT TO_DATE(my_date, 'YYYY-MM') my_date, dealer, ytd_value FROM my_table
)
SELECT
TO_CHAR(ytd.my_date, 'YYYY-MM') my_date,
ytd.ytd_value,
ytd.dealer,
ytd.ytd_value - NVL(mtd.ytd_value, 0) mtd_value,
ytd.ytd_value - NVL(qtd.ytd_value, 0) qtd_value
FROM
cte ytd
LEFT JOIN cte mtd ON mtd.my_date = ADD_MONTHS(ytd.my_date, -1) AND mtd.dealer = ytd.dealer
LEFT JOIN cte qtd ON qtd.my_date = ADD_MONTHS(TRUNC(ytd.my_date, 'Q'), -1) AND qtd.dealer = ytd.dealer
ORDER BY dealer, my_date
PS : date is a reserved word in most RDBMS (including Oracle), I renamed that column to my_date in the query.
You can use the lag() analytic function together with a sum() over (...) running total:
select "date",dealer,YTD_Value,MTD_Value,
sum(MTD_Value) over (partition by dealer, qt order by "date")
as QTD_Value
from
(
with t("date",dealer,YTD_Value) as
(
select '2018-01','A',1100 from dual union all
select '2018-02','A',2000 from dual union all
select '2018-03','A',3000 from dual union all
select '2018-04','A',4200 from dual union all
select '2018-05','A',5000 from dual union all
select '2018-06','A',5500 from dual union all
select '2017-01','B', 100 from dual union all
select '2017-02','B', 200 from dual union all
select '2017-03','B', 550 from dual
)
select t.*,
t.YTD_Value - nvl(lag(t.YTD_Value)
over (partition by dealer, substr("date",1,4) order by "date"),0)
as MTD_Value,
substr("date",1,4)||to_char(to_date("date",'YYYY-MM'),'Q')
as qt,
substr("date",1,4) as year
from t
order by year desc, "date"
)
order by year desc, "date";
Rextester Demo

Get rows from current month if older is not available

I have a table that looks like this:
+--------------------+---------+
| Month (date) | amount |
+--------------------+---------+
| 2016-10-01 | 20 |
| 2016-08-01 | 10 |
| 2016-07-01 | 17 |
+--------------------+---------+
I'm looking for a query (sql statement) which satisfies the following conditions:
Give me the value of the previous month.
If there is no value for the previous month, look back in time until one can be found.
If there is just a value for the current month give me this value.
In the example table the row I'm looking for would be this:
+--------------------+---------+
| 2016-08-01 | 10 |
+--------------------+---------+
Does anyone have an idea for a simple select query?
Thanks in advance,
Peter
You may need the following:
SELECT *
FROM ( SELECT *
FROM test
WHERE TRUNC(SYSDATE, 'month') >= month
ORDER BY CASE
WHEN TRUNC(SYSDATE, 'month') = month
THEN 0 /* if current month, ordered last */
ELSE 1 /* previous months are ordered first */
END DESC,
month DESC /* among previous months, the greatest first */
)
WHERE ROWNUM = 1
Another way using MAX
WITH tbl AS (
SELECT TO_DATE('2016-10-01', 'YYYY-MM-DD') AS "month", 20 AS amount FROM dual
UNION
SELECT TO_DATE('2016-08-01', 'YYYY-MM-DD') AS "month", 10 AS amount FROM dual
UNION
SELECT TO_DATE('2016-07-01', 'YYYY-MM-DD') AS "month", 17 AS amount FROM dual
)
SELECT *
FROM tbl
WHERE TRUNC("month", 'MONTH') = NVL((SELECT MAX(t."month")
FROM tbl t
WHERE t."month" < TRUNC(SYSDATE, 'MONTH')),
TRUNC(SYSDATE, 'MONTH'));
I would use row_number():
select t.*
from (select t.*,
row_number() over (order by (case when to_char(dte, 'YYYY-MM') = to_char(sysdate, 'YYYY-MM') then 1 else 2 end) desc,
dte desc
) as seqnum
from t
) t
where seqnum = 1;
Actually, you don't need row_number() for this:
select t.*
from (select t.*
from t
order by (case when to_char(dte, 'YYYY-MM') = to_char(sysdate, 'YYYY-MM') then 1 else 2 end) desc,
dte desc
) t
where rownum = 1;
It's not the nicest query but it should work.
select amount, month from (
select amount, month, row_number() over (partition by HERE_PUT_ID order by
case trunc(month, 'month') when trunc(sysdate, 'month') then to_date('00010101', 'yyyymmdd') else trunc(month, 'month') end
desc) r
from HERE_PUT_TABLE)
where r = 1;
I guess you have some id in the table, so put that id column in place of HERE_PUT_ID and your table name in place of HERE_PUT_TABLE. If you want the query for the whole table, just delete: partition by HERE_PUT_ID
I added more data for testing, and an "id" column (a more realistic scenario) to show how this would work. If there is no "id" in your data, simply delete any reference to it from the solution.
Notes - month is an Oracle keyword, so it is best not to use it as a column name. The solution assumes the date column contains dates that are already truncated to the beginning of the month. The trick in the "order by" of the dense_rank last is to assign a value (ANY value!) when the month is the current month; for all other months the case expression returns NULL, which by default comes after any non-null value in ascending order.
You may want to test the various solutions for efficiency if execution time is important.
with
inputs ( id, mth, amount ) as (
select 1, date '2016-10-01', 20 from dual union all
select 1, date '2016-08-01', 10 from dual union all
select 1, date '2016-07-01', 17 from dual union all
select 2, date '2016-10-01', 30 from dual union all
select 2, date '2016-09-01', 25 from dual union all
select 3, date '2016-10-01', 20 from dual union all
select 4, date '2016-08-01', 45 from dual union all
select 4, date '2016-06-01', 30 from dual
)
-- end of TEST DATA - the solution (SQL query) is below this line
select id,
max(mth) keep(dense_rank last order by
case when mth = trunc(sysdate, 'mm') then 0 end, mth) as mth,
max(amount) keep(dense_rank last order by
case when mth = trunc(sysdate, 'mm') then 0 end, mth) as amount
from inputs
group by id
order by id -- ORDER BY is optional
;
ID MTH AMOUNT
--- ---------- -------
1 2016-08-01 10
2 2016-09-01 25
3 2016-10-01 20
4 2016-08-01 45
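The selection rule itself (latest month before the current one, else the current month) is easy to state in a few lines of Python. This is a sketch for a single id, assuming the dates are already truncated to the first of the month; `pick_row` and its `today` parameter (a stand-in for SYSDATE) are hypothetical names.

```python
from datetime import date

def pick_row(rows, today):
    # rows: list of (month_start, amount); today plays the role of SYSDATE.
    current = date(today.year, today.month, 1)
    previous = [r for r in rows if r[0] < current]
    if previous:
        # Latest month strictly before the current month.
        return max(previous, key=lambda r: r[0])
    # Fall back to the current month's row, if there is one.
    current_rows = [r for r in rows if r[0] == current]
    return current_rows[0] if current_rows else None

history = [(date(2016, 10, 1), 20), (date(2016, 8, 1), 10), (date(2016, 7, 1), 17)]
picked = pick_row(history, date(2016, 10, 15))
only_current = pick_row([(date(2016, 10, 1), 20)], date(2016, 10, 15))
print(picked, only_current)
```

With the question's table and a current date in October 2016, `picked` is the 2016-08-01 row, and when only the current month exists that row is returned instead.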
You could sort the data in the direction you want to:
with MyData as
(
SELECT to_date('2016-10-01','YYYY-MM-DD') MY_DATE, 20 AMOUNT FROM DUAL UNION
SELECT to_date('2016-08-01','YYYY-MM-DD') MY_DATE, 10 AMOUNT FROM DUAL UNION
SELECT to_date('2016-07-01','YYYY-MM-DD') MY_DATE, 17 AMOUNT FROM DUAL
),
MyResult AS (
SELECT
D.*
FROM MyData D
ORDER BY
DECODE(
12*TO_CHAR(MY_DATE,'YYYY') + TO_CHAR(MY_DATE,'MM'),
12*TO_CHAR(SYSDATE,'YYYY') + TO_CHAR(SYSDATE,'MM'),
-1,
12*TO_CHAR(MY_DATE,'YYYY') + TO_CHAR(MY_DATE,'MM'))
DESC
)
SELECT * FROM MyResult WHERE RowNum = 1