Get month over month increase in usage for each customer - sql

I have the following table:
DECLARE @MyTable TABLE (
CustomerName nvarchar(max),
[Date] date,
[Service] nvarchar(max),
UniqueUsersForService int
)
INSERT INTO @MyTable VALUES
('CompanyA', '2016-07-14', 'Service1', 100),
('CompanyA', '2016-07-15', 'Service1', 110),
('CompanyA', '2016-07-16', 'Service1', 120),
('CompanyA', '2016-07-14', 'Service2', 200),
('CompanyA', '2016-07-15', 'Service2', 220),
('CompanyA', '2016-07-16', 'Service2', 500),
('CompanyB', '2016-07-14', 'Service1', 10000),
('CompanyB', '2016-07-15', 'Service1', 10500),
('CompanyB', '2016-07-16', 'Service1', 11000),
('CompanyB', '2016-07-14', 'Service2', 200),
('CompanyB', '2016-07-15', 'Service2', 300),
('CompanyB', '2016-07-16', 'Service2', 300)
Basically it's a list that shows how many people used each service for each company. For instance, in CompanyA, on the 14th of July, 100 unique users used Service1. The actual table contains thousands of customers and dates going back to the 1st of Jan 2015.
I've been researching online for a way to be able to calculate the usage increase month-over-month for each service per customer. What I managed to do so far: I grouped the dates by months.
For instance the date 7/14/2016 is 201607 (the 7th month of 2016) and selected the maximum usage for the respective month. So now I need to figure out how to calculate the difference in usage between June and July for example.
To somehow subtract the usage of June from the one in July. And so on for each month. The end goal is to identify the customers that had the biggest increase in usage, percentage-wise. I want to be able to look at the data and say CompanyA was using 100 licenses in March and in April it jumped to 1000. That's a 900% increase.
I apologize for the way I phrased the question, I am very new to SQL and coding in general and I thank you in advance for any help I might get.

If you are using SQL Server 2012 (and up) you can use LAG function:
;WITH cte AS (
SELECT CustomerName,
LEFT(REPLACE(CONVERT(nvarchar(10),[Date],120),'-',''),6) as [month],
[Service],
MAX(UniqueUsersForService) as MaxUniqueUsersForService
FROM @MyTable
GROUP BY CustomerName,
LEFT(REPLACE(CONVERT(nvarchar(10),[Date],120),'-',''),6),
[Service]
)
SELECT *,
LAG(MaxUniqueUsersForService,1,NULL) OVER (PARTITION BY CustomerName, [Service] ORDER BY [month]) as prevUniqueUsersForService
FROM cte
ORDER BY CustomerName, [month], [Service]
In SQL Server 2008:
;WITH cte AS (
SELECT CustomerName,
LEFT(REPLACE(CONVERT(nvarchar(10),[Date],120),'-',''),6) as [month],
[Service],
MAX(UniqueUsersForService) as MaxUniqueUsersForService
FROM @MyTable
GROUP BY CustomerName,
LEFT(REPLACE(CONVERT(nvarchar(10),[Date],120),'-',''),6),
[Service]
)
SELECT c.*,
p.MaxUniqueUsersForService as prevUniqueUsersForService
FROM cte c
OUTER APPLY (SELECT TOP 1 * FROM cte WHERE CustomerName = c.CustomerName AND [Service] = c.[Service] AND [month] < c.[month] ORDER BY [month] DESC) as p
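The queries above stop at the previous month's value; the percentage the question asks for is one more step. As a rough illustration only, here is the same logic in plain Python rather than T-SQL. The function name `month_over_month` and the June row are hypothetical, added just to have two months to compare.

```python
from collections import defaultdict

def month_over_month(rows):
    """rows: iterable of (customer, iso_date, service, users).
    Returns (customer, service, month, max_users, prev_max, pct_increase) tuples."""
    # Step 1: maximum usage per (customer, service, yyyymm), like the GROUP BY in the CTE.
    monthly = defaultdict(int)
    for customer, iso_date, service, users in rows:
        month = iso_date[:4] + iso_date[5:7]      # '2016-07-14' -> '201607'
        key = (customer, service, month)
        monthly[key] = max(monthly[key], users)

    # Step 2: walk each (customer, service) series in month order, like LAG().
    result = []
    prev_series, prev_value = None, None
    for (customer, service, month), value in sorted(monthly.items()):
        if (customer, service) != prev_series:
            prev_value = None                     # first month of a series has no predecessor
        pct = None
        if prev_value:                            # skips None and 0, so no division by zero
            pct = (value - prev_value) * 100.0 / prev_value
        result.append((customer, service, month, value, prev_value, pct))
        prev_series, prev_value = (customer, service), value
    return result

sample = [
    ("CompanyA", "2016-06-30", "Service1", 100),  # hypothetical June row, not in the question
    ("CompanyA", "2016-07-14", "Service1", 100),
    ("CompanyA", "2016-07-16", "Service1", 120),
]
for row in month_over_month(sample):
    print(row)
```

Like LAG, this compares each month with the previous month that has data, which is not necessarily the previous calendar month.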

If you're using SQL Server 2012 or newer, try this:
SELECT *
, CASE
WHEN uniqueUsersPrevMonth = 0 THEN uniqueUsersInMonth
ELSE CAST(uniqueUsersInMonth - uniqueUsersPrevMonth as decimal) / uniqueUsersPrevMonth * 100
END AS Increase
FROM (
SELECT customer, service, DATEPART(MONTH, [date]) as [month]
, SUM(uniqueUsers) AS uniqueUsersInMonth
, LAG(SUM(uniqueUsers),1,0) OVER(PARTITION BY customer, service ORDER BY DATEPART(MONTH, [date])) as uniqueUsersPrevMonth
FROM #tbl AS t
GROUP BY customer, service, DATEPART(MONTH, [date])
) AS t1
ORDER BY customer, service, [month]

Related

SQL Server - How can I query to return products only when their sales exceeds a certain percentage?

The basic requirement is this: We capture sales by day of week and product. If more than half* of the day's sales came from one product, we want to capture that. Else we show "none".
So imagine we sell shoes, pants and shirts. On Monday, we sold $100 of each. So it was a three way split, and each category accounted for 33.3% of sales. We show "none". On Tuesday though, half of our sales came from shoes, and on Wednesday, 80% from shirts. So we want to see that.
The query below returns the desired result, but I'm not a fan of queries within queries within queries. They can be inefficient and hard to read, and I feel like there's a cleaner way. Can this be improved upon?
*The requirement for half will be a parameter (@threshold here). In some cases, we might want to show only when it's 75% or more of sales. Obviously that parameter has to be >= 50%.
declare @sales as table (day_of_week varchar(16), product varchar(8), sales_amt int)
insert into @sales values ('monday', 'shoes', 100)
insert into @sales values ('monday', 'pants', 100)
insert into @sales values ('monday', 'shirts', 100)
insert into @sales values ('tuesday', 'shoes', 500)
insert into @sales values ('tuesday', 'pants', 300)
insert into @sales values ('tuesday', 'shirts', 200)
insert into @sales values ('wednesday', 'shoes', 100)
insert into @sales values ('wednesday', 'pants', 100)
insert into @sales values ('wednesday', 'shirts', 800)
declare @threshold as decimal(3,2) = 0.5
select day_of_week, case when pct_of_day >= @threshold then product else 'none' end half_of_sales from (
select day_of_week, product, pct_of_day, row_number() over (partition by day_of_week order by pct_of_day desc) _rn
from (
select day_of_week, product, sum(sales_amt) * 1.0 / sum(sum(sales_amt)) over (partition by day_of_week) pct_of_day
from @sales
group by day_of_week, product
) x
) z
where _rn = 1
Maybe this is a little easier to read?
DECLARE @threshold AS decimal(3, 2) = 0.5;
WITH ssum
AS (SELECT
day_of_week,
SUM(sales_amt) sa
FROM @sales
GROUP BY day_of_week)
SELECT
s.day_of_week,
MAX(CASE WHEN s.sales_amt * 1.0 / ssum.sa >= @threshold THEN s.product ELSE 'none' END) threshold
FROM ssum
INNER JOIN @sales AS s
ON ssum.day_of_week = s.day_of_week
GROUP BY s.day_of_week
Firstly, you can place the nested queries in CTEs, which can make them easier to read. It won't make them more efficient, but nested queries are not necessarily inefficient in themselves; I'm not sure why you think so.
Secondly, the query can be simplified: the row numbering is equally valid on the raw sum(sales_amt) value, so it can sit on the same level as the windowed sum:
declare @threshold as decimal(3,2) = 0.5;
with GroupedSales as (
select
day_of_week,
product,
sum(sales_amt) * 1.0 / sum(sum(sales_amt)) over (partition by day_of_week) pct_of_day,
row_number() over (partition by day_of_week order by sum(sales_amt) desc) _rn
from @sales
group by
day_of_week,
product
)
select
day_of_week,
case when pct_of_day >= @threshold
then product
else 'none'
end half_of_sales
from GroupedSales
where _rn = 1;
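To make the rule itself explicit (take the top product per day, then compare its share of the day's total with the threshold), here is a small plain-Python sketch using the question's sample data. It is only an illustration of the logic, not a replacement for the SQL; the function name `dominant_product` is made up for this example.

```python
from collections import defaultdict

def dominant_product(sales, threshold=0.5):
    """sales: iterable of (day_of_week, product, sales_amt).
    Returns {day: product or 'none'}."""
    # Total sales per day and product, like GROUP BY day_of_week, product.
    totals = defaultdict(lambda: defaultdict(int))
    for day, product, amount in sales:
        totals[day][product] += amount

    result = {}
    for day, by_product in totals.items():
        # The best seller of the day corresponds to _rn = 1 in the query.
        top_product, top_amount = max(by_product.items(), key=lambda kv: kv[1])
        share = top_amount / sum(by_product.values())
        result[day] = top_product if share >= threshold else "none"
    return result

sales = [
    ("monday", "shoes", 100), ("monday", "pants", 100), ("monday", "shirts", 100),
    ("tuesday", "shoes", 500), ("tuesday", "pants", 300), ("tuesday", "shirts", 200),
    ("wednesday", "shoes", 100), ("wednesday", "pants", 100), ("wednesday", "shirts", 800),
]
print(dominant_product(sales))
```

Because only the single best seller is compared with the threshold, the ordering can use the raw sum, which is the point made in the second answer.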

Find date of most recent overdue

I have the following problem: from the table of pays and dues, I need to find the date of the last overdue. Here is the table and data for example:
create table t (
Id int
, [date] date
, Customer varchar(6)
, Deal varchar(6)
, Currency varchar(3)
, [Sum] int
);
insert into t values
(1, '2017-12-12', '1110', '111111', 'USD', 12000)
, (2, '2017-12-25', '1110', '111111', 'USD', 5000)
, (3, '2017-12-13', '1110', '122222', 'USD', 10000)
, (4, '2018-01-13', '1110', '111111', 'USD', -10100)
, (5, '2017-11-20', '2200', '222221', 'USD', 25000)
, (6, '2017-12-20', '2200', '222221', 'USD', 20000)
, (7, '2017-12-31', '2201', '222221', 'USD', -10000)
, (8, '2017-12-29', '1110', '122222', 'USD', -10000)
, (9, '2017-11-28', '2201', '222221', 'USD', -30000);
If the value of "Sum" is positive - it means overdue has begun; if "Sum" is negative - it means someone paid on this Deal.
In the example above on Deal '122222' overdue starts at 2017-12-13 and ends on 2017-12-29, so it shouldn't be in the result.
And for Deal '222221' the first overdue of 25000, started on 2017-11-20, was completely paid on 2017-11-28, so the last date of the current overdue (the one we are interested in) is 2017-12-31
I've made this selection to sum up all the payments, and stuck here :(
WITH cte AS (
SELECT *,
SUM([Sum]) OVER(PARTITION BY Deal ORDER BY [Date]) AS Debt_balance
FROM t
)
Apparently I need to find (for each Deal) the minimum of Dates if there is no 0 or negative Debt_balance, and the next date after the last 0 balance otherwise.
I will be grateful for any tips and ideas on the subject.
Thanks!
UPDATE
My version of solution:
WITH cte AS (
SELECT ROW_NUMBER() OVER (ORDER BY Deal, [Date]) id,
Deal, [Date], [Sum],
SUM([Sum]) OVER(PARTITION BY Deal ORDER BY [Date]) AS Debt_balance
FROM t
)
SELECT a.Deal,
SUM(a.Sum) AS NET_Debt,
isnull(max(b.date), min(a.date)),
datediff(day, isnull(max(b.date), min(a.date)), getdate())
FROM cte as a
LEFT OUTER JOIN cte AS b
ON a.Deal = b.Deal AND a.Debt_balance <= 0 AND b.Id=a.Id+1
GROUP BY a.Deal
HAVING SUM(a.Sum) > 0
I believe you are trying to use running sum and keep track of when it changes to positive, and it can change to positive multiple times and you want the last date at which it became positive. You need LAG() in addition to running sum:
WITH cte1 AS (
-- running balance column
SELECT *
, SUM([Sum]) OVER (PARTITION BY Deal ORDER BY [Date], Id) AS RunningBalance
FROM t
), cte2 AS (
-- overdue begun column - set whenever running balance changes from l.t.e. zero to g.t. zero
SELECT *
, CASE WHEN LAG(RunningBalance, 1, 0) OVER (PARTITION BY Deal ORDER BY [Date], Id) <= 0 AND RunningBalance > 0 THEN 1 END AS OverdueBegun
FROM cte1
)
-- eliminate groups that are paid i.e. sum = 0
SELECT Deal, MAX(CASE WHEN OverdueBegun = 1 THEN [Date] END) AS RecentOverdueDate
FROM cte2
GROUP BY Deal
HAVING SUM([Sum]) <> 0
You can use window functions. These can calculate intermediate values:
Last day when the sum is negative (i.e. last "good" record).
Last sum
Then you can combine these:
select deal, min(date) as last_overdue_start_date
from (select t.*,
first_value(sum) over (partition by deal order by date desc) as last_sum,
max(case when sum < 0 then date end) over (partition by deal order by date) as max_date_neg
from t
) t
where last_sum > 0 and date > max_date_neg
group by deal;
Actually, the value on the last date is not necessary. So this simplifies to:
select deal, min(date) as last_overdue_start_date
from (select t.*,
max(case when sum < 0 then date end) over (partition by deal order by date) as max_date_neg
from t
) t
where date > max_date_neg
group by deal;
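For readers who want to trace the running-balance idea by hand, here is a plain-Python sketch of the approach from the first answer (running sum plus a check of the previous balance). It assumes a deal counts as overdue only while its final balance is positive, which matches the HAVING SUM(a.Sum) > 0 in the asker's own attempt; the function name `last_overdue_start` is hypothetical.

```python
from collections import defaultdict

def last_overdue_start(rows):
    """rows: iterable of (id, iso_date, deal, amount).
    Returns {deal: date on which the current overdue period began}."""
    by_deal = defaultdict(list)
    for row_id, date, deal, amount in rows:
        by_deal[deal].append((date, row_id, amount))

    result = {}
    for deal, entries in by_deal.items():
        balance = 0
        last_start = None
        # ISO dates sort chronologically as strings; id breaks ties, as in ORDER BY [Date], Id.
        for date, _row_id, amount in sorted(entries):
            previous = balance
            balance += amount
            if previous <= 0 < balance:   # balance crossed from <= 0 to > 0
                last_start = date
        if balance > 0:                   # assumption: only deals still in debt are reported
            result[deal] = last_start
    return result

rows = [
    (1, "2017-12-12", "111111", 12000), (2, "2017-12-25", "111111", 5000),
    (3, "2017-12-13", "122222", 10000), (4, "2018-01-13", "111111", -10100),
    (5, "2017-11-20", "222221", 25000), (6, "2017-12-20", "222221", 20000),
    (7, "2017-12-31", "222221", -10000), (8, "2017-12-29", "122222", -10000),
    (9, "2017-11-28", "222221", -30000),
]
print(last_overdue_start(rows))
```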

How to divide results into separate rows based on year?

I have a query that looks at profits and operations costs of different stores based on the fiscal year, and currently the fiscal years and variables are sorted into single, respective columns such as:
FiscalYear  Metric  Store  Amount
2017        Profit  A      220
2017        Cost    A      180
2018        Profit  B      200
2018        Cost    B      300
...
I need to cross tab the rows so that for each store, I can compare the 2017 profit against the 2018 profit, and 2017 cost against the 2018 cost.
I broke out profits and costs by creating CASE WHEN statements for the ProfitLossTable, but I don't know how to make it create a "2017 Profit" and "2018 Profit" column, respectively, for each Store.
WITH [Profits, Cost] AS
(
SELECT ID, StoreID, Number, FYYearID,
CASE WHEN ID = 333 then Number END AS Profit,
CASE WHEN ID = 555 then Number END AS Cost
FROM ProfitLossTable
),
Location AS
(
Select StoreID, StoreName
FROM StoreTable
),
FiscalMonth AS
(
SELECT FYYearID, FYYear
FROM FiscalMonthTable
)
SELECT A.Profit, A.Cost
FROM [Profits, Cost] A
JOIN Location B
ON A.StoreID = B.StoreID
JOIN FiscalMonth C
ON A.FYYearID = C.FYYearID
The code above shows this, and I feel like I am close to creating columns based on year, but I don't know what to do next.
FiscalYear  Store  Profit  Cost
2017        A      220     100
2017        A      180     100
2018        B      200     100
2018        B      300     100
As a working (on my machine anyway ;-p) example using your data:
create table #temp(
FiscalYear int not null,
Metric nvarchar(50) not null,
Store nvarchar(10) not null,
Amount int not null
)
insert into #temp
values
(2017, N'Profit', N'A', 220),
(2017, N'Cost', N'A', 180),
(2018, N'Profit', N'B', 200),
(2018, N'Cost', N'B', 300)
select * from #temp
select Metric,
[2017] as [2017],
[2018] as [2018]
from (select FiscalYear, Amount, Metric from #temp) base_data
PIVOT
(SUM(Amount) FOR FiscalYear in ([2017], [2018])
) as pvt
order by pvt.Metric
drop table #temp
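The PIVOT above groups by Metric only, while the question asks for a comparison per store. As a hedged illustration of what a per-store cross tab would contain, here is a plain-Python sketch that keys each output row by (Store, Metric) and spreads the years into columns. The function name `pivot_by_year` is made up for this example.

```python
def pivot_by_year(rows):
    """rows: iterable of (fiscal_year, metric, store, amount).
    Returns {(store, metric): {fiscal_year: summed amount}}."""
    table = {}
    for year, metric, store, amount in rows:
        cell = table.setdefault((store, metric), {})
        cell[year] = cell.get(year, 0) + amount   # SUM(Amount) per year column
    return table

rows = [
    (2017, "Profit", "A", 220),
    (2017, "Cost", "A", 180),
    (2018, "Profit", "B", 200),
    (2018, "Cost", "B", 300),
]
for (store, metric), years in sorted(pivot_by_year(rows).items()):
    # A missing year prints as None, the counterpart of NULL in the PIVOT output.
    print(store, metric, years.get(2017), years.get(2018))
```

In the SQL, the equivalent change would be to include Store in the base_data subquery and in the outer select list.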

Calculating Percentages with SUM and Group by

I am trying to create an Over Time Calculation based on some set criteria. It goes as follows.
Overtime is posted on any day that is over 8 hrs but an employee has to reach 40 total hrs first and the calculation starts at the 1st day moving forward in the week. The Overtime is calculated based on the percentage taken of the SUM total of the cost codes worked.
First you have to find the percentage of each cost code worked for the entire week per employee id. See Example below
Then each day that is Over 8 hrs you take the time on that code for the day and multiply it by the calculated percentage. At the end of the week the regular hours must total 40hrs if they have gone over 40 for the week. See below example
CREATE TABLE [Totals](
[Day] nvarchar (10) null,
[EmployeeID] [nvarchar](100) NULL,
[CostCode] [nvarchar](100) NULL,
[TotalTime] [real] NULL,)
INSERT Into Totals (day,employeeid, CostCode, TotalTime) VALUES
('1','1234','1', 2),
('1','1234','2', 7.5),
('2','1234','1', 1.5),
('2','1234','2', 8),
('3','1234','1', 1),
('3','1234','2', 6),
('4','1234','1', 2),
('4','1234','2', 8),
('5','1234','1', 2),
('5','1234','2', 8),
('1','4567','1', 2),
('1','4567','2', 8.5),
('2','4567','1', 1.5),
('2','4567','2', 7.6),
('3','4567','1', 1),
('3','4567','2', 5),
('4','4567','1', 2),
('4','4567','2', 8),
('5','4567','1', 2),
('5','4567','2', 8)
To get the percentage of each cost Worked it is the SUM total time of each cost per week / SUM total time of the entire week
SELECT employeeid,CostCode,SUM(totaltime) As TotalTime ,
ROUND(SUM(Totaltime) / (select SUM(TotalTime) from Totals where employeeid = '1234') * 100,0) as Percentage
from Totals WHERE EmployeeID = '1234' group by EmployeeID, CostCode
Percentage Calculated for the Week by Cost = 18% worked on Cost 1 and 82% on Cost 2
I would like to take the percentage results for the week and calculate the total time each day in the query
Results Example Day 1: for EmployeeID 1234
Day  CostCode  RegTime  OverTime
1    1         1.73     .27
1    2         6.27     1.23
After editing I get your result; try this:
select calc.*
--select [day], CostCode, EmployeeID
--, CPr * DayEmpRT RegTime_old
, TotalTime - CPr * DayEmpOT RegTime
, CPr * DayEmpOT OverTime
from (
select Agr.*
--, round(EmpC1T / EmpT, 2) C1Pr
--, round(1 - (EmpC1T / EmpT), 2) C2Pr
, round(EmpCT / EmpT, 2) CPr
, case when DayEmpT > 8 then 8 else DayEmpT end DayEmpRT
, case when DayEmpT > 8 then DayEmpT - 8 else 0 end DayEmpOT
from (
select Totals.*
, SUM(TotalTime) over (partition by EmployeeID, [day]) DayEmpT
--, SUM(case when CostCode = 1 then TotalTime end) over (partition by EmployeeID) EmpC1T
, SUM(TotalTime) over (partition by EmployeeID, CostCode) EmpCT
, SUM(TotalTime) over (partition by EmployeeID) EmpT
from Totals
WHERE EmployeeID = '1234' ) Agr ) calc
order by 1,2,3
Here is the simplest way to calculate this:
select calc.*
, TotalTime * pr4R RegTime
, TotalTime * pr4O OverTime
from(
select Agr.*
, case when EmpT > 40 then round(40/EmpT, 2) else 1 end pr4R
, case when EmpT > 40 then round(1 - 40/EmpT, 2) else 0 end pr4O
from (
select Totals.*
, SUM(TotalTime) over (partition by EmployeeID) EmpT
from Totals
WHERE EmployeeID = '1234' ) Agr ) calc
But watch out for day 3, because it has only 7 hours.
The 1st query calculates each day separately and leaves day 3 as it is.
The 2nd query scales all hours.
There could be another variant that calculates all of the employee's rows but scales RegTime and OverTime separately, except on days with fewer than 8 hours, which it tops up to 8 hours from OverTime.
This should help you get started...
-- % based on hours worked for each code on a DAILY basis (The original 21% in the question was based on this)
SELECT
T.EmployeeId,
T.Day,
T.CostCode,
T.TotalTime,
CAST(100.0 * T.TotalTime / X.DailyHours AS DECIMAL(10,2)) AS PercentageWorked
FROM #Totals T
INNER JOIN (
SELECT
EmployeeId,
Day,
SUM(TotalTime) AS DailyHours
FROM #Totals
GROUP BY EmployeeId, Day
) X ON X.EmployeeId = T.EmployeeId AND X.Day = T.Day
-- % based on hours worked for each code on a WEEKLY basis (The revised question)
SELECT
T.EmployeeId,
T.CostCode,
SUM(T.TotalTime) AS TotalTime,
CAST(100.0 * SUM(T.TotalTime) / X.WeeklyHours AS DECIMAL(10,2)) AS PercentageWorked
FROM #Totals T
INNER JOIN (
SELECT
EmployeeId,
SUM(TotalTime) AS WeeklyHours
FROM #Totals
GROUP BY EmployeeId
) X ON X.EmployeeId = T.EmployeeId
GROUP BY
T.EmployeeId,
T.CostCode,
X.WeeklyHours
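As a quick cross-check of the weekly percentages quoted in the question (18% on cost code 1 and 82% on cost code 2 for employee 1234), here is a plain-Python sketch of the same arithmetic. It covers only the percentage step, not the overtime split, and the function name `weekly_percentages` is hypothetical.

```python
from collections import defaultdict

def weekly_percentages(rows):
    """rows: iterable of (day, employee_id, cost_code, total_time).
    Returns {(employee_id, cost_code): percentage of that employee's weekly hours}."""
    per_code = defaultdict(float)
    per_employee = defaultdict(float)
    for _day, employee, code, hours in rows:
        per_code[(employee, code)] += hours      # SUM(TotalTime) per employee and cost code
        per_employee[employee] += hours          # SUM(TotalTime) per employee (WeeklyHours)
    return {
        key: round(100 * hours / per_employee[key[0]])
        for key, hours in per_code.items()
    }

# Employee 1234 from the question's sample data.
rows = [
    ("1", "1234", "1", 2), ("1", "1234", "2", 7.5),
    ("2", "1234", "1", 1.5), ("2", "1234", "2", 8),
    ("3", "1234", "1", 1), ("3", "1234", "2", 6),
    ("4", "1234", "1", 2), ("4", "1234", "2", 8),
    ("5", "1234", "1", 2), ("5", "1234", "2", 8),
]
print(weekly_percentages(rows))
```

The totals are 8.5 and 37.5 hours out of 46, which round to the 18% and 82% stated in the question.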

SQL Server Query to Count Number of Changing Values in a Column Sequentially

I need to count the number of changing values in a column sequentially. Please see image for illustration (correct or expected output)
In here, the column Area is changing, counter column should display the sequential counter based on the changing values in area.
I have started with this code
SELECT a.tenant, a.area, a.date , a.gsc, f.counter
FROM TENANT a
inner join
(SELECT tenant, COUNT(DISTINCT Area) AS counter
FROM TENANT
GROUP BY tenant
) AS f ON a.tenant = f.tenant
order by a.tenant, a.date
And gives me this output. Counting the number of distinct values found in Area column IN ALL rows.
Here's one way to do it using window functions:
SELECT tenant, area, [date], sales,
DENSE_RANK() OVER (ORDER BY grpOrder) AS counter
FROM (
SELECT tenant, area, date, sales,
MIN([date]) OVER (PARTITION BY area, grp) AS grpOrder
FROM (
SELECT tenant, area, [date], sales,
ROW_NUMBER() OVER (ORDER BY date) -
ROW_NUMBER() OVER (PARTITION BY area ORDER BY [date]) AS grp
FROM tenant ) AS t ) AS s
The inner query identifies islands of consecutive area values. See grp value in below partial output from this sub-query:
area  date        grp
---------------------
18    2015-01-01  0
18    2015-01-02  0
18    2015-01-05  2
18    2015-01-06  2
20    2015-01-03  2
20    2015-01-04  2
Using window version of MIN we can calculate grp order: field grpOrder holds the minimum date per group.
Using DENSE_RANK() in the outer query we can now easily calculate counter values: first group gets a value of 1, next group a value of 2, etc.
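The expected counter can also be described procedurally: walk the rows in date order and add one every time the area differs from the row before it. The plain-Python sketch below shows that rule on the area sequence from the sample data. It is meant only as a way to check the SQL output by hand, and the function name `change_counter` is made up.

```python
def change_counter(areas):
    """areas: area values already sorted by date.
    Returns the sequential counter for each row."""
    counters = []
    counter = 0
    previous = object()          # sentinel that never equals a real area value
    for area in areas:
        if area != previous:     # a new island starts whenever the value changes
            counter += 1
        counters.append(counter)
        previous = area
    return counters

print(change_counter([18, 18, 20, 20, 18, 18]))  # [1, 1, 2, 2, 3, 3]
```

Note that the second run of 18 gets its own counter value, which is why a plain COUNT(DISTINCT area) cannot produce this result.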
Demo here
You can do it like this with window functions:
declare @data table(name varchar(10), area int, dates datetime, sales int)
insert into @data(name, area, dates, sales) values
('Little Asia', 18, '20150101', 10)
, ('Little Asia', 18, '20150102', 20)
, ('Little Asia', 20, '20150103', 30)
, ('Little Asia', 20, '20150104', 10)
, ('Little Asia', 18, '20150105', 20)
, ('Little Asia', 18, '20150106', 30)
Select name, area, dates, sales
, [counter] = DENSE_RANK() over(order by c)
, [count] = Count(*) over(partition by n ,c)
From (
Select name, area, dates, sales, n
, c = ROW_NUMBER() over(order by n, dates) - ROW_NUMBER() over(partition by area, n order by dates)
From (
Select name, area, dates, sales
, n = ROW_NUMBER() over(order by dates) - ROW_NUMBER() over(partition by area order by dates)
From @data
) as x
) as v
order by dates
Output:
name         area  dates       sales  counter  count
Little Asia  18    2015-01-01  10     1        2
Little Asia  18    2015-01-02  20     1        2
Little Asia  20    2015-01-03  30     2        2
Little Asia  20    2015-01-04  10     2        2
Little Asia  18    2015-01-05  20     3        2
Little Asia  18    2015-01-06  30     3        2
In case there is an extra element in the tenant column
create table #tenant (tenant varchar(20), area int, date date, sales int)
insert into #tenant values
('little asia', 18, '20150101', 10),
('little asia', 18, '20150102', 20),
('little asia', 20, '20150103', 30),
('little asia', 20, '20150104', 10),
('little asia', 18, '20150105', 20),
('little asia', 18, '20150106', 30),
('little', 18, '20150101', 10),
('little', 18, '20150102', 20),
('little', 18, '20150103', 30),
('little', 18, '20150104', 10),
('little', 18, '20150105', 20),
('little', 11, '20150106', 30);
The code will be written as follows:
/* new code adding tenant*/
SELECT tenant, area, [date], sales,
DENSE_RANK() OVER (PARTITION BY tenant ORDER BY tenant, grpOrder) AS counter
FROM (
SELECT tenant, area, date, sales,
MIN([date]) OVER (PARTITION BY tenant, area, grp) AS grpOrder
FROM (
SELECT tenant, area, [date], sales,
ROW_NUMBER() OVER (PARTITION BY tenant ORDER BY tenant, date) -
ROW_NUMBER() OVER (PARTITION BY tenant, area ORDER BY tenant, [date]) AS grp
FROM #tenant ) AS t ) AS s
order by tenant, date
Whenever Area differs from the previous row by more than the threshold (@Threshold, 1 here), a new group starts. This partitions by tenant.
DECLARE @Table as TABLE (
Tenant varchar(20),
Area int,
[date] Date,
Sales int
)
INSERT INTO @Table
VALUES
('Little Asia',18,'1/1/2015', 10),
('Little Asia',18,'1/2/2015', 20),
('Little Asia',20,'1/3/2015', 30),
('Little Asia',20,'1/4/2015', 10),
('Little Asia',18,'1/5/2015', 20),
('Little Asia',18,'1/6/2015', 30)
/***** Begin Query *****/
DECLARE @Threshold INT = 1
;WITH C1 AS
(
SELECT Tenant, Area, [Date], Sales,
CASE WHEN ABS(Area - LAG(Area) OVER(PARTITION BY Tenant ORDER BY [Date])) <= @Threshold THEN NULL ELSE 1 END AS isstart
FROM @Table
),
C2 AS
(
SELECT Tenant, Area, [Date], Sales, COUNT(isstart) OVER( PARTITION BY Tenant ORDER BY [Date] ROWS UNBOUNDED PRECEDING) AS grp
FROM C1
)
SELECT * FROM C2
The accepted answer works well and the SQL Fiddle demo is great. However, it doesn't take into account the situation with multiple tenants.
I extended the SQL Fiddle answer and the link is here for those people whose datasets comprise multiple tenants, simply by ensuring that tenant was present in each PARTITION BY and ORDER BY.