SQL Rolling Total up to a certain date - sql

I have two tables that I'm working with. Let's call them "Customers" and "Points".
The Points table looks like this:
Account Year M01 M02 M03 M04 M05 M06 M07 M08 M09 M10 M11 M12
123 2011 10 0 0 0 10 0 10 0 0 0 0 10
123 2012 0 0 0 0 10 0 0 10 10 10 10 20
123 2013 5 0 0 0 0 0 0 0 0 0 0 0
But these points work on a rolling 12 months. Calculating a current customer's points is simple enough, but the challenge is for customers who are no longer active. Say Customer 123 became inactive on Jan 2013, we would only want to calculate Feb'12-Jan'13. This is where the other table, Customers, comes in, let's simplify and say it looks just like this:
Account End Date
123 20130105
Now, what I want to do is create a query that calculates the amount of points that each customer has. (Current 12 months for active customers, last 12 months they were active for customers who are no longer active.)
Here's some more information:
I'm running SQL Server 2008.
These tables have been supplied to me like this, I can't modify them.
An active customer is one who has an end date of 99991231 (Dec 31 9999)
The points table is only populated for years in which the customer is active. For example, if someone becomes an active customer in Feb 2009, they have an entry for the year 2009; if they became inactive in July 2009, their points are only calculated for Feb-July 2009. There is no row for 2008 because they weren't a customer back then. Jan and Aug-Dec 2009 will show 0's.
Additionally, the record is only created if the customer gains any points that year. If a customer gets 0 points in a year, there will be no record of it.
For border cases, if you get into the first day of a month, then that month is counted. Example, let's say today is April 1st, 2013, that means we sum up May'12-April'13.
This is a pretty complex question. If there's anything I can explain better please let me know. Thank you!

Unfortunately with your table structure of points you will have to unpivot the data. An unpivot takes the data from the multiple columns into rows. Once the data is in the rows, it will be much easier to join, filter the data and total the points for each account. The code to unpivot the data will be similar to this:
select account,
cast(cast(year as varchar(4))+'-'+replace(month_col, 'M', '')+'-01' as date) full_date,
pts
from points
unpivot
(
pts
for month_col in ([M01], [M02], [M03], [M04], [M05], [M06], [M07], [M08], [M09], [M10], [M11], [M12])
) unpiv
The query gives a result similar to this:
| ACCOUNT | FULL_DATE | PTS |
------------------------------
| 123 | 2011-01-01 | 10 |
| 123 | 2011-02-01 | 0 |
| 123 | 2011-03-01 | 0 |
| 123 | 2011-04-01 | 0 |
| 123 | 2011-05-01 | 10 |
Once the data is in this format, you can join the Customers table to get the total points for each account, so the code will be similar to the following:
select
c.account, sum(pts) TotalPoints
from customers c
inner join
(
select account,
cast(cast(year as varchar(4))+'-'+replace(month_col, 'M', '')+'-01' as date) full_date,
pts
from points
unpivot
(
pts
for month_col in ([M01], [M02], [M03], [M04], [M05], [M06], [M07], [M08], [M09], [M10], [M11], [M12])
) unpiv
) p
on c.account = p.account
where
(
c.enddate = '9999-12-31'
and full_date >= dateadd(year, -1, getdate())
and full_date <= getdate()
)
or
(
c.enddate <> '9999-12-31'
and dateadd(year, -1, [enddate]) <= full_date
and full_date <= [enddate]
)
group by c.account
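One border case worth checking: if an end date falls on the first day of a month, the dateadd(year, -1, ...) comparison above can pick up a 13th month. Here is a minimal sketch of one way to pin the window to exactly 12 calendar months. It assumes the end date is stored as a date; the table variables and @today are stand-ins for the real tables and getdate(), and the rows are copied from the question.

```sql
-- Sketch only: @customers / @points stand in for the real Customers / Points tables.
declare @today date = getdate();

declare @customers table (account int, enddate date);
insert into @customers values (123, '20130105');

declare @points table (account int, [year] int,
    M01 int, M02 int, M03 int, M04 int, M05 int, M06 int,
    M07 int, M08 int, M09 int, M10 int, M11 int, M12 int);
insert into @points values
    (123, 2012, 0, 0, 0, 0, 10, 0, 0, 10, 10, 10, 10, 20),
    (123, 2013, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);

select c.account, sum(p.pts) as TotalPoints
from @customers c
cross apply (
    -- reference date: today for active customers, otherwise the end date
    select case when c.enddate = '99991231' then @today else c.enddate end as ref
) r
inner join (
    select account,
           cast(cast([year] as varchar(4)) + '-' + replace(month_col, 'M', '') + '-01' as date) as full_date,
           pts
    from @points
    unpivot (
        pts for month_col in ([M01], [M02], [M03], [M04], [M05], [M06],
                              [M07], [M08], [M09], [M10], [M11], [M12])
    ) unpiv
) p
    on p.account = c.account
-- first day of the month 11 months before the reference month
where p.full_date >= dateadd(month, datediff(month, 0, r.ref) - 11, 0)
  and p.full_date <= r.ref
group by c.account;
```

For account 123 this sums Feb 2012 through Jan 2013, which comes to 75 with the sample rows. With a reference date of April 1st, 2013 the window would be May 2012 through April 2013, matching the border rule in the question.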

Lousy data structure. The first thing to do is to unpivot it. Then you get a table with year-month-points as the columns.
From here, you can just select the most recent 12 months. In fact, you don't even have to worry about when a customer left, since presumably they have not collected points since then.
Here is an example in SQL:
with points as (
select 123 as account, 2012 as year,
10 as m01, 0 as m02, 0 as m03, 0 as m04, 10 as m05, 0 as m06,
10 as m07, 0 as m08, 0 as m09, 0 as m10, 0 as m11, 10 as m12
),
points_ym as (
select account, YEAR, mon, cast(right(mon, 2) as int) as monnum, points
from points
unpivot (points for mon in (m01, m02, m03, m04, m05, m06, m07, m08, m09, m10, m11, m12)
) as unpvt
)
select account, SUM(points)
from points_ym
-- strictly greater than, so exactly 12 months are kept, ending with the current month
where year*12+monnum > year(getdate())*12+MONTH(getdate()) - 12
group by account

Related

Linear Interpolation in SQL

I work with crashes and mileage for the same year, which is Year in the table. Crashes are there for every record, but annual mileage is not. NULLs for mileage could be at the beginning or at the end of the time period for a certain customer. A couple of annual mileage records in the middle can be missing as well. I do not know how to overcome this. I tried to do it in a CASE statement, but I do not know how to code it properly. The issue needs to be resolved in SQL, using SQL Server.
This is what the output looks like; I need to have mileage for every single year for each customer.
The info I am pulling from is a proprietary database, and the records themselves should be left untouched. I just need code in the query that will modify my current output so that I have mileage for every year. I appreciate any input!
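| Year | Customer | Crashes | Annual_Mileage |
-----------------------------------------------
| 2009 | 123 | 5 | 3453453 |
| 2010 | 123 | 1 | NULL |
| 2011 | 123 | 0 | 54545 |
| 2012 | 123 | 14 | 376457435 |
| 2013 | 123 | 3 | 63453453 |
| 2014 | 123 | 4 | NULL |
| 2015 | 123 | 15 | 6346747 |
| 2016 | 123 | 0 | NULL |
| 2017 | 123 | 2 | 534534 |
| 2018 | 123 | 7 | NULL |
| 2019 | 123 | 11 | NULL |
| 2020 | 123 | 15 | 565435 |
| 2021 | 123 | 12 | 474567546 |
| 2022 | 123 | 7 | NULL |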
Desired Results
| Year | Customer | Crashes | Annual_Mileage |
-----------------------------------------------
| 2009 | 123 | 5 | 3453453 |
| 2010 | 123 | 1 | 175399 (prior value is taken) |
| 2011 | 123 | 0 | 54545 |
| 2012 | 123 | 14 | 376457435 |
| 2013 | 123 | 3 | 63453453 |
| 2014 | 123 | 4 | 34900100 (avg of 2 adjacent values) |
| 2015 | 123 | 15 | 6346747 |
| 2016 | 123 | 0 | 3440641 (avg of 2 adjacent values) |
| 2017 | 123 | 2 | 534534 |
| 2018 | 123 | 7 | 534534 (prior value is taken) |
| 2019 | 123 | 11 | 549985 (avg of 2 adjacent values) |
| 2020 | 123 | 15 | 565435 |
| 2021 | 123 | 12 | 474567546 |
| 2022 | 123 | 7 | 474567546 (prior value is taken) |
SELECT Year,
Customer,
Crashes,
CASE
WHEN Annual_Mlg IS NOT NULL THEN Annual_Mlg
WHEN Annual_Mlg IS NULL THEN
CASE
WHEN PREV.Annual_Mlg IS NOT NULL
AND NEXT.Annual_Mlg IS NOT NULL
THEN ( PREV.Annual_Mlg + NEXT.Annual_Mlg ) / 2
ELSE 0
END
END AS Annual_Mlg
FROM #table
The above code doesn't work, but I need to start somewhere and that is what I have currently.
I understand what I need to do; I just do not know how to code it in SQL.
After I applied the row_number() function, I got wrong output for the first 2 clients, while for the other 4 clients row_number() gave correct output. I have no idea why that is. I thought it may be because I used a "full join" earlier to combine the mileage and crashes tables?
Your use of #table tells me that you're using MS SQL Server (a temporary table, probably in a stored procedure).
You want to:
select all the rows in #table
joined with the matching row (if any) for the previous year, and
joined with the matching row (if any) for the next year
Then it's easy. Assuming the primary key on your #table is composed of the year and customer columns, something like this ought to do you:
select t.year ,
t.customer ,
t.crashes ,
annual_mileage = coalesce(
t.annual_mileage ,
( coalesce( p.annual_mileage, 0 ) +
coalesce( n.annual_mileage, 0 )
) / 2
)
from #table t -- take all the rows
left join #table p on p.year = t.year - 1 -- with the matching row for
and p.customer = t.customer -- the previous year (if any)
left join #table n on n.year = t.year + 1 -- and the matching row for
and n.customer = t.customer -- the next year (if any)
Notes:
What value you default to if the previous or next year doesn't exist is up to you (zero? some arbitrary value?)
Is the previous/next year guaranteed to be the current year +/- 1?
If not, you may have to use derived tables as the source for the
prev/next data, selecting the closest previous/next year (that sort
of thing rather complicates the query significantly).
Edited To Note:
If you have discontiguous years for each customer such that the "previous" and "next" years for a given customer are not necessarily the current year +/- 1, then something like this is probably the most straightforward way to find the previous/next year.
We use a derived table in our from clause, and assign a sequential number in lieu of year for each customer, using the ranking function row_number(). This query, then
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
would produce results along these lines:
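| row_nbr | customer | year | ... |
----------------------------------
| 1 | 123 | 1992 | ... |
| 2 | 123 | 1993 | ... |
| 3 | 123 | 1995 | ... |
| 4 | 123 | 2020 | ... |
| 1 | 456 | 2001 | ... |
| 2 | 456 | 2005 | ... |
| 3 | 456 | 2020 | ... |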
And that leads us to this:
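select year = t.year ,
customer = t.customer ,
crashes = t.crashes ,
annual_mileage = coalesce(
t.annual_mileage,
(
coalesce(p.annual_mileage,0) +
coalesce(n.annual_mileage,0)
) / 2
)
from (
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
) t
-- p and n must be the same numbered derived table: #table itself has no row_nbr column
left join (
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
) p on p.customer = t.customer and p.row_nbr = t.row_nbr-1
left join (
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
) n on n.customer = t.customer and n.row_nbr = t.row_nbr+1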
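If several years in a row can be NULL, or the NULL sits at the start or end of a customer's range, joining on the adjacent row is not enough, because that row may be NULL too. Below is a rough sketch of an alternative that looks up the nearest earlier and later non-NULL mileage with outer apply. Table and column names follow the question, the sample rows are a subset of the question's data, and bigint is my own choice to keep the sum from overflowing. Note that it takes the midpoint of the two nearest known values and falls back to whichever one exists, so it is not a distance-weighted interpolation and will not match the asker's desired table for every consecutive gap.

```sql
-- Sketch only: a small #table built from a subset of the question's rows.
create table #table ([Year] int, Customer int, Crashes int, Annual_Mileage bigint);
insert into #table values
    (2017, 123, 2, 534534),
    (2018, 123, 7, null),
    (2019, 123, 11, null),
    (2020, 123, 15, 565435),
    (2021, 123, 12, 474567546),
    (2022, 123, 7, null);

select t.[Year], t.Customer, t.Crashes,
       coalesce(t.Annual_Mileage,
                (p.Annual_Mileage + n.Annual_Mileage) / 2,  -- both neighbours known
                p.Annual_Mileage,                           -- only an earlier value exists
                n.Annual_Mileage) as Annual_Mileage         -- only a later value exists
from #table t
outer apply (
    -- nearest earlier year that has a mileage
    select top (1) x.Annual_Mileage
    from #table x
    where x.Customer = t.Customer
      and x.[Year] < t.[Year]
      and x.Annual_Mileage is not null
    order by x.[Year] desc
) p
outer apply (
    -- nearest later year that has a mileage
    select top (1) x.Annual_Mileage
    from #table x
    where x.Customer = t.Customer
      and x.[Year] > t.[Year]
      and x.Annual_Mileage is not null
    order by x.[Year] asc
) n
order by t.Customer, t.[Year];

drop table #table;
```

Because the lookups compare years directly, this does not need row_number() and does not care whether the years are contiguous.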

Combining Two Tables & Summing REV amts by Mth

Below are my two tables of data
Acct BillingDate REV
101 01/05/2018 5
101 01/30/2018 4
102 01/15/2018 2
103 01/4/2018 3
103 02/05/2018 2
106 03/06/2018 5
Acct BillingDate Lease_Rev
101 01/15/2018 2
102 01/16/2018 1
103 01/19/2018 2
104 02/05/2018 3
105 04/02/2018 1
Desired Output
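| Acct | Jan | Feb | Mar | Apr |
--------------------------------
| 101 | 11 | | | |
| 102 | 3 | | | |
| 103 | 5 | 2 | | |
| 104 | | 3 | | |
| 105 | | | | 1 |
| 106 | | | 5 | |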
My SQL Script is Below:
SELECT [NewSalesHistory].[Region]
,[NewSalesHistory].[Account]
,SUM(case when [NewSalesHistory].[billingdate] between '6/1/2016' and '6/30/2016' then REV else 0 end ) + [X].[Jun-16] AS 'Jun-16'
FROM [NewSalesHistory]
FULL join (SELECT [Account]
,SUM(case when [BWLease].[billingdate] between '6/1/2016' and '6/30/2016' then Lease_REV else 0 end ) as 'Jun-16'
FROM [AirgasPricing].[dbo].[BWLease]
GROUP BY [Account]) X ON [NewSalesHistory].[Account] = [X].[Account]
GROUP BY [NewSalesHistory].[Region]
,[NewSalesHistory].[Account]
,[X].[Jun-16]
I am having trouble combining these tables. If an account has both a rev amount and a lease rev amount, the two are summed for that account. If there is no lease rev amount (which is the majority of the time), the query brings back NULLs for all the other rev amounts in Table 1. Table 1 can have duplicate accounts with different Rev values, while Table 2 has one row per account with its Lease rev. The output above is how I would like to see the data.
What am I missing here? Thanks!
I would suggest union all and group by:
select acct,
sum(case when billingdate >= '2016-01-01' and billingdate < '2016-02-01' then rev end) as rev_201601,
sum(case when billingdate >= '2016-02-01' and billingdate < '2016-03-01' then rev end) as rev_201602,
. . .
from ((select nsh.acct, nsh.billingdate, nsh.rev
from NewSalesHistory nsh
) union all
(select bl.acct, bl.billingdate, bl.lease_rev
from AirgasPricing..BWLease bl
)
) x
group by acct;
Okay, so there are a few things going on here:
1) As Gordon Linoff mentioned you can perform a union all on the two tables. Be sure to limit your column selections and name your columns appropriately:
select
x as consistentname1,
y as consistentname2,
z as consistentname3
from [NewSalesHistory]
union all
select
a as consistentname1,
b as consistentname2,
c as consistentname3
from [BWLease]
2) Your desired result contains a pivoted month column. Generate a column with your desired granularity on the result of the union in step one. For example, for months:
concat(datepart(yy, Date_),'-',datename(mm,Date_)) as yyyyM
Then perform aggregation using a group by:
select sum(...) as desiredcolumnname
...
group by PK1, PK2, yyyyM
Finally, PIVOT to obtain your result: https://learn.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-2017
3) If you have other fields/columns that you wish to present then you first need to determine whether they are measures (can be aggregated) or are dimensions. That may be best addressed in a follow up question after you've achieved what you set out for in this part.
Hope it helps
As an aside, it seems like you are preparing data for reporting. Performing these transformations can be facilitated using a GUI such as MS Power Query. As long as your end goal is not data manipulation in the DB itself, you do not need to resort to raw sql.
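To make the union all plus conditional aggregation idea concrete, here is a small self-contained sketch using the sample rows from the question. The table variables @sales and @lease are stand-ins for NewSalesHistory and BWLease, and the column names follow the sample data rather than the asker's script.

```sql
-- Sketch only: table variables stand in for the two real tables.
declare @sales table (acct int, billingdate date, rev int);
insert into @sales values
    (101, '20180105', 5), (101, '20180130', 4), (102, '20180115', 2),
    (103, '20180104', 3), (103, '20180205', 2), (106, '20180306', 5);

declare @lease table (acct int, billingdate date, lease_rev int);
insert into @lease values
    (101, '20180115', 2), (102, '20180116', 1), (103, '20180119', 2),
    (104, '20180205', 3), (105, '20180402', 1);

select acct,
       sum(case when billingdate >= '20180101' and billingdate < '20180201' then rev end) as Jan,
       sum(case when billingdate >= '20180201' and billingdate < '20180301' then rev end) as Feb,
       sum(case when billingdate >= '20180301' and billingdate < '20180401' then rev end) as Mar,
       sum(case when billingdate >= '20180401' and billingdate < '20180501' then rev end) as Apr
from (
    -- stack both sources into one (acct, billingdate, rev) shape before aggregating
    select acct, billingdate, rev from @sales
    union all
    select acct, billingdate, lease_rev from @lease
) x
group by acct
order by acct;
```

Stacking the rows first means an account that appears in only one table still gets a row, which is the part the full join in the question was struggling with. Months with no rows come back as NULL; wrap the sums in coalesce(..., 0) if zeros are preferred.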

SQL Server - Get count for each pharmacy's outbound usage for each month

I am trying to write a query to select the total outbound usage of each pharmacy in my database table for each month.
Here is what I have so far. It outputs the correct data, but I want to reduce the number of rows returned.
select pharmacyid, count(*) as usage, month(datecalled) as month
from outboundcalldata
where datepart(year, datecalled) = 2014
group by pharmacyid, YEAR(DateCalled), month(datecalled)
order by pharmacyid, month
example of output:
pharmacyid|usage| month
-----------------------
2220000006| 10 | 2
2220000006| 11 | 3
2220000006| 900 | 4
2220000006| 30 | 5
2220000007| 34 | 2
2220000007| 300 | 3
2220000007| 145 | 4
Instead I would like it to output 1 row per pharmacy and a column for each month.
;WITH CTE AS
(
select pharmacyid, count(*) as usage, month(datecalled) as [month]
from outboundcalldata
where datepart(year, datecalled) = 2014
group by pharmacyid, YEAR(DateCalled), month(datecalled)
)
SELECT *
FROM CTE C
PIVOT (SUM(usage)
FOR [month]
IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])
)p

Group values in to categories with annual break down

I have a table of licence applications, and I want to display the data by category for each financial year.
For my query, there are 2 key columns.
Firstly, there is a fee column and the values within this column determine the type of licence.
Between 0 and 300 is Minor
between 300 and 600 is Standard
between 600 and 2000 is Major
Secondly, there is a date field which is to be used for the financial year.
I would like the results to look like this.
Category | 2013/14 | 2012/13
Minor | 23 | 21
Standard | 10 | 11
Major | 5 | 3
I have this query below, but I can't get it right for the year part.
Would really appreciate any advice people can give me.
select category.gr as [category],
sum(case when ((year(licence.[start_date]) in ('2010'))
and (month(licence.[start_date]) in (4,5,6,7,8,9,10,11,12)))
or ((year(licence.[start_date]) in ('2011'))
and (month(licence.[start_date]) in (1,2,3))) then 1 else 0 end) AS '10/11 Count',
from ( select case
when [fee_INC] between 0 and 350 then 'Minor'
when [fee_INC] between 350 and 600 then 'Standard'
else 'Major' end as gr
from [L_LICENCE_FIN]) as category,
from [L_LICENCE_FIN] as licence
group by category.gr
SELECT
[category],
[2013/14],
[2012/13]
FROM (
SELECT
[category],
STR(YEAR(DATEADD(month,-3,[start_date])),4)
+'/'
+RIGHT(STR(YEAR(DATEADD(month,-3,[start_date]))+1,4),2)
AS [fiscal_year],
COUNT(*) AS [count]
FROM #L_LICENCE_FIN
INNER JOIN (VALUES
( 0, 300, 'Minor'),
(300, 600, 'Standard'),
(600,2000, 'Major')
) categories([fee_min], [fee_max], [category])
ON ([fee] >= [fee_min] AND [fee] < [fee_max])
GROUP BY [category],[start_date]
) p1
PIVOT(SUM([count]) FOR [fiscal_year] IN ([2013/14],[2012/13])) p2
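The fiscal-year label in the query above relies on shifting each date back three months, so that a financial year starting on 1 April lines up with a calendar year. A tiny sketch of just that expression, run against three made-up dates (the dates are my own, chosen to sit on either side of the April boundary):

```sql
-- Sketch only: the three dates are illustrative, not from the question.
select d.start_date,
       str(year(dateadd(month, -3, d.start_date)), 4)
       + '/'
       + right(str(year(dateadd(month, -3, d.start_date)) + 1, 4), 2) as fiscal_year
from (values
        (cast('20130331' as date)),  -- last day of 2012/13
        (cast('20130401' as date)),  -- first day of 2013/14
        (cast('20140331' as date))   -- last day of 2013/14
     ) as d(start_date);
```

The first date is labelled 2012/13 and the other two 2013/14. A financial year that starts in a different month would need a different shift.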

How to aggregate 7 days in SQL

I was trying to aggregate a 7 days data for FY13 (starts on 10/1/2012 and ends on 9/30/2013) in SQL Server but so far no luck yet. Could someone please take a look. Below is my example data.
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
So, my desired output would be like:
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
Total 11 15
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
Total 13 16
--------through 9/30/2013
Please note, since FY13 starts on 10/1/2012 and ends on 9/30/2013, the first week of FY13 is 6 days instead of 7 days.
I am using SQL server 2008.
You could add a new computed column for the date values to group them by week and sum the other columns, something like this:
SELECT DATEPART(ww, DATEADD(d,-2,[DATE])) AS WEEK_NO,
SUM(Bread) AS Bread_Total, SUM(Milk) as Milk_Total
FROM YOUR_TABLE
GROUP BY DATEPART(ww, DATEADD(d,-2,[DATE]))
Note: I used DATEADD and subtracted 2 days to set the first day of the week to Monday based on your dates. You can modify this if required.
Use option with GROUP BY ROLLUP operator
SELECT CASE WHEN DATE IS NULL THEN 'Total' ELSE CONVERT(nvarchar(10), DATE, 101) END AS DATE,
SUM(BREAD) AS BREAD, SUM(MILK) AS MILK
FROM dbo.test54
GROUP BY ROLLUP(DATE),(DATENAME(week, DATE))
Demo on SQLFiddle
Result:
DATE BREAD MILK
10/01/2012 1 3
10/02/2012 2 4
10/03/2012 2 3
10/04/2012 0 4
10/05/2012 4 0
10/06/2012 2 1
Total 11 15
10/07/2012 1 3
10/08/2012 4 7
10/10/2012 0 4
10/11/2012 4 0
10/12/2012 2 1
10/13/2012 2 1
Total 13 16
You are looking for a rollup. In this case, you will need at least one more column to group by to do your rollup on, the easiest way to do that is to add a computed column that groups them into weeks by date.
Take a look at: Summarizing Data Using ROLLUP
Here is the general idea of how it could be done:
You need a derived column for each row to determine which fiscal week that record belongs to. In general you could subtract that record's date from 10/1, get the number of days that have elapsed, divide by 7, and floor the result.
Then you can GROUP BY that derived column and use the SUM aggregate function.
The biggest wrinkle is that 6 day week you start with. You may have to add some logic to make sure that the weeks start on Sunday or whatever day you use but this should get you started.
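A minimal sketch of that derived-column idea, using the sample rows from the question. To get the short first week, the day count is taken from the Sunday on or before the fiscal year start (9/30/2012) rather than from 10/1 itself. That anchor date and the @sales table variable are my own assumptions for illustration.

```sql
-- Sketch only: @sales stands in for the real table; @week_anchor is an assumed
-- anchor, the Sunday on or before the FY13 start date of 10/1/2012.
declare @week_anchor date = '20120930';

declare @sales table ([DATE] date, BREAD int, MILK int);
insert into @sales values
    ('20121001', 1, 3), ('20121002', 2, 4), ('20121003', 2, 3), ('20121004', 0, 4),
    ('20121005', 4, 0), ('20121006', 2, 1), ('20121007', 1, 3), ('20121008', 2, 4),
    ('20121009', 2, 3), ('20121010', 0, 4), ('20121011', 4, 0), ('20121012', 2, 1),
    ('20121013', 2, 1);

select w.fiscal_week,
       sum(s.BREAD) as BREAD,
       sum(s.MILK) as MILK
from @sales s
cross apply (
    -- integer division floors the day count into 7-day buckets, numbered from 1
    select datediff(day, @week_anchor, s.[DATE]) / 7 + 1 as fiscal_week
) w
group by w.fiscal_week
order by w.fiscal_week;
```

With these rows, week 1 (10/1 to 10/6) totals 11 and 15, and week 2 (10/7 to 10/13) totals 13 and 16, matching the totals in the question. Because the buckets are computed from a fixed anchor, the result does not depend on the server's DATEFIRST setting.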
The WITH ROLLUP suggestions above can help; you'll need to save the data and transform it as you need.
The biggest thing you'll need to be able to do is identify your weeks properly. If you don't have those loaded into tables already so you can identify them, you can build them on the fly. Here's one way to do that:
CREATE TABLE #fy (fyear int, fstart datetime, fend datetime);
CREATE TABLE #fylist(fyyear int, fydate DATETIME, fyweek int);
INSERT INTO #fy
SELECT 2012, '2011-10-01', '2012-09-30'
UNION ALL
SELECT 2013, '2012-10-01', '2013-09-30';
INSERT INTO #fylist
( fyyear, fydate )
SELECT fyear, DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart)) AS fydate
FROM Common.NUMBERS
CROSS APPLY (SELECT * FROM #fy WHERE fyear = 2013) fy
WHERE fy.fend >= DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart));
WITH weekcalc AS
(
SELECT DISTINCT DATEPART(YEAR, fydate) yr, DATEPART(week, fydate) dt
FROM #fylist
),
ridcalc AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY yr, dt) AS rid, yr, dt
FROM weekcalc
)
UPDATE #fylist
SET fyweek = rid
FROM #fylist
JOIN ridcalc
ON DATEPART(YEAR, fydate) = yr
AND DATEPART(week, fydate) = dt;
SELECT list.fyyear, list.fyweek, p.[date], SUM(bread) AS Bread, SUM(Milk) AS Milk
FROM products p
JOIN #fylist list
ON p.[date] = list.fydate
GROUP BY list.fyyear, list.fyweek, p.[date] WITH ROLLUP;
The Common.Numbers reference above is a simple numbers table that I use for this sort of thing (goes from 1 to 1M). You could also build that on the fly as needed.
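If no numbers table is available, one way to build the sequence on the fly is a recursive CTE. This is only a sketch: the variable names are made up, and the dates are the FY13 boundaries from the question.

```sql
-- Sketch only: generates one row per day of the fiscal year without a numbers table.
declare @fstart date = '20121001', @fend date = '20130930';

with numbers as (
    select 1 as Number
    union all
    select Number + 1
    from numbers
    where Number < datediff(day, @fstart, @fend) + 1  -- stop at the number of days in the year
)
select dateadd(day, Number - 1, @fstart) as fydate
from numbers
option (maxrecursion 366);  -- the default limit of 100 is too low for a full year
```

For a one-off list of a few hundred dates this is usually fine; for repeated use, a permanent numbers or calendar table like the one described above is likely the better choice.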