Calculating cumulative sum in ms-sql - sql

I have a table tblsumDemo with the following structure
billingid qty Percent_of_qty cumulative
1 10 5 5
2 5 8 13(5+8)
3 12 6 19(13+6)
4 1 10 29(19+10)
5 2 11 40(29+11)
This is what I have tried:
declare @s int
SELECT billingid, qty, Percent_of_qty,
@s = @s + Percent_of_qty AS cumulative
FROM tblsumDemo
CROSS JOIN (SELECT @s = 0) AS var
ORDER BY billingid
but I'm not able to get the desired output. Any help would be much appreciated. Thanks.

You can use CROSS APPLY:
SELECT
t1.*,
x.cumulative
FROM tblSumDemo t1
CROSS APPLY(
SELECT
cumulative = SUM(t2.Percent_of_Qty)
FROM tblSumDemo t2
WHERE t2.billingid <= t1.billingid
)x
For SQL Server 2012+, you can use SUM OVER():
SELECT *,
cumulative = SUM(Percent_of_Qty) OVER(ORDER BY billingId)
FROM tblSumDemo
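As a quick check of the two approaches above, here is a minimal sketch that runs them against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable); SQLite has no CROSS APPLY, so the first query is written as the equivalent correlated subquery.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblSumDemo (billingid INTEGER, qty INTEGER, Percent_of_qty INTEGER)"
)
conn.executemany(
    "INSERT INTO tblSumDemo VALUES (?, ?, ?)",
    [(1, 10, 5), (2, 5, 8), (3, 12, 6), (4, 1, 10), (5, 2, 11)],
)

# Correlated subquery: for each row, sum every row at or before it.
subquery_rows = conn.execute("""
    SELECT t1.billingid,
           (SELECT SUM(t2.Percent_of_qty)
            FROM tblSumDemo t2
            WHERE t2.billingid <= t1.billingid) AS cumulative
    FROM tblSumDemo t1
    ORDER BY t1.billingid
""").fetchall()

# Window function: running sum ordered by billingid.
window_rows = conn.execute("""
    SELECT billingid,
           SUM(Percent_of_qty) OVER (ORDER BY billingid) AS cumulative
    FROM tblSumDemo
    ORDER BY billingid
""").fetchall()

print(window_rows)  # [(1, 5), (2, 13), (3, 19), (4, 29), (5, 40)]
```

Both queries return the same running totals; the window version reads the table once, while the subquery version rescans it for every row.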

You can use a correlated subquery, which works in all versions:
select billingid, qty, Percent_of_qty,
(select sum(t2.Percent_of_qty) from tblsumdemo t2 where t2.billingid <= t1.billingid) as csum
from
tblsumdemo t1
From SQL Server 2012 onwards you can also use a window function:
select *,
sum(Percent_of_qty) over (order by billingid rows between unbounded preceding and current row) as csum
from tblsumdemo
Here the frame says: for every row, sum all rows from the first row up to the current row (unbounded preceding to current row). If you omit the frame, the default with ORDER BY is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which gives the same result as long as the ordering column has no duplicates.
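The difference between an explicit ROWS frame and the default frame only shows up when the ordering column has ties. The sketch below illustrates this with a small made-up table (the table name demo and its columns k and v are hypothetical), again using SQLite through Python as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (k INTEGER, v INTEGER)")
# Two rows tie on k = 1.
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, 10), (1, 20), (2, 30)])

# Default frame (RANGE ... CURRENT ROW): rows that tie on k are peers
# and are all included in each other's running sum.
range_sums = sorted(r[0] for r in conn.execute(
    "SELECT SUM(v) OVER (ORDER BY k) FROM demo"))

# Explicit ROWS frame: each row only sees the rows placed before it.
rows_sums = sorted(r[0] for r in conn.execute(
    "SELECT SUM(v) OVER (ORDER BY k "
    "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM demo"))

print(range_sums)  # [30, 30, 60]: both k = 1 rows report the same total
print(rows_sums)   # smallest value is 10 or 20, depending on how the tie is ordered
```

With a unique ordering column such as billingid the two frames agree, so the choice only matters for correctness when duplicates are possible.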

Use ROW_NUMBER to number the rows in ascending billingid order, then self-join on that row number.
Query
;with cte as(
select rn = row_number() over(
order by billingid
), *
from tblSumDemo
)
select t1.billingid, t1.qty, t1.Percent_of_qty,
sum(t2.Percent_of_qty) as cumulative
from cte t1
join cte t2
on t1.rn >= t2.rn
group by t1.billingid, t1.qty, t1.Percent_of_qty;

Related

Next Three month Rolling sum of sales

I'm struggling with the script logic. Below is my data set; I want to sum the data based on the next three months.
Your title describes a rolling sum. Your sample data is simply taking a maximum. The following does both.
Hmmm . . . I think row_number() can be a big help here:
with t as (
select t.*, row_number() over (order by yearmonth) as seqnum
from yourtable t
)
select t1.yearmonth, max(t2.yearmonth) as last_mont_max, sum(t2.sales) as rolling_sum,
max(t2.sales) as value
from t t1 cross join
(values (1), (2), (3)) n(n) join
t t2
on t2.seqnum >= t1.seqnum and
t2.seqnum <= t1.seqnum + n.n
group by t1.yearmonth, n.n;
If you just wanted the rolling sum on each row (not what your results show), then it is much simpler:
select t.*,
sum(sales) over (order by yearmonth
rows between current row and 2 following
) as value
from yourtable t;
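To see what a forward-looking frame produces, here is a small sketch with made-up monthly figures (the sample rows are hypothetical, since the original data set is not shown). It runs the window query in SQLite via Python as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (yearmonth TEXT, sales INTEGER)")
# Hypothetical sample data: four consecutive months.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [("201901", 10), ("201902", 20), ("201903", 30), ("201904", 40)],
)

# Each row sums itself plus the next two rows in yearmonth order.
rolling = conn.execute("""
    SELECT yearmonth,
           SUM(sales) OVER (ORDER BY yearmonth
                            ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS value
    FROM yourtable
    ORDER BY yearmonth
""").fetchall()

print(rolling)
# [('201901', 60), ('201902', 90), ('201903', 70), ('201904', 40)]
```

Note that the last two rows have fewer than three months available, so their sums cover a shorter window.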
I think you need to write your query like the following.
;WITH CTE
AS (
SELECT DATEFROMPARTS(CAST(SUBSTRING(T1.YearMonth, 1, 4) AS INT),
CAST(SUBSTRING(T1.YearMonth, 5, 2) AS INT), 1) YearMonthDt
,Sales
,YearMonth
FROM #table T1
)
SELECT T1.YearMonth
,T2.YearMonth
,T2.sales
FROM CTE T1
INNER JOIN CTE T2 ON DATEDIFF(mm, T2.YearMonthDt, T1.YearMonthDt) <= 3
ORDER BY T1.YearMonth

Random records in Oracle table based on conditions

I have an Oracle table with the following columns
Table Structure
In a query I need to return all the records with CPER>=40 which is trivial. However, apart from CPER>=40 I need to list 5 random records for each CPID.
I have attached a sample list of records. However, in my table I have around 50,000 records.
Appreciate if you can help.
Oracle solution:
with CTE as
(
select t1.*,
row_number() over(partition by CPID order by DBMS_RANDOM.VALUE) as rn -- random order assigned within each CPID
from MyTable t1
where CPER < 40
)
select *
from CTE
where rn <= 5 -- pick 5 at random per CPID
union all
select t2.*, null
from MyTable t2
where CPER >= 40
SQL Server:
with CTE as
(
select t1.*,
row_number() over(partition by CPID order by newid()) as rn -- random order assigned within each CPID
from MyTable t1
where CPER < 40
)
select *
from CTE
where rn <= 5 -- pick 5 at random per CPID
union all
select t2.*, null
from MyTable t2
where CPER >= 40
How about something like this...
SELECT *
FROM (SELECT CID,
CVAL,
CPID,
CPER,
Row_number() OVER (partition BY CPID ORDER BY CPID ASC ) AS RN
FROM MyTable) tmp
WHERE CPER>=40 OR RN <= 5
However, this is not random.
Assuming that you want five additional random records, you can do:
select t.*
from (select t.*,
row_number() over (partition by cpid,
(case when cper >= 40 then 1 else 2 end)
order by dbms_random.value
) as seqnum
from t
) t
where seqnum <= 5 or cper >= 40;
The row_number() is enumerating the rows for each cpid in two groups -- based on the cper value. The outer where is taking all cper values in the range you want as well as five from the other group.
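The partition-by-two-groups idea can be tried out with a small sketch. It uses SQLite through Python as a stand-in for Oracle (so random() replaces dbms_random.value), and the sample rows are made up: two cpid groups, each with three rows at cper >= 40 and eight rows below 40.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cid INTEGER, cpid INTEGER, cper INTEGER)")

# Hypothetical data: per cpid, 3 rows with cper >= 40 and 8 rows with cper < 40.
sample = []
cid = 0
for cpid in (1, 2):
    for cper in (40, 55, 90, 1, 2, 3, 4, 5, 6, 7, 8):
        cid += 1
        sample.append((cid, cpid, cper))
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", sample)

# Number rows randomly within each (cpid, high/low cper) group, then keep
# every high-cper row plus the first five of each low-cper group.
picked = conn.execute("""
    SELECT cid, cpid, cper
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY cpid, CASE WHEN cper >= 40 THEN 1 ELSE 2 END
                     ORDER BY random()
                 ) AS seqnum
          FROM t) AS numbered
    WHERE seqnum <= 5 OR cper >= 40
""").fetchall()

print(len(picked))  # 16: 6 rows with cper >= 40 plus 5 random rows per cpid
```

Which low-cper rows are picked changes from run to run, but the counts per group stay fixed.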

Cumulative multiplication with window functions, 'exp' is not a valid windowing function

Is it possible to do a cumulative multiplication (the query below) with window functions?
select Id, Qty
into #temp
from(
select 1 Id, 5 Qty
union
select 2, 6
union
select 3, 3
)dvt
select
t1.Id
,exp(sum(log( t2.Qty))) CumulativeMultiply
from #temp t1
inner join #temp t2
on t2.Id <= t1.Id
group
by t1.Id
order
by t1.Id
Like:
select
t1.Id
,exp(sum(log( t2.Qty))) over (partition by t1.Id order by t1.Id rows between unbounded preceding and current row ) CumulativeMultiply
from #temp t1
inner join #temp t2
on t2.Id <= t1.Id
But I get the error:
The function 'exp' is not a valid windowing function, and cannot be used with the OVER clause
Update:
Result that actually I want:
Id CumulativeMultiply
----------- ----------------------
1 5
2 30
3 90
There is no need for a self join: SUM() OVER (ORDER BY ...) already accumulates the previous rows, so you only have to apply exp() to the windowed sum of the logs.
select
Id
,exp(sum(log( Qty))
over (order by Id )) CumulativeMultiply from #temp
Only aggregate, ranking, and analytic functions can be used with the OVER clause; exp is a scalar function, so it has to be applied to the result of the windowed sum.
I didn't test the code, but you need to separate the two steps, like this:
SELECT Id, exp(cm) CumulativeMultiply
FROM (
select
Id
,sum(log(Qty)) over (order by Id rows between unbounded preceding and current row ) cm
from #temp
) d
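The arithmetic behind the exp(sum(log(...))) trick can be checked outside the database. The sketch below is plain Python, not T-SQL, and uses the three Qty values from the question; it compares the log-based running product with an exact one.

```python
import math
from itertools import accumulate
from operator import mul

qty = [5, 6, 3]  # Qty values from the question, ordered by Id

# Running sum of logarithms, then exp() of each partial sum:
# exp(log a + log b) == a * b, up to floating-point error.
running_log_sums = accumulate(math.log(q) for q in qty)
via_logs = [math.exp(s) for s in running_log_sums]

# Exact running product for comparison.
exact = list(accumulate(qty, mul))

print(via_logs)  # floats close to 5, 30, 90
print(exact)     # [5, 30, 90]
```

Two caveats follow from the math: the result is a floating-point value that may need rounding, and log is undefined for zero or negative quantities, so those rows need separate handling.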

SQL stored procedure to add up values and stop once the maximum has been reached

I would like to write a SQL query (SQL Server) that will return rows (in a given order) but only up to a given total. My client has paid me a given amount, and I want to return only those rows that are <= to that amount.
For example, if the client paid me $370, and the data in the table is
id amount
1 100
2 122
3 134
4 23
5 200
then I would like to return only rows 1, 2 and 3
This needs to be efficient, since there will be thousands of rows, so a for loop would not be ideal, I guess. Or is SQL Server efficient enough to optimise a stored proc with for loops?
Thanks in advance. Jim.
A couple of options are.
1) Triangular Join
SELECT *
FROM YourTable Y1
WHERE (SELECT SUM(amount)
FROM YourTable Y2
WHERE Y1.id >= Y2.id ) <= 370
2) Recursive CTE
WITH RecursiveCTE
AS (
SELECT TOP 1 id, amount, CAST(amount AS BIGINT) AS Total
FROM YourTable
ORDER BY id
UNION ALL
SELECT R.id, R.amount, R.Total
FROM (
SELECT T.*,
T.amount + Total AS Total,
rn = ROW_NUMBER() OVER (ORDER BY T.id)
FROM YourTable T
JOIN RecursiveCTE R
ON R.id < T.id
) R
WHERE R.rn = 1 AND Total <= 370
)
SELECT id, amount, Total
FROM RecursiveCTE
OPTION (MAXRECURSION 0);
The 2nd one will likely perform better.
In SQL Server 2012 you will be able to do something like
;WITH CTE AS
(
SELECT id,
amount,
SUM(amount) OVER(ORDER BY id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS RunningTotal
FROM YourTable
)
SELECT *
FROM CTE
WHERE RunningTotal <=370
Though there will probably be a more efficient way (to stop the scan as soon as the total is reached)
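The "stop as soon as the total is reached" idea is easy to express procedurally. The following is a minimal client-side sketch in Python, not a SQL Server feature; the function name rows_within_budget is hypothetical, and it assumes the rows arrive ordered by id and that amounts are never negative.

```python
def rows_within_budget(rows, budget):
    """Yield (id, amount, running_total) while the running total fits the budget.

    Assumes rows are already ordered and amounts are non-negative, so once the
    total exceeds the budget no later row can qualify.
    """
    total = 0
    for row_id, amount in rows:
        total += amount
        if total > budget:
            break  # stop reading; the remaining rows are never touched
        yield row_id, amount, total


# Sample data from the question.
payments = [(1, 100), (2, 122), (3, 134), (4, 23), (5, 200)]
result = list(rows_within_budget(payments, 370))
print(result)  # [(1, 100, 100), (2, 122, 222), (3, 134, 356)]
```

Because the loop breaks at the first row that overshoots, it reads only one row more than it returns, which is the behavior the set-based queries cannot guarantee.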
Straightforward approach:
SELECT a.id, a.amount
FROM table1 a
INNER JOIN table1 b ON (b.id <=a.id)
GROUP BY a.id, a.amount
HAVING SUM(b.amount) <= 370
Unfortunately, it has O(N^2) performance.
Something like this:
select id from
(
select t1.id, t1.amount, sum( t2.amount ) s
from tst t1, tst t2
where t2.id <= t1.id
group by t1.id, t1.amount
) totals
where s <= 370

Compute median of column in SQL common table expression

In MSSQL2008, I am trying to compute the median of a column of numbers from a common table expression using the classic median query as follows:
WITH cte AS
(
SELECT number
FROM table
)
SELECT cte.*,
(SELECT
(SELECT (
(SELECT TOP 1 cte.number
FROM
(SELECT TOP 50 PERCENT cte.number
FROM cte
ORDER BY cte.number) AS medianSubquery1
ORDER BY cte.number DESC)
+
(SELECT TOP 1 cte.number
FROM
(SELECT TOP 50 PERCENT cte.number
FROM cte
ORDER BY cte.number DESC) AS medianSubquery2
ORDER BY cte.number ASC) ) / 2)) AS median
FROM cte
ORDER BY cte.number
The result set that I get is the following:
NUMBER MEDIAN
x1 x1
x1 x1
x1 x1
x2 x2
x3 x3
In other words, the "median" column is the same as the "number" column when I would expect the median column to be "x1" all the way down. I use a similar expression to compute the mode and it works fine over the same common table expression.
Here's a slightly different way to do it:
WITH cte AS
(
SELECT number
FROM table1
)
SELECT T1.number, T3.median
FROM cte T1,
(
SELECT AVG(number) AS median
FROM
(
SELECT number, ROW_NUMBER() OVER(ORDER BY number) AS rn
FROM cte
) T2
WHERE T2.rn = ((SELECT COUNT(*) FROM table1) + 1) / 2
OR T2.rn = ((SELECT COUNT(*) FROM table1) + 2) / 2
) T3
The problem with your query is that you are doing
SELECT TOP 1 cte.number FROM...
Here cte.number is not correlated with the subquery; it refers to the outer query, so the subquery is irrelevant. That explains why you simply end up with the same value as the number column on every row. Removing the cte. prefix (as below) gives the median of the CTE, which is a constant value. What are you trying to do?
WITH cte AS
( SELECT NUMBER
FROM master.dbo.spt_values
WHERE TYPE='p'
)
SELECT cte.*,
(SELECT
(SELECT (
(SELECT TOP 1 number
FROM
(SELECT TOP 50 PERCENT cte.number
FROM cte
ORDER BY cte.number) AS medianSubquery1
ORDER BY number DESC)
+
(SELECT TOP 1 number
FROM
(SELECT TOP 50 PERCENT cte.number
FROM cte
ORDER BY cte.number DESC) AS medianSubquery2
ORDER BY number ASC) ) / 2)) AS median
FROM cte
ORDER BY cte.number
Returns
NUMBER median
----------- -----------
0 1023
1 1023
2 1023
3 1023
4 1023
5 1023
6 1023
7 1023
This is not an entirely new answer, as it mostly expands on Mark Byers's answer, but there are a couple of options for simplifying the query even further.
The first thing is to really make use of CTEs. Not only can you have multiple CTEs, but they can refer to each other. With this in mind, we can create an additional CTE to compute the median based on the results of the first. This encapsulates the median computation and leaves the actual SELECT to do only what it needs to do. Note that the ROW_NUMBER() had to be moved into the first CTE.
;WITH cte AS
(
SELECT number, ROW_NUMBER() OVER(ORDER BY number) AS rn
FROM table1
),
med AS
(
SELECT AVG(number) AS median
FROM cte
WHERE cte.rn = ((SELECT COUNT(*) FROM cte) + 1) / 2
OR cte.rn = ((SELECT COUNT(*) FROM cte) + 2) / 2
)
SELECT cte.number, med.median
FROM cte
CROSS JOIN med
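The (count + 1) / 2 and (count + 2) / 2 row numbers rely on integer division: for an odd count both expressions pick the middle row, and for an even count they pick the two middle rows. The sketch below checks this by running the two-CTE query in SQLite via Python as a stand-in for SQL Server; the helper name median_rows and the sample numbers are hypothetical.

```python
import sqlite3


def median_rows(numbers):
    """Return (number, median) rows using the ROW_NUMBER-based median query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (number INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(n,) for n in numbers])
    return conn.execute("""
        WITH cte AS (
            SELECT number, ROW_NUMBER() OVER (ORDER BY number) AS rn
            FROM table1
        ),
        med AS (
            SELECT AVG(number) AS median
            FROM cte
            WHERE rn = ((SELECT COUNT(*) FROM cte) + 1) / 2
               OR rn = ((SELECT COUNT(*) FROM cte) + 2) / 2
        )
        SELECT cte.number, med.median
        FROM cte
        CROSS JOIN med
        ORDER BY cte.number
    """).fetchall()


print(median_rows([1, 3, 5]))     # [(1, 3.0), (3, 3.0), (5, 3.0)]
print(median_rows([7, 1, 5, 3]))  # [(1, 4.0), (3, 4.0), (5, 4.0), (7, 4.0)]
```

One difference to keep in mind: SQLite's AVG always returns a floating-point value, whereas SQL Server's AVG over an int column returns an int, so in SQL Server the column may need a cast (for example to a decimal type) to avoid a truncated median for even counts.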
And to further reduce complexity, you "could" use a custom CLR Aggregate to handle the Median (such as the one provided in the free SQL# library at http://www.SQLsharp.com/ [which I am the author of]).
;WITH cte AS
(
SELECT number
FROM table1
),
med AS
(
SELECT SQL#.Agg_Median(cte.number) AS median
FROM cte
)
SELECT cte.number, med.median
FROM cte
CROSS JOIN med