Running Total Minus - SQL

I am trying to calculate a minus running total in SQL with this code, but it is not giving me the expected result. Starting from the date 01/2021, I would like to subtract each month's sales.
select Name, Date, Sales, MinusRunningTotal = B.Sales - SUM(A.sales)
OVER (PARTITION BY Name ORDER BY Date)
From TableA A
Join TableB B on A.ID = B.ID
Where Date > '01/2021'
This is how the data is displayed
Name Date Sales
A 01/2021 10
A 02/2021 1
A 03/2021 2
A 04/2021 3
This is what I want to achieve
Name Date Sales MinusRunningTotal
A 01/2021 10 10
A 02/2021 1 9
A 03/2021 2 7
A 04/2021 3 4

If that data already exists in a table with name, date, and sales columns, try:
SELECT [name],
[date],
[sales],
FIRST_VALUE([sales]) OVER (PARTITION BY [name]
ORDER BY [date]) * 2
- SUM([sales]) OVER (PARTITION BY [name]
ORDER BY [date]
ROWS UNBOUNDED PRECEDING) AS minus_running_total
FROM my_table
sql fiddle
This computes the sum of all preceding sales for the current name (including the current row's value):
`SUM([sales]) OVER (PARTITION BY [name]
ORDER BY [date]
ROWS UNBOUNDED PRECEDING)`
This computes the first chronological value for the current name, times 2:
`FIRST_VALUE([sales]) OVER (PARTITION BY [name]
ORDER BY [date]) * 2`
So the first row computes as (10 x 2) - 10 = 10,
the second row as (10 x 2) - (10 + 1) = 9,
the third row as (10 x 2) - (10 + 1 + 2) = 7,
and so on.
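The expression above can be checked end to end; here is a minimal sketch using an in-memory SQLite database (the table and column names follow the answer, and the dates are written ISO-style so text ordering is chronological):

```python
import sqlite3

# Build the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [("A", "2021-01", 10), ("A", "2021-02", 1),
     ("A", "2021-03", 2), ("A", "2021-04", 3)],
)

# FIRST_VALUE * 2 minus the running SUM, exactly as in the answer.
rows = conn.execute("""
    SELECT name, date, sales,
           FIRST_VALUE(sales) OVER (PARTITION BY name ORDER BY date) * 2
           - SUM(sales) OVER (PARTITION BY name ORDER BY date
                              ROWS UNBOUNDED PRECEDING) AS minus_running_total
    FROM my_table
    ORDER BY name, date
""").fetchall()

for row in rows:
    print(row)
# minus_running_total comes out as 10, 9, 7, 4
```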

select Name, Date, Sales,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) as RN
into #temp2
From table

select Name, Date, Sales,
case when RN != 1
then (select Sales from #temp2 t2
      where t2.Name = t.Name and t2.RN = 1)
     - sum(case when RN > 1 then Sales else 0 end)
       OVER (PARTITION BY Name ORDER BY Date
             rows between UNBOUNDED PRECEDING and current row)
else Sales end as MinusRunningTotal
From #temp2 t

I made a table sss that contains the data you show in your second code section. This query will give the results you want. If you need to join in other data, do it in the CTE called xs below.
with
xs as (
select name, dt, sales, sales tot,
ROW_NUMBER() over (partition by name order by dt) n
from sss
),
rec as (
select * from xs where n = 1
union all
select xs.name, xs.dt, xs.sales, rec.tot - xs.sales, xs.n
from rec
join xs on rec.name = xs.name
and xs.n = rec.n + 1
)
select *
from rec
How it works:
In the CTE (common table expression) xs, we number the rows associated with a given name in ascending dt order.
CTE rec is a recursive query that begins by fetching the first row of the group associated with a name via filtering with "n = 1". That becomes the first row of the output. The second part of rec fetches succeeding rows where n of the new row equals n + 1 of the previous row. The desired running total, kept in column tot, is obtained by subtracting the new row's sales from the previous row's tot.
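The recursion described above runs as-is in SQLite too (which requires the RECURSIVE keyword); a sketch against an in-memory database with the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sss (name TEXT, dt TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sss VALUES (?, ?, ?)",
    [("A", "2021-01", 10), ("A", "2021-02", 1),
     ("A", "2021-03", 2), ("A", "2021-04", 3)],
)

rows = conn.execute("""
    WITH RECURSIVE
    xs AS (
        -- number the rows per name in ascending date order
        SELECT name, dt, sales, sales AS tot,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY dt) AS n
        FROM sss
    ),
    rec AS (
        -- anchor: the first row of each group
        SELECT * FROM xs WHERE n = 1
        UNION ALL
        -- step: subtract the new row's sales from the previous tot
        SELECT xs.name, xs.dt, xs.sales, rec.tot - xs.sales, xs.n
        FROM rec
        JOIN xs ON rec.name = xs.name AND xs.n = rec.n + 1
    )
    SELECT name, dt, sales, tot FROM rec ORDER BY name, dt
""").fetchall()
# tot per row: 10, 9, 7, 4
```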

Related

SUM a specific column in next rows until a condition is true

Here is a table of articles, and I want to store the sum of the Mass column from the following rows in the sumNext column, based on a condition.
If the next row has the same floor (in the floorNo column) as the current row, add the mass of the following rows until the floor changes.
E.g.: row three has sumNext = 2. That is computed by adding the mass from row four and row five, because both rows have the same floor number as row three.
id       mass  symbol  floorNo  sumNext
2891176  1     D       1        0
2891177  1     L       8        0
2891178  1     L       1        2
2891179  1     L       1        1
2891180  1             1        0
2891181  1             5        2
2891182  1             5        1
2891183  1             5        0
Here is the query, that is generating this table, I just want to add sumNext column with the right value inside.
WITH items AS (SELECT
SP.id,
SP.mass,
SP.symbol,
SP.floorNo
FROM articles SP
ORDER BY
DECODE(SP.symbol,
'P',1,
'D',2,
'L',3,
4 ) asc)
SELECT CLS.*
FROM items CLS;
You could use the solution below, which uses the
common table expression (CTE) technique to put all consecutive rows with the same FLOORNO value in the same group (new grp column).
It then uses the analytic version of the SUM function to sum all following MASS values per grp column, as required.
with Items_RowsNumbered (id, mass, symbol, floorNo, rnb) as (
select ID, MASS, SYMBOL, FLOORNO
, row_number()over(
order by DECODE(symbol, 'P',1, 'D',2, 'L',3, 4 ) asc, ID )
/*
You need to add the ID column (or any other columns that can identify each row uniquely)
in the "order by" clause to make the result deterministic
*/
from (Your source query)Items
)
, cte(id, mass, symbol, floorNo, rnb, grp) as (
select id, mass, symbol, floorNo, rnb, 1 grp
from Items_RowsNumbered
where rnb = 1
union all
select t.id, t.mass, t.symbol, t.floorNo, t.rnb
, case when t.floorNo = c.floorNo then c.grp else c.grp + 1 end grp
from Items_RowsNumbered t
join cte c on (c.rnb + 1 = t.rnb)
)
select
ID, MASS, SYMBOL, FLOORNO
/*, RNB, GRP*/
, nvl(
sum(MASS)over(
partition by grp
order by rnb
ROWS BETWEEN 1 FOLLOWING and UNBOUNDED FOLLOWING)
, 0
) sumNext
from cte
;
demo on db<>fiddle
This is a typical gaps-and-islands problem. You can use ROW_NUMBER() differences in order to determine the exact partitions, and then the SUM() analytic function, such as:
WITH ii AS
(
SELECT i.*,
ROW_NUMBER() OVER (ORDER BY id DESC) AS rn2,
ROW_NUMBER() OVER (PARTITION BY floorNo ORDER BY id DESC) AS rn1
FROM items i
)
SELECT id,mass,symbol, floorNo,
SUM(mass) OVER (PARTITION BY rn2-rn1 ORDER BY id DESC) - mass AS sumNext
FROM ii
ORDER BY id
Demo
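The row-number-difference trick can be sketched in an in-memory SQLite database; this variant partitions by both floorNo and the difference (a safe choice, since the raw difference alone can coincide across different floors) and sums the frame of following rows directly:

```python
import sqlite3

# Simplified sample: ids 1..8, mass all 1, floors as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, mass INTEGER, floorNo INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 8), (3, 1, 1), (4, 1, 1),
     (5, 1, 1), (6, 1, 5), (7, 1, 5), (8, 1, 5)],
)

rows = conn.execute("""
    WITH ii AS (
        SELECT i.*,
               ROW_NUMBER() OVER (ORDER BY id) AS rn2,
               ROW_NUMBER() OVER (PARTITION BY floorNo ORDER BY id) AS rn1
        FROM items i
    )
    SELECT id, floorNo,
           -- sum the masses of the following rows within the island;
           -- an empty frame yields NULL, hence the COALESCE
           COALESCE(SUM(mass) OVER (PARTITION BY floorNo, rn2 - rn1
                                    ORDER BY id
                                    ROWS BETWEEN 1 FOLLOWING
                                             AND UNBOUNDED FOLLOWING), 0) AS sumNext
    FROM ii
    ORDER BY id
""").fetchall()
# sumNext per row: 0, 0, 2, 1, 0, 2, 1, 0
```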

Per year one maximum date row according to previous row date

I have a table with two columns, and I want to fetch six years of data with these rules:
The first row should be the maximum date that is on or before an input date (I will pass an input date).
From the second row to the sixth row, each row should be the maximum date that is earlier than the previously selected row's date, and there should not be two rows for the same year; I need only the latest one relative to the previous row, but not in the same year.
create table #tbl (id int identity, marketdate date)
insert into #tbl (marketdate)
values('2018-05-31'),
('2017-06-01'),
('2017-05-28'),
('2017-04-28'),
('2016-05-26'),
('2015-04-18'),
('2015-04-20'),
('2015-03-18'),
('2014-05-31'),
('2014-04-18'),
('2013-04-15')
output:
id marketdate
1 2018.05.31
3 2017.05.28
5 2016.05.26
7 2015.04.20
10 2014.04.18
11 2013.04.15
Can't you do this with a simple order by/desc?
SELECT TOP 6 id, max(marketdate) FROM tbl
WHERE tbl.marketdate <= #date
GROUP BY YEAR(marketdate), id, marketdate
ORDER BY YEAR(marketdate) DESC
Based purely on your "Output" given your sample data, I believe the following is what you are after (The max date for each distinct year of data):
SELECT TOP 6
max(marketdate),
Year(marketDate) as marketyear
FROM #tbl
WHERE #tbl.marketdate <= getdate()
GROUP BY YEAR(marketdate)
ORDER BY YEAR(marketdate) DESC;
SQLFiddle of this matching your output
You can use ROW_NUMBER() if you are using SQL Server:
select top 6
id
, t.marketdate
from ( select rn = row_number() over (partition by year(marketdate) order by marketdate desc)
, id
, marketdate
from #tbl) as t
where t.rn = 1
order by t.marketdate desc
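A sketch of this per-year ROW_NUMBER() approach, run against an in-memory SQLite database (strftime('%Y', ...) stands in for T-SQL's YEAR()). Note that picking the plain per-year maximum keeps 2017-06-01, which differs from the asker's "relative to the previous row" rule:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, marketdate TEXT)")
conn.executemany(
    "INSERT INTO tbl (marketdate) VALUES (?)",
    [("2018-05-31",), ("2017-06-01",), ("2017-05-28",), ("2017-04-28",),
     ("2016-05-26",), ("2015-04-18",), ("2015-04-20",), ("2015-03-18",),
     ("2014-05-31",), ("2014-04-18",), ("2013-04-15",)],
)

# Latest date per calendar year, newest six years first.
rows = conn.execute("""
    SELECT id, marketdate
    FROM (SELECT id, marketdate,
                 ROW_NUMBER() OVER (PARTITION BY strftime('%Y', marketdate)
                                    ORDER BY marketdate DESC) AS rn
          FROM tbl)
    WHERE rn = 1
    ORDER BY marketdate DESC
    LIMIT 6
""").fetchall()
```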
The following recursively searches for the next date, which must be at least one year earlier than the previous date.
Your parameterised start position goes where I chose 2018-06-01.
WITH
recursiveSearch AS
(
SELECT
id,
marketDate
FROM
(
SELECT
yourTable.id,
yourTable.marketDate,
ROW_NUMBER() OVER (ORDER BY yourTable.marketDate DESC) AS relative_position
FROM
yourTable
WHERE
yourTable.marketDate <= '2018-06-01'
)
search
WHERE
relative_position = 1
UNION ALL
SELECT
id,
marketDate
FROM
(
SELECT
yourTable.id,
yourTable.marketDate,
ROW_NUMBER() OVER (ORDER BY yourTable.marketDate DESC) AS relative_position
FROM
yourTable
INNER JOIN
recursiveSearch
ON yourTable.marketDate < DATEADD(YEAR, -1, recursiveSearch.marketDate)
)
search
WHERE
relative_position = 1
)
SELECT
*
FROM
recursiveSearch
WHERE
id IS NOT NULL
ORDER BY
recursiveSearch.marketDate DESC
OPTION
(MAXRECURSION 0)
http://sqlfiddle.com/#!18/56246/13
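The selection rule of the recursive query above can also be sketched procedurally; a minimal Python version, where a plain year decrement stands in for DATEADD(YEAR, -1):

```python
from datetime import date

# (id, marketdate) pairs matching the question's insert order.
rows = [
    (1, date(2018, 5, 31)), (2, date(2017, 6, 1)), (3, date(2017, 5, 28)),
    (4, date(2017, 4, 28)), (5, date(2016, 5, 26)), (6, date(2015, 4, 18)),
    (7, date(2015, 4, 20)), (8, date(2015, 3, 18)), (9, date(2014, 5, 31)),
    (10, date(2014, 4, 18)), (11, date(2013, 4, 15)),
]

def pick_rows(rows, start, k=6):
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    picks = []
    # First pick: the latest date on or before the input date.
    current = next((r for r in ordered if r[1] <= start), None)
    while current is not None and len(picks) < k:
        picks.append(current)
        # Next pick: the latest date more than one year before the previous pick.
        cutoff = current[1].replace(year=current[1].year - 1)
        current = next((r for r in ordered if r[1] < cutoff), None)
    return picks

result = pick_rows(rows, date(2018, 6, 1))
# picks ids 1, 3, 5, 7, 10, 11 with this data
```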

Value for a column is the sum of the next 4 values - SQL

ITEM LOCATION QTY WEEK
A X 30 1
A X 35 2
A X 40 3
A X 0 4
A X 10 5
A X 19 6
I need to create a new column with a computation like:
ITEM LOCATION QTY WEEK NEW_COLUMN
A X 30 1 AVG(WEEK2(qty)+WEEK3(qty)+WEEK4(qty)+WEEK5(qty))
A X 35 2 AVG(WEEK3(qty)+WEEK4(qty)+WEEK5(qty)+WEEK6(qty))
similarly for all the rows....
The average of 4 weeks is fixed; it won't change.
The first week will have the average of next 4 weeks i.e., 2,3,4 and 5 avg(35+40+0+10)
The 2nd week will have the average of next 4 weeks i.e., 3,4,5 and 6
avg(40+0+10+19).
I tried to bucket them based on the week number, say
Week 1-4 as 1
Week 5-8 as 2
and tried to do the process, but I am getting the same average for each bucket, i.e. the same value for line items 1, 2, 3, 4.
Joining to the same table with a clause restricting the Weeks to be within your range should work. You'll have to decide what the right answer is for the last weeks (which won't have 4 weeks afterwards) and either COALESCE the right answer or INNER JOIN them out.
SELECT T.Item, T.Location, T.Week, AVG(N.Qty) as New_Column
FROM Table T
LEFT OUTER JOIN Table N ON
T.Item = N.Item
AND T.Location = N.Location
AND N.Week BETWEEN (T.Week + 1) AND (T.Week + 4)
GROUP BY T.Item, T.Location, T.Week
Some of the other answers work fine, but with 2012 it should be really easy:
SELECT *,New_Column = (SUM(Qty) OVER(ORDER BY Week ROWS BETWEEN 1 FOLLOWING AND 4 FOLLOWING)*1.0)/4
FROM Table1
Demo: SQL Fiddle
If it's by item and location then just add PARTITION BY:
SELECT *,New_Column = (SUM(Qty) OVER(PARTITION BY Item, Location ORDER BY Week ROWS BETWEEN 1 FOLLOWING AND 4 FOLLOWING)*1.0)/4
FROM Table1
To filter out records that don't have 4 subsequent records, you could use LEAD() for filtering:
;with cte AS ( SELECT *,New_Column = (SUM(Qty) OVER(PARTITION BY Item, Location ORDER BY Week ROWS BETWEEN 1 FOLLOWING AND 4 FOLLOWING)*1.0)/4
,Lead4Col = LEAD(week,4) OVER(PARTITION BY Item,Location ORDER BY Week)
FROM Table1
)
SELECT *
FROM cte
WHERE Lead4Col IS NOT NULL
You could also use COUNT(Qty) OVER(PARTITION BY Item, Location ORDER BY Week ROWS BETWEEN 1 FOLLOWING AND 4 FOLLOWING) instead of LEAD() to restrict your filtering to rows where 4 subsequent weeks exist.
Edit: I think you actually want to exclude this week from the calculation, so adjusted slightly.
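As a sanity check on the frame, here is the "next 4 weeks" window run in an in-memory SQLite database with the sample data (AVG over ROWS BETWEEN 1 FOLLOWING AND 4 FOLLOWING):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (item TEXT, location TEXT, qty INTEGER, week INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [("A", "X", 30, 1), ("A", "X", 35, 2), ("A", "X", 40, 3),
     ("A", "X", 0, 4), ("A", "X", 10, 5), ("A", "X", 19, 6)],
)

# Average qty of the 4 rows after the current one, per item/location.
rows = conn.execute("""
    SELECT item, location, qty, week,
           AVG(qty) OVER (PARTITION BY item, location
                          ORDER BY week
                          ROWS BETWEEN 1 FOLLOWING AND 4 FOLLOWING) AS new_column
    FROM table1
    ORDER BY week
""").fetchall()
# week 1: avg(35, 40, 0, 10) = 21.25; week 2: avg(40, 0, 10, 19) = 17.25
```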
You can self-join to the same table 4 times:
select t0.item, t0.location, t0.qty, t0.week,
(t1.qty + t2.qty + t3.qty + t4.qty) / 4.0
from [table] t0
left join [table] t1 on t0.item = t1.item and t0.location = t1.location
and t1.week = t0.week + 1
left join [table] t2 on t0.item = t2.item and t0.location = t2.location
and t2.week = t0.week + 2
left join [table] t3 on t0.item = t3.item and t0.location = t3.location
and t3.week = t0.week + 3
left join [table] t4 on t0.item = t4.item and t0.location = t4.location
and t4.week = t0.week + 4
You can simplify those joins if you have a better key available for the table.
Try this query:
SELECT
T1.ITEM,
T1.LOCATION,
T1.WEEK,
MAX(T1.QTY) AS QTY,
AVG(T2.QTY) AS NEW_COLUMN
FROM TBL T1 LEFT JOIN TBL T2
ON
T1.ITEM=T2.ITEM AND T1.LOCATION=T2.LOCATION
AND T2.WEEK > T1.WEEK AND T2.WEEK < T1.WEEK+5
GROUP BY T1.ITEM, T1.LOCATION, T1.WEEK
Almost the same as earlier, but instead of SUM()/4 it is better to use AVG.
Also, I multiply by 1.0 to make a decimal value from qty, because if it stays an integer you'll lose the fractional part after the AVG operation.
SELECT *,
new_column = ( Avg(qty * 1.0)
over(
PARTITION BY item, location
ORDER BY week ROWS BETWEEN 1 following AND 4 following
)
)
FROM table1
with x as
(select *, lead(qty) over(partition by item order by week) as next_1 from tablename)
, y as
(select *, lead(next_1) over(partition by item order by week) as next_2 from x)
, z as
(select *, lead(next_2) over(partition by item order by week) as next_3 from y)
, w as
(select *, lead(next_3) over(partition by item order by week) as next_4 from z)
select item, location, qty, week, (next_1+next_2+next_3+next_4)/4.0 as new_column from w
This uses chained CTEs. The lead function selects the next row's qty value, and each CTE shifts the previous CTE's new column down by one more row, so by the fourth CTE the current row carries all of the next 4 weeks' values. Then you just take the average.
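The same result can be had without chaining CTEs by passing an explicit offset to LEAD; a sketch in SQLite (table name as in the answer, using a named WINDOW clause for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (item TEXT, location TEXT, qty INTEGER, week INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [("A", "X", 30, 1), ("A", "X", 35, 2), ("A", "X", 40, 3),
     ("A", "X", 0, 4), ("A", "X", 10, 5), ("A", "X", 19, 6)],
)

# LEAD(qty, k) pulls the qty from k rows ahead; any missing week makes
# the whole sum NULL, which marks rows without 4 following weeks.
rows = conn.execute("""
    SELECT item, location, qty, week,
           (LEAD(qty, 1) OVER w + LEAD(qty, 2) OVER w
            + LEAD(qty, 3) OVER w + LEAD(qty, 4) OVER w) / 4.0 AS new_column
    FROM tablename
    WINDOW w AS (PARTITION BY item ORDER BY week)
    ORDER BY week
""").fetchall()
# week 1 -> 21.25; week 3 onwards -> NULL (fewer than 4 following weeks)
```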

SQL - pull unique name with the latest date and lowest value

How do I get the unique name with the latest date and lowest value?
Name date value
brad 1/2/10 1.1
brad 1/2/10 2.3
bob 1/6/10 1.0
brad 2/4/09 13.2
this query does not seem to work
SELECT distinct
A.[ViralLoadMemberID]
,B.LastName
,B.FirstName
,A.[Date]
,A.[Value]
FROM [t].[dbo].[tblViralLoad] A
left join [dbo].[tblEnrollees] B on A.ViralLoadMemberID = B.MemberID
where
A.Date =
(
select MAX(Date)
from dbo.tblViralLoad
where ViralLoadMemberID = A.ViralLoadMemberID
and
( Date >= '07/01/2014'
and Date <= '12/3/2014' ) )
The idea is to use order by and fetch only one row. If you want the lowest value on the latest date, the standard SQL would be:
select t.*
from table t
order by date desc, value asc
fetch first 1 row only;
For older versions of SQL Server, you would omit the last line and do select top 1 * . . .. For MySQL, the last line would be limit 1.
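A quick sketch of the ORDER BY / LIMIT idea against the sample rows, using an in-memory SQLite database (dates stored in ISO format so that text ordering is chronological):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, date TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("brad", "2010-01-02", 1.1), ("brad", "2010-01-02", 2.3),
     ("bob", "2010-01-06", 1.0), ("brad", "2009-02-04", 13.2)],
)

# Latest date first, lowest value as tie-breaker; keep only one row.
row = conn.execute(
    "SELECT name, date, value FROM t ORDER BY date DESC, value ASC LIMIT 1"
).fetchone()
# -> ('bob', '2010-01-06', 1.0)
```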
Fun with rank()
declare #t as table (name varchar(50),dte date,val decimal(18,10));
insert into #t(name,dte,val) values
('Dave','1/1/2015',1.0),
('Dave','1/3/2015',1.2),
('Dave','1/4/2015',1.5),
('Dave','1/10/2015',1.3),
('Dave','1/15/2015',1.2),
('Steve','1/11/2015',1.6),
('Steve','1/12/2015',1.1),
('Steve','1/15/2015',1.2),
('Bill','1/21/2015',1.9),
('Ted','1/1/2015',1.8),
('Ted','1/10/2015',1.0),
('Ted','1/12/2015',1.7)
-- This will show the lowest prices by each person
select name,dte,val from (select name,dte,val, rank() over (partition by name order by val) as r from #t) as data where r = 1
-- This will show each user's lowest price and the last day they submitted a price, regardless of whether it is the lowest
select name,max(dte) as [last Date] ,min(val) as [Lowest Value] from #t group by name
-- Who had the lowest price last, regardless of whether they have raised their price later.
select top(1) name,dte [last lowest quote],val from (select name,dte,val, rank() over (order by val) as r from #t) as data where r = 1 order by dte desc
-- What is the lowest price currently quoted, regardless of who quoted it
select top(1) name,dte [best active quote],val from (select name,dte,val, rank() over (partition by name order by dte desc) as r from #t) as data where r = 1 order by val

Oracle SQL query : finding the last time a data was changed

I want to retrieve the elapsed days since the last time the data in a specific column was changed. For example:
TABLE_X contains
ID PDATE DATA1 DATA2
A 10-Jan-2013 5 10
A 9-Jan-2013 5 10
A 8-Jan-2013 5 11
A 7-Jan-2013 5 11
A 6-Jan-2013 14 12
A 5-Jan-2013 14 12
B 10-Jan-2013 3 15
B 9-Jan-2013 3 15
B 8-Jan-2013 9 15
B 7-Jan-2013 9 15
B 6-Jan-2013 14 15
B 5-Jan-2013 14 8
I have simplified the table for example purposes.
The result should be :
ID DATA1_LASTUPDATE DATA2_LASTUPDATE
A 4 2
B 2 5
which says,
- data1 of A last update is 4 days ago,
- data2 of A last update is 2 days ago,
- data1 of B last update is 2 days ago,
- data2 of B last update is 5 days ago.
Using the query below is OK, but it takes too long to complete when I apply it to the real table, which has lots of records, and when I add 2 more data columns to find their latest update days.
I use LEAD function for this purposes.
Any other alternatives to speed up the query?
with qdata1 as
(
select ID, pdate from
(
select a.*, row_number() over (partition by ID order by pdate desc) rnum from
(
select a.*,
lead(data1,1,0) over (partition by ID order by pdate desc) - data1 as data1_diff
from table_x a
) a
where data1_diff <> 0
)
where rnum=1
),
qdata2 as
(
select ID, pdate from
(
select a.*, row_number() over (partition by ID order by pdate desc) rnum from
(
select a.*,
lead(data2,1,0) over (partition by ID order by pdate desc) - data2 as data2_diff
from table_x a
) a
where data2_diff <> 0
)
where rnum=1
)
select a.ID,
trunc(sysdate) - b.pdate data1_lastupdate,
trunc(sysdate) - c.pdate data2_lastupdate
from table_master a, qdata1 b, qdata2 c
where a.ID=b.ID(+)
and a.ID=c.ID(+)
Thanks a lot.
You can avoid the multiple hits on the table and the joins by doing both lag (or lead) calculations together:
with t as (
select id, pdate, data1, data2,
lag(data1) over (partition by id order by pdate) as lag_data1,
lag(data2) over (partition by id order by pdate) as lag_data2
from table_x
),
u as (
select t.*,
case when lag_data1 is null or lag_data1 != data1 then pdate end as pdate1,
case when lag_data2 is null or lag_data2 != data2 then pdate end as pdate2
from t
),
v as (
select u.*,
rank() over (partition by id order by pdate1 desc nulls last) as rn1,
rank() over (partition by id order by pdate2 desc nulls last) as rn2
from u
)
select v.id,
max(trunc(sysdate) - (case when rn1 = 1 then pdate1 end))
as data1_last_update,
max(trunc(sysdate) - (case when rn2 = 1 then pdate2 end))
as data2_last_update
from v
group by v.id
order by v.id;
I'm assuming that you meant your data to be for Jun-2014, not Jan-2013; and that you're comparing the most recent change dates with the current date. With the data adjusted to use 10-Jun-2014 etc., this gives:
ID DATA1_LAST_UPDATE DATA2_LAST_UPDATE
-- ----------------- -----------------
A 4 2
B 2 5
The first CTE (t) gets the actual table data and adds two extra columns, one for each of the data columns, using lag (which is the same as lead ordered by descending dates).
The second CTE (u) adds two date columns that are only set when the data columns are changed (or when they are first set, just in case they have never changed). So if a row has data1 the same as the previous row, its pdate1 will be blank. You could combine the first two by repeating the lag calculation but I've left it split out to make it a bit clearer.
The third CTE (v) assigns a ranking to those pdate columns such that the most recent is ranked first.
And the final query works out the difference from the current date to the highest-ranked (i.e. most recent) change for each of the data columns.
SQL Fiddle, including all the CTEs run individually so you can see what they are doing.
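The lag-and-compare idea translates to SQLite as well; a sketch with the sample rows, using a fixed stand-in date ('2013-01-11') in place of sysdate and julianday() for the day arithmetic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_x (id TEXT, pdate TEXT, data1 INTEGER, data2 INTEGER)")
conn.executemany(
    "INSERT INTO table_x VALUES (?, ?, ?, ?)",
    [("A", "2013-01-10", 5, 10), ("A", "2013-01-09", 5, 10),
     ("A", "2013-01-08", 5, 11), ("A", "2013-01-07", 5, 11),
     ("A", "2013-01-06", 14, 12), ("A", "2013-01-05", 14, 12),
     ("B", "2013-01-10", 3, 15), ("B", "2013-01-09", 3, 15),
     ("B", "2013-01-08", 9, 15), ("B", "2013-01-07", 9, 15),
     ("B", "2013-01-06", 14, 15), ("B", "2013-01-05", 14, 8)],
)

# A row marks a change when its lagged value is NULL (first row) or differs;
# the most recent change date per id gives the elapsed days.
rows = conn.execute("""
    WITH t AS (
        SELECT id, pdate, data1, data2,
               LAG(data1) OVER (PARTITION BY id ORDER BY pdate) AS lag1,
               LAG(data2) OVER (PARTITION BY id ORDER BY pdate) AS lag2
        FROM table_x
    )
    SELECT id,
           CAST(julianday('2013-01-11')
                - MAX(CASE WHEN lag1 IS NULL OR lag1 != data1
                           THEN julianday(pdate) END) AS INTEGER) AS data1_lastupdate,
           CAST(julianday('2013-01-11')
                - MAX(CASE WHEN lag2 IS NULL OR lag2 != data2
                           THEN julianday(pdate) END) AS INTEGER) AS data2_lastupdate
    FROM t
    GROUP BY id
    ORDER BY id
""").fetchall()
# -> [('A', 4, 2), ('B', 2, 5)], matching the expected result
```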
Your query wasn't returning the right results for me; maybe I missed something, but I got the correct results also with the query below (you can check this SQLFiddle demo):
with ranked as (
select ID,
data1,
data2,
rank() over(partition by id order by pdate desc) r
from table_x
)
select id,
sum(DATA1_LASTUPDATE) DATA1_LASTUPDATE,
sum(DATA2_LASTUPDATE) DATA2_LASTUPDATE
from (
-- here I get when data1 was updated
select id,
count(1) DATA1_LASTUPDATE,
0 DATA2_LASTUPDATE
from ranked
start with r = 1
CONNECT BY (PRIOR data1 = data1)
and PRIOR r = r - 1
group by id
union
-- here I get when data2 was updated
select id,
0 DATA1_LASTUPDATE,
count(1) DATA2_LASTUPDATE
from ranked
start with r = 1
CONNECT BY (PRIOR data2 = data2)
and PRIOR r = r - 1
group by id
)
group by id