Last_Value returning the current value - sql

I am not able to get the last value; my code below, run in Snowflake, just returns the same value as the current row. Does anyone have any idea? Is there something glaringly wrong?
select MNTH,
sum_cust,
last_value(sum_cust) over (partition by MNTH order by sum_cust desc ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as sum_cust_last
from block_2;

I think what you actually want is to LAG the value from the previous MNTH:
SELECT MNTH,
sum_cust,
LAG(sum_cust) OVER (ORDER BY MNTH) AS sum_cust_last
FROM block_2;

I actually recommend first_value() rather than last_value(), because the default window frame ends at the current row, which is what makes last_value() return the current value. If you want the last value, order by the month desc and choose the first row:
select MNTH, sum_cust,
       first_value(sum_cust) over (order by MNTH desc
                                   rows between unbounded preceding and current row
                                  ) as sum_cust_last
from block_2;
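
To see the three variants side by side, here is a minimal sketch that runs them through Python's built-in sqlite3 module rather than Snowflake (SQLite 3.25 or later is needed for window functions). The sample rows are invented, and it assumes block_2 holds one row per MNTH; under that assumption every partition in the original query contains a single row, so last_value() can only return that row's own value.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block_2 (MNTH INTEGER, sum_cust INTEGER)")
# Invented sample data: one row per month.
conn.executemany("INSERT INTO block_2 VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

rows = conn.execute("""
    SELECT MNTH,
           sum_cust,
           -- Original query: each MNTH partition has one row, so this is the row itself.
           LAST_VALUE(sum_cust) OVER (
               PARTITION BY MNTH ORDER BY sum_cust DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS same_as_current,
           -- Value from the previous month (NULL for the first month).
           LAG(sum_cust) OVER (ORDER BY MNTH) AS previous_month,
           -- Value of the latest month, repeated on every row.
           FIRST_VALUE(sum_cust) OVER (ORDER BY MNTH DESC) AS latest_month
    FROM block_2
    ORDER BY MNTH
""").fetchall()

for row in rows:
    print(row)
```

Which of the last two columns is the right one depends on whether "last" means the previous month or the most recent month overall.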

Related

How can I specify window frame for my sum?

How/Can I specify this window frame:
sum(Quantity) over (partition by AccountId, SymbolId order by Time rows between unbounded preceding and current row -1) PositionAmount?
I tried to fool it with
sum(Quantity) over (partition by AccountId, SymbolId order by Time rows between unbounded preceding and -1 following)
but -1 is not allowed.
I can of course wrap it in a second select and find the previous value of PositionAmount with lag or something similar.
The documentation specifies that, when you use BETWEEN, both parts are window frame bounds; nothing forces you to use FOLLOWING for the second part. Try this:
rows between unbounded preceding and 1 preceding
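
A small sketch of that frame in action, run through Python's sqlite3 module as a stand-in engine (SQLite 3.25 or later). The table name trades and its rows are hypothetical; only the column names come from the question. Note that the first row of each partition has an empty frame, so the sum is NULL there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; column names follow the question.
conn.execute(
    "CREATE TABLE trades (AccountId INTEGER, SymbolId INTEGER, Time INTEGER, Quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (1, 1, 2, 3), (1, 1, 3, -2), (2, 1, 1, 7)],
)

position_rows = conn.execute("""
    SELECT AccountId, Time, Quantity,
           -- Sum of all earlier rows in the partition, excluding the current row.
           SUM(Quantity) OVER (
               PARTITION BY AccountId, SymbolId ORDER BY Time
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ) AS PositionAmount
    FROM trades
    ORDER BY AccountId, Time
""").fetchall()

for row in position_rows:
    print(row)
```

If a zero is preferable to NULL for the first row, wrapping the sum in COALESCE(..., 0) is one option.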

Getting an error while trying to find LAST_VALUE() in Impala

I am trying to find the last blnc value of each id, but it throws an error:
AnalysisException: select list expression not produced by aggregation
output (missing from GROUP BY clause?): last_value(blnc) OVER
(PARTITION BY id ORDER BY id date ASC ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) lasted.
SELECT id, number, type,
LAST_VALUE(blnc) OVER (PARTITION BY id ORDER BY date rows between unbounded preceding and unbounded following ) AS lasted ,
to_timestamp(MAX(date),'yyyyMMdd') as end_date,
concat(substr(date,1,6),"01") as start_date,
substr(date,1,6) as id_month
FROM table
GROUP BY id,number,type,concat(substr(date,1,6),"01"),substr(date,1,6)
I also put the whole LAST_VALUE() expression in the GROUP BY, but then another error occurs.
The problem is that your expression:
LAST_VALUE(blnc) OVER (PARTITION BY id
ORDER BY date
rows between unbounded preceding and unbounded following
) AS lasted ,
is evaluated after the aggregation, so it can only reference expressions that still exist at that point: the GROUP BY keys and aggregates. Neither date nor blnc on its own is one of them. You can fix this by using aggregation functions:
LAST_VALUE(MAX(blnc)) OVER (PARTITION BY id
ORDER BY MAX(date)
rows between unbounded preceding and unbounded following
) AS lasted ,
Although this answers your question and fixes the syntax error, it probably doesn't do anything useful. I think you want conditional aggregation. You haven't explained the logic you want or provided sample data, but the idea is:
SELECT id, number, type,
       to_timestamp(MAX(date), 'yyyyMMdd') as end_date,
       concat(substr(date, 1, 6), '01') as start_date,
       substr(date, 1, 6) as id_month,
       MAX(CASE WHEN seqnum = 1 THEN blnc END) as lasted
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id, number, type, concat(substr(date, 1, 6), '01'), substr(date, 1, 6)
                                ORDER BY date DESC
                               ) as seqnum
      FROM table t
     ) t
GROUP BY id, number, type, concat(substr(date, 1, 6), '01'), substr(date, 1, 6)
Note: String operations on dates look wrong. You should be using the built-in date/time functions, if the column is stored correctly.
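
The ROW_NUMBER() plus conditional aggregation idea can be tried out with a reduced sketch. It uses Python's sqlite3 module instead of Impala (SQLite 3.25 or later), a hypothetical table named balances, invented rows, and only the id and month grouping keys, so it illustrates the pattern rather than reproducing the full query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; date is stored as a yyyyMMdd string, as in the question.
conn.execute("CREATE TABLE balances (id INTEGER, date TEXT, blnc INTEGER)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [
        (1, "20200105", 100),
        (1, "20200120", 150),
        (1, "20200203", 90),
        (2, "20200110", 40),
    ],
)

monthly = conn.execute("""
    SELECT id,
           substr(date, 1, 6) AS id_month,
           MAX(date) AS end_date,
           -- seqnum = 1 marks the latest row in each (id, month) group.
           MAX(CASE WHEN seqnum = 1 THEN blnc END) AS lasted
    FROM (SELECT b.*,
                 ROW_NUMBER() OVER (PARTITION BY id, substr(date, 1, 6)
                                    ORDER BY date DESC) AS seqnum
          FROM balances b) AS ranked
    GROUP BY id, substr(date, 1, 6)
    ORDER BY id, id_month
""").fetchall()

for row in monthly:
    print(row)
```

The window function runs in the subquery, before the GROUP BY, which is why it can see the raw date and blnc columns.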

Last_Value in SQL Server

with cte
as
(
SELECT
year(h.orderdate)*100+month(h.orderdate) as yearmonth,
YEAR(h.orderdate) as orderyear,
sum(d.OrderQty*d.UnitPrice) as amount
FROM [AdventureWorks].[Sales].[SalesOrderDetail] d
inner join sales.SalesOrderHeader h
on d.SalesOrderID=h.SalesOrderID
group by
year(h.orderdate)*100+month(h.orderdate),
year(h.orderdate)
)
select
c.*,
last_value(c.amount) over (partition by c.orderyear order by c.yearmonth) as lastvalue,
first_value(c.amount) over (partition by c.orderyear order by c.yearmonth) as firstvalue
from cte c
order by c.yearmonth
I am expecting to see the last value of each year (say, the December value), similar to the first value of each year (the January value). However, last_value is not working at all; it just returns the value of the current month. What did I do wrong?
Thanks for the help.
Your problem is that the default window frame when an ORDER BY is present is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so the value you are getting is the current month's value (that being the last value in that frame). To get LAST_VALUE to look at all values in the partition, you need to extend the frame to include the rows after the current row as well. So you need to change your query to:
last_value(c.amount) over (partition by c.orderyear order by c.yearmonth
rows between unbounded preceding and unbounded following) as lastvalue,
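
The difference between the two frames is easy to reproduce on a toy table. The sketch below uses Python's sqlite3 module instead of SQL Server (SQLite 3.25 or later, which applies the same default frame), with a hypothetical table named monthly and invented amounts in place of the AdventureWorks CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the CTE in the question.
conn.execute("CREATE TABLE monthly (orderyear INTEGER, yearmonth INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?)",
    [
        (2011, 201101, 100),
        (2011, 201102, 200),
        (2011, 201103, 300),
        (2012, 201201, 50),
        (2012, 201202, 70),
    ],
)

frame_rows = conn.execute("""
    SELECT yearmonth,
           -- Default frame ends at the current row, so this is the current amount.
           LAST_VALUE(amount) OVER (
               PARTITION BY orderyear ORDER BY yearmonth
           ) AS default_frame,
           -- Explicit frame spans the whole partition, so this is the year's last amount.
           LAST_VALUE(amount) OVER (
               PARTITION BY orderyear ORDER BY yearmonth
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_partition
    FROM monthly
    ORDER BY yearmonth
""").fetchall()

for row in frame_rows:
    print(row)
```

first_value() is unaffected by the default frame because the first row of the partition is always inside it.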

Use a regular aggregate function (sum) alongside a window function

I was reading this tutorial on how to calculate running totals.
Copying the suggested approach I have a query of the form:
select
date,
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table;
This works fine and does what I want - a running total by date.
However, in addition to the running total, I'd also like to add daily sales:
select
date,
sum(sales),
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by 1;
This throws an error:
SYNTAX_ERROR: line 6:8: '"sum"("sales") OVER (ORDER BY "activity_date" ASC ROWS UNBOUNDED PRECEDING)' must be an aggregate expression or appear in GROUP BY clause
How can I calculate both daily total as well as running total?
I think you can try this, but it will repeat daily_sales on every row for the same date. This way you don't need to group by your date field.
SELECT date,
       SUM(sales) OVER (PARTITION BY DATE) as daily_sales,
       SUM(sales) OVER (ORDER BY DATE ROWS UNBOUNDED PRECEDING) as cumulative_sales
FROM sales_table;
Presumably, you intend an aggregation query to begin with:
select date, sum(sales) as daily_sales,
sum(sum(sales)) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by date
order by date;
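
A runnable sketch of the nested form, using Python's sqlite3 module as a stand-in for the original engine (SQLite 3.25 or later) and invented sales rows. The inner sum(sales) is the per-day aggregate produced by the GROUP BY; the outer sum(...) over (...) is a window function that accumulates those daily totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (date TEXT, sales INTEGER)")
# Invented sample data: two sales on the first day, one on each of the next two.
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?)",
    [("2021-01-01", 10), ("2021-01-01", 5), ("2021-01-02", 20), ("2021-01-03", 1)],
)

totals = conn.execute("""
    SELECT date,
           SUM(sales) AS daily_sales,
           -- Window function applied to the aggregated daily totals.
           SUM(SUM(sales)) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS cumulative_sales
    FROM sales_table
    GROUP BY date
    ORDER BY date
""").fetchall()

for row in totals:
    print(row)
```

This returns one row per date, unlike the PARTITION BY variant, which keeps one row per original sale.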

SQL Server : PRECEDING with another condition

I have a query that finds the sum and average for the last 3 months and the last year. It worked fine until I got a new request to break the results down further by AwardCode.
So how to include that?
I mean for this section
SUM(1.0 * InvolTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS InvolMov3Mth,
I want to find the last 3 months based on AwardCode.
My original query that is working is
SELECT
Calendar_Date, Mth, NoOfEmp, MaleCount, FemaleCount,
SUM(1.0*InvolTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS InvolMov3Mth,
SUM(1.0*TotalTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS TermSum12Mth
FROM #X
The result is
But now I need to add another group AwardCode
SELECT
Mth, AwardCode, NoOfEmp, MaleCount, FemaleCount,
SUM(1.0 * InvolTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS InvolMov3Mth,
SUM(1.0 * TotalTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS TermSum12Mth
FROM #X
The result will be like this
You can see that the sums of InvolMov3Mth and TermSum12Mth over the whole period do not match the query above.
I think I found the answer to my question.
I added PARTITION BY AwardCode before ORDER BY, and it seems to be working.
SUM(1.0*TotalTerm) OVER (PARTITION BY AwardCode ORDER BY Calendar_Date ASC
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS TermSum12Mth,
Yes. PARTITION BY will make it work for your requirement.
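
To check the effect of the partition, here is a minimal sketch using Python's sqlite3 module in place of SQL Server (SQLite 3.25 or later). The table x stands in for the temp table #X, and the rows are invented; only the 3-month window is shown. With PARTITION BY AwardCode, the 2 PRECEDING rows are counted within each award code instead of across all rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the #X temp table, reduced to the columns needed here.
conn.execute("CREATE TABLE x (Calendar_Date TEXT, AwardCode TEXT, InvolTerm INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?)",
    [
        ("2020-01-01", "A", 1),
        ("2020-02-01", "A", 2),
        ("2020-03-01", "A", 3),
        ("2020-04-01", "A", 4),
        ("2020-01-01", "B", 10),
        ("2020-02-01", "B", 20),
    ],
)

moving = conn.execute("""
    SELECT AwardCode, Calendar_Date,
           -- Moving 3-row sum that restarts for each AwardCode.
           SUM(1.0 * InvolTerm) OVER (
               PARTITION BY AwardCode ORDER BY Calendar_Date ASC
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS InvolMov3Mth
    FROM x
    ORDER BY AwardCode, Calendar_Date
""").fetchall()

for row in moving:
    print(row)
```

A ROWS frame counts rows, not months, so this only equals a 3-month sum when every award code has exactly one row per month with no gaps.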