with cte
as
(
SELECT
year(h.orderdate)*100+month(h.orderdate) as yearmonth,
YEAR(h.orderdate) as orderyear,
sum(d.OrderQty*d.UnitPrice) as amount
FROM [AdventureWorks].[Sales].[SalesOrderDetail] d
inner join sales.SalesOrderHeader h
on d.SalesOrderID=h.SalesOrderID
group by
year(h.orderdate)*100+month(h.orderdate),
year(h.orderdate)
)
select
c.*,
last_value(c.amount) over (partition by c.orderyear order by c.yearmonth) as lastvalue,
first_value(c.amount) over (partition by c.orderyear order by c.yearmonth) as firstvalue
from cte c
order by c.yearmonth
I am expecting lastvalue to show the last value of each year (say, the December value), just as firstvalue shows the first value of each year (the January value). However, last_value does not behave that way: it just returns the value of the current month. What did I do wrong?
Thanks for the help.
Your problem is that, when the OVER clause has an ORDER BY and no explicit frame, the default window frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so the value you are getting is the current month's value (that being the last value in that frame). To get LAST_VALUE to look at all values in the partition, you need to extend the frame to include the rows after the current row as well. So you need to change your query to:
last_value(c.amount) over (partition by c.orderyear order by c.yearmonth
rows between unbounded preceding and unbounded following) as lastvalue,
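The difference between the two frames can be reproduced on a tiny data set. The sketch below uses Python's built-in sqlite3 module rather than SQL Server (an assumption made only so the example is self-contained), and the table `monthly` and its rows are hypothetical.

```python
import sqlite3

# Hypothetical monthly totals; table name and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly (orderyear INT, yearmonth INT, amount INT)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?)",
    [(2011, 201110, 10), (2011, 201111, 20), (2011, 201112, 30),
     (2012, 201201, 40), (2012, 201202, 50)],
)

rows = conn.execute("""
    SELECT yearmonth,
           amount,
           -- default frame: stops at the current row
           last_value(amount) OVER (PARTITION BY orderyear
                                    ORDER BY yearmonth) AS default_frame,
           -- explicit frame: spans the whole partition
           last_value(amount) OVER (PARTITION BY orderyear
                                    ORDER BY yearmonth
                                    ROWS BETWEEN UNBOUNDED PRECEDING
                                             AND UNBOUNDED FOLLOWING) AS full_frame
    FROM monthly
    ORDER BY yearmonth
""").fetchall()

for row in rows:
    print(row)
```

With the default frame the third column simply repeats `amount`, while the fourth column carries the last month of each year on every row.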
Related
I have two tables:
orders_product: all the orders. Each line is a product sold, with some details about the order in which it was included. So, if an order has more than one product, there is more than one line for that order.
orders_grouped: each line is an order with some details about this specific order.
I would like to know whether there was a previous purchase and a following purchase for each product.
SELECT
product_name,
last_value(product_all_grouped_list) over (partition by ord.customer_id order by created_at asc rows between unbounded preceding and 1 preceding ) as last_order,
last_value(product_all_grouped_list) over (partition by ord.customer_id order by created_at desc rows between unbounded preceding and 1 preceding ) as next_order_products,
last_value(basket_size) over (partition by ord.customer_id order by created_at desc rows between unbounded preceding and 1 preceding ) as next_order_basket_size
FROM
`orders_product` ord
left join `orders_grouped` ordgroup
on ord.order_number=ordgroup.order_number
When the order has only one product (basket_size=1), everything is correct, but when basket_size>1, the result is correct for the first product of the order and wrong for the remaining products of that order.
Can someone help me?
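Because an order can contain several items, and therefore several rows, the window frame has to be defined differently: use RANGE instead of ROWS in the OVER clause, so that rows sharing the same ordering value are treated together.
You can also define a named window with a WINDOW clause at the end of the query: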
With tbl as (
Select * from unnest(generate_timestamp_array("2022-09-01","2022-09-15",interval 1 hour)) update_time
)
SELECT
*,
LAST_VALUE(update_time) OVER (ORDER BY update_time ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING ),
timestamp_diff(update_time,timestamp("1999-01-01"),second) ,
LAST_VALUE(update_time) OVER SETUP_window
FROM
tbl
window SETUP_window as (ORDER BY timestamp_diff(update_time,timestamp("1999-01-01"),second) ASC RANGE BETWEEN UNBOUNDED PRECEDING AND 36000 PRECEDING )
order by update_time desc
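To see why ROWS goes wrong when one order spans several rows, here is a minimal sketch run through Python's sqlite3 module (an assumption for the sake of a self-contained example; the question itself targets BigQuery). The table `order_lines`, its integer `order_ts` column and the data are hypothetical, and the RANGE frame with a numeric offset needs SQLite 3.28 or later.

```python
import sqlite3

# Hypothetical order lines: one row per product, several rows per order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines (customer_id INT, order_ts INT, "
    "product_name TEXT, basket TEXT)"
)
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?)",
    [(1, 100, "apple", "apple,pear"), (1, 100, "pear", "apple,pear"),
     (1, 200, "milk", "milk"),
     (1, 300, "bread", "bread,jam"), (1, 300, "jam", "bread,jam")],
)

rows = conn.execute("""
    SELECT product_name,
           -- ROWS counts physical rows, so a sibling row of the same
           -- order can be picked up as the "previous order"
           last_value(basket) OVER (PARTITION BY customer_id
                                    ORDER BY order_ts, product_name
                                    ROWS BETWEEN UNBOUNDED PRECEDING
                                             AND 1 PRECEDING) AS prev_rows,
           -- RANGE compares ordering values, so only rows with a strictly
           -- smaller order_ts are visible
           last_value(basket) OVER (PARTITION BY customer_id
                                    ORDER BY order_ts
                                    RANGE BETWEEN UNBOUNDED PRECEDING
                                              AND 1 PRECEDING) AS prev_range
    FROM order_lines
    ORDER BY order_ts, product_name
""").fetchall()

result = {name: (prev_rows, prev_range) for name, prev_rows, prev_range in rows}
for name, pair in result.items():
    print(name, pair)
```

For "pear" and "jam", the ROWS version reports their own order as the previous one, while the RANGE version returns the same previous basket for every product of an order.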
I am not able to get the last value; my code below, run in Snowflake, just returns the same value. Does anyone have any idea? Is there something glaringly wrong?
select MNTH,
sum_cust,
last_value(sum_cust) over (partition by MNTH order by sum_cust desc ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as sum_cust_last
from block_2;
I think what you actually want is to LAG the value from the previous MNTH:
SELECT MNTH,
sum_cust,
LAG(sum_cust) OVER (ORDER BY MNTH) AS sum_cust_last
FROM block_2;
I actually recommend first_value() rather than last_value() for some technical reasons involving window frames. If you want the last value, order by the month desc and choose the first row:
select MNTH, sum_cust,
first_value(sum_cust) over (order by MNTH desc
rows between unbounded preceding and current row
) as sum_cust_last
from block_2;
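The three behaviours discussed here can be compared side by side. This is only a sketch, run through Python's sqlite3 module instead of Snowflake, and it assumes `block_2` holds one row per MNTH (the data are made up); under that assumption every partition of the original query contains a single row, which is why it echoes `sum_cust`.

```python
import sqlite3

# Hypothetical block_2 with one row per month (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block_2 (MNTH INT, sum_cust INT)")
conn.executemany("INSERT INTO block_2 VALUES (?, ?)",
                 [(1, 100), (2, 150), (3, 120)])

rows = conn.execute("""
    SELECT MNTH,
           sum_cust,
           -- original query: each MNTH partition has a single row,
           -- so the last value is the row itself
           last_value(sum_cust) OVER (PARTITION BY MNTH
                                      ORDER BY sum_cust DESC
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND UNBOUNDED FOLLOWING) AS same_row,
           -- value of the previous month
           lag(sum_cust) OVER (ORDER BY MNTH) AS prev_month,
           -- value of the latest month, repeated on every row
           first_value(sum_cust) OVER (ORDER BY MNTH DESC) AS latest_month
    FROM block_2
    ORDER BY MNTH
""").fetchall()

for row in rows:
    print(row)
```

Which of the last two columns is wanted depends on whether "last" means the previous month or the most recent month overall.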
I'm running code like this:
SELECT ID, Date, Price,
STDEV(Price) OVER (ORDER BY ID, Date ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) As OneMonths,
STDEV(Price) OVER (ORDER BY ID, Date ROWS BETWEEN 60 PRECEDING AND CURRENT ROW) As TwoMonths,
STDEV(Price) OVER (ORDER BY ID, Date ROWS BETWEEN 90 PRECEDING AND CURRENT ROW) As ThreeMonths
FROM Price_Table
That gives me this result.
In the very first row I always get three nulls for the three standard deviations, which makes sense. However, every time the ID changes, I seem to be picking up the preceding ID's prices, whereas I would expect the standard deviation to reset. So the first line in orange should be null, I think, and the next one should be zero, because the price does not change on the second day. I tried wrapping the LAG function around the STDEV function and got an error; I tried the opposite and also got an error.
If you want the value per id, then you need partition by:
SELECT ID, Date, Price,
STDEV(Price) OVER (PARTITION BY ID ORDER BY Date ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) As OneMonths,
STDEV(Price) OVER (PARTITION BY ID ORDER BY Date ROWS BETWEEN 60 PRECEDING AND CURRENT ROW) As TwoMonths,
STDEV(Price) OVER (PARTITION BY ID ORDER BY Date ROWS BETWEEN 90 PRECEDING AND CURRENT ROW) As ThreeMonths
FROM Price_Table;
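A quick way to check that the window resets per ID is to count how many rows fall into each frame. The sketch below uses Python's sqlite3 module; because SQLite has no built-in STDEV, COUNT stands in for it, and the table `price_table`, the 2-row look-back and the data are all hypothetical.

```python
import sqlite3

# Hypothetical prices for two IDs; COUNT replaces STDEV, which SQLite lacks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_table (id INT, price_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO price_table VALUES (?, ?, ?)",
    [(1, "2023-01-01", 10.0), (1, "2023-01-02", 10.0), (1, "2023-01-03", 12.0),
     (2, "2023-01-01", 50.0), (2, "2023-01-02", 50.0)],
)

rows = conn.execute("""
    SELECT id,
           price_date,
           -- no PARTITION BY: the frame reaches back into the previous id
           count(price) OVER (ORDER BY id, price_date
                              ROWS BETWEEN 2 PRECEDING
                                       AND CURRENT ROW) AS n_unpartitioned,
           -- PARTITION BY id: the frame restarts with every id
           count(price) OVER (PARTITION BY id
                              ORDER BY price_date
                              ROWS BETWEEN 2 PRECEDING
                                       AND CURRENT ROW) AS n_partitioned
    FROM price_table
    ORDER BY id, price_date
""").fetchall()

for row in rows:
    print(row)
```

The first row of id 2 sees three rows without PARTITION BY (two of them belonging to id 1) and only one row with it.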
I was reading this tutorial on how to calculate running totals.
Copying the suggested approach I have a query of the form:
select
date,
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table;
This works fine and does what I want - a running total by date.
However, in addition to the running total, I'd also like to add daily sales:
select
date,
sum(sales),
sum(sales) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by 1;
This throws an error:
SYNTAX_ERROR: line 6:8: '"sum"("sales") OVER (ORDER BY "activity_date" ASC ROWS UNBOUNDED PRECEDING)' must be an aggregate expression or appear in GROUP BY clause
How can I calculate both daily total as well as running total?
You can try the following, although it will repeat daily_sales on every row that shares the same date. This way you don't need to group by your date field.
SELECT date,
SUM(sales) OVER (PARTITION BY DATE) as daily_sales,
SUM(sales) OVER (ORDER BY DATE ROWS UNBOUNDED PRECEDING) as cumulative_sales
FROM sales_table;
Presumably, you intend an aggregation query to begin with:
select date, sum(sales) as daily_sales,
sum(sum(sales)) over (order by date rows unbounded preceding) as cumulative_sales
from sales_table
group by date
order by date;
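The nested form sum(sum(sales)) works because the window function is evaluated after the GROUP BY, over the already aggregated rows. The sketch below spells out that two-step logic with a CTE: aggregate per day first, then take a running total of the daily sums. It runs through Python's sqlite3 module, and the table contents and the column name `sale_date` are hypothetical.

```python
import sqlite3

# Hypothetical sales rows, several per day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (sale_date TEXT, sales INT)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?)",
    [("2023-01-01", 10), ("2023-01-01", 5),
     ("2023-01-02", 20),
     ("2023-01-03", 7), ("2023-01-03", 3)],
)

rows = conn.execute("""
    WITH daily AS (
        -- step 1: one row per day
        SELECT sale_date, sum(sales) AS daily_sales
        FROM sales_table
        GROUP BY sale_date
    )
    -- step 2: running total over the daily rows
    SELECT sale_date,
           daily_sales,
           sum(daily_sales) OVER (ORDER BY sale_date
                                  ROWS UNBOUNDED PRECEDING) AS cumulative_sales
    FROM daily
    ORDER BY sale_date
""").fetchall()

for row in rows:
    print(row)
```

Each output row carries both the daily total and the cumulative total, with one row per date.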
I have a query that works fine: it finds the sum and average for the last 3 months and the last year. It worked until I got a new request to break the results down further by AwardCode.
How do I include that?
I mean for this section:
SUM(1.0 * InvolTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS InvolMov3Mth,
I want to find the last 3 months based on AwardCode.
My original query that is working is
SELECT
Calendar_Date, Mth, NoOfEmp, MaleCount, FemaleCount,
SUM(1.0*InvolTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS InvolMov3Mth,
SUM(1.0*TotalTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS TermSum12Mth
FROM #X
The result is
But now I need to add another group AwardCode
SELECT
Mth, AwardCode, NoOfEmp, MaleCount, FemaleCount,
SUM(1.0 * InvolTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS InvolMov3Mth,
SUM(1.0 * TotalTerm) OVER (ORDER BY Calendar_Date ASC
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS TermSum12Mth
FROM #X
The result will be like this
Notice that the sums for InvolMov3Mth and TermSum12Mth over the whole period do not match those from the query above.
I think I found the answer to my question.
I used PARTITION BY AwardCode before ORDER BY,
which seems to be working.
SUM(1.0*TotalTerm) OVER (PARTITION BY AwardCode ORDER BY Calendar_Date ASC
ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS TermSum12Mth,
Yes, PARTITION BY will make it work for your requirement.