Finding Avg of following dataset - sql

Here is the data:
select * from (
select to_date('20140601','YYYYMMDD') log_date, null weight from dual
union
select to_date('20140601','YYYYMMDD')+1 log_date, 0 weight from dual
union
select to_date('20140601','YYYYMMDD')+2 log_date, 4 weight from dual
union
select to_date('20140601','YYYYMMDD')+3 log_date, 4 weight from dual
union
select to_date('20140601','YYYYMMDD')+4 log_date, null weight from dual
union
select to_date('20140601','YYYYMMDD')+5 log_date, 8 weight from dual);
Log_date   weight   avg_weight   calculation
--------------------------------------------------------------------------------
6/1/2014   NULL     0            (0/1) Since no previous data, I consider it as 0
6/2/2014   0        0            (0+0)/2
6/3/2014   4        4/3          (0+0+4)/3
6/4/2014   4        2            (0+0+4+4)/4
6/5/2014   NULL     2            (0+0+4+4+2)/5 Since it is NULL I want to take previous day avg = 2
6/6/2014   8        3            (0+0+4+4+2+8)/6 = 3
So the average for the above data should be 3.
How can I achieve this in plain SQL instead of PL/SQL? I'd appreciate any help on this.

I just learned how to use recursive CTEs today, really excited! Hope this helps...
; WITH RawData (log_Date, Weight) AS (
select cast('2014-06-01' as SMALLDATETIME)+0, null
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+1, 0
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+2, 4
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+3, 4
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+4, null
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+5, 8
)
, IndexedData (Id, log_Date, Weight) AS (
SELECT ROW_NUMBER() OVER (ORDER BY log_Date)
, log_Date
, Weight
FROM RawData
)
, ResultData (Id, log_Date, Weight, total, avg_weight) AS (
SELECT Id
, log_Date
, Weight
, CAST(CASE WHEN Weight IS NULL THEN 0 ELSE Weight END AS FLOAT)
, CAST(CASE WHEN Weight IS NULL THEN 0 ELSE Weight END AS FLOAT)
FROM IndexedData
WHERE Id = 1
UNION ALL
SELECT i.Id
, i.log_Date
, i.Weight
, CAST(r.total + CASE WHEN i.Weight IS NULL THEN r.avg_weight ELSE i.Weight END AS FLOAT)
, CAST(r.total + CASE WHEN i.Weight IS NULL THEN r.avg_weight ELSE i.Weight END AS FLOAT) / i.Id
FROM ResultData r
JOIN IndexedData i ON i.Id = r.Id + 1
)
SELECT Log_Date, Weight, avg_weight FROM ResultData
OPTION (MAXRECURSION 0)
This gives the output:
Log_Date Weight avg_weight
----------------------- ----------- ----------------------
2014-06-01 00:00:00 NULL 0
2014-06-02 00:00:00 0 0
2014-06-03 00:00:00 4 1.33333333333333
2014-06-04 00:00:00 4 2
2014-06-05 00:00:00 NULL 2
2014-06-06 00:00:00 8 3
Note that in my answer, I rewrote the "Data" section of your question in SQL Server syntax, because the original (Oracle) version didn't compile for me. It's still the same data though, hope it helps.
Edit: By default, MAXRECURSION is set to 100. This means that the query will not work for more than 101 rows of Raw Data. By adding the OPTION (MAXRECURSION 0), I have removed this limit so that the query works for all input data. However, this can be dangerous if the query isn't tested thoroughly because it might lead to infinite recursion.
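For readers who want to check the arithmetic outside the database, here is a minimal Python sketch of the same rule: each row adds its weight to a running total, and a NULL weight is replaced by the previous row's average (0 when there is no previous row). The function name `running_average` and the list-based input are assumptions made for illustration, not part of the query above.

```python
from typing import List, Optional


def running_average(weights: List[Optional[float]]) -> List[float]:
    """Running average where a missing weight counts as the previous average."""
    averages: List[float] = []
    total = 0.0
    prev_avg = 0.0  # no history before the first row, so start from 0
    for position, weight in enumerate(weights, start=1):
        # A NULL (None) weight is replaced by the average so far.
        value = prev_avg if weight is None else float(weight)
        total += value
        prev_avg = total / position
        averages.append(prev_avg)
    return averages


# Same six rows as the question: NULL, 0, 4, 4, NULL, 8
sample = [None, 0, 4, 4, None, 8]
result = running_average(sample)
print(result)  # the last value is 3.0, as in the question
```

The loop carries two values from row to row (the total and the previous average), which is the same state the recursive CTE passes along in its `total` and `avg_weight` columns.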

Show data that takes into consideration consecutive weeks [closed]

Hello, I need to write SQL that does something like the following statement:
select *,
case when 'Issue' IN ('Overforecasted', 'Underforecasted') AND 'Start Date' DISTINCT 3 dates THEN 'Issue exists for 3 weeks in a row'
FROM Merged;
I know this is not valid SQL, but does anyone know how it could be written properly?
For each DMDUNIT, check whether it has 3 issues in the "Issue" column, and then check whether those fall on 3 different start dates. If the same DMDUNIT has 3 issues ('Overforecasted' or 'Underforecasted') on 3 different dates, I need to flag it in a new column (END AS "3InARow").
The current edited draft:
SET ARITHABORT OFF
SET ANSI_WARNINGS OFF;
;WITH Forecast AS (
SELECT LOC, DMDUNIT, STARTDATE, TOTFCST
FROM SCPOMGR.FCSTPERFSTATIC
WHERE STARTDATE >= '2021-11-24'
), Actuals AS (
SELECT LOC, DMDUNIT, DMDPostDate, HistoryQuantity
FROM SCPOMGR.HISTWIDE_CHAIN
WHERE DMDPostDate >= '2021-11-24'
), Merged as (
select
COALESCE(f.LOC, a.LOC) AS LOC,
COALESCE(f.DMDUNIT, a.DMDUNIT) AS DMDUNIT,
COALESCE(f.STARTDATE, a.DMDPostDate) AS "Start Date",
SUM(F.TOTFCST) AS "Forecast",
SUM(a.HistoryQuantity) AS "Actuals",
SUM(ABS(a.HistoryQuantity) - f.TOTFCST) AS "Abs Error",
(1 - HistoryQuantity - TOTFCST) / HistoryQuantity as "FA%",
SUM(a.HistoryQuantity) / SUM(f.TOTFCST) AS "Bias",
CASE
WHEN TOTFCST > HistoryQuantity THEN 'Overforecasted'
WHEN TOTFCST < HistoryQuantity THEN 'Underforecasted'
WHEN HistoryQuantity IS NULL AND TOTFCST > 0 THEN 'Overforecasted'
WHEN TOTFCST IS NULL AND HistoryQuantity > 0 THEN 'Underforecasted'
WHEN TOTFCST = 0.000 AND HistoryQuantity IS NULL THEN 'No issue'
END AS Issue
FROM Forecast f FULL OUTER JOIN Actuals a
ON f.LOC = a.LOC AND f.DMDUNIT = a.DMDUNIT AND f.STARTDATE = a.DMDPostDate
GROUP BY
COALESCE(f.LOC, a.LOC),
COALESCE(f.DMDUNIT, a.DMDUNIT),
COALESCE(f.STARTDATE, a.DMDPostDate),
a.HistoryQuantity, F.TOTFCST),
Transitions as (
select *,
case when indicator <> lag(indicator)
over (partition by DMDUNIT order by "Start Date")
then 1 end as tripped
from Merged cross apply (
select case when Issue in ('Overforecasted', 'Underforecasted')
then 1 else 0 end as indicator) v
), Bundles as (
select *, count(tripped) over (partition by DMDUNIT order by "Start Date") as grp
from Transitions
), Streaks as (
select *, count(*) over (partition by DMDUNIT, grp) as cnt
from Bundles
)
select *, case when indicator = 1 and cnt >= 3 then 'Yes' else 'No' end as InIssueStreak, cnt as StreakLength
from Streaks;
WITH Forecast AS (
SELECT LOC, DMDUNIT, STARTDATE, TOTFCST
FROM SCPOMGR.FCSTPERFSTATIC
WHERE STARTDATE >= '2021-11-24'
), Actuals AS (
SELECT LOC, DMDUNIT, DMDPostDate, HistoryQuantity
FROM SCPOMGR.HISTWIDE_CHAIN
WHERE DMDPostDate >= '2021-11-24'
), Merged AS (
SELECT
COALESCE(f.LOC, a.LOC) AS LOC,
COALESCE(f.DMDUNIT, a.DMDUNIT) AS DMDUNIT,
COALESCE(f.STARTDATE, a.DMDPostDate) AS "Start Date",
SUM(F.TOTFCST) AS "Forecast",
SUM(a.HistoryQuantity) AS "Actuals",
SUM(ABS(a.HistoryQuantity) - f.TOTFCST) AS "Abs Error",
(1 - SUM(a.HistoryQuantity) - SUM(f.TOTFCST)) / SUM(a.HistoryQuantity) as "FA%",
SUM(a.HistoryQuantity) / SUM(f.TOTFCST) AS "Bias",
CASE
WHEN SUM(f.TOTFCST) > SUM(a.HistoryQuantity) THEN 'Overforecasted'
WHEN SUM(f.TOTFCST) < SUM(a.HistoryQuantity) THEN 'Underforecasted'
WHEN SUM(a.HistoryQuantity) IS NULL AND SUM(f.TOTFCST) > 0 THEN 'Overforecasted'
WHEN SUM(f.TOTFCST) IS NULL AND SUM(a.HistoryQuantity) > 0 THEN 'Underforecasted'
WHEN SUM(f.TOTFCST) = 0.000 AND SUM(a.HistoryQuantity) IS NULL THEN 'No issue'
END AS Issue
FROM Forecast f FULL OUTER JOIN Actuals a
ON f.LOC = a.LOC AND f.DMDUNIT = a.DMDUNIT AND f.STARTDATE = a.DMDPostDate
GROUP BY
COALESCE(f.LOC, a.LOC),
COALESCE(f.DMDUNIT, a.DMDUNIT),
COALESCE(f.STARTDATE, a.DMDPostDate)
)
select *,
case when
min(Issue) over (
partition by DMDUNIT order by "Start Date"
rows between 2 preceding and current row) =
max(Issue) over (
partition by DMDUNIT order by "Start Date"
rows between 2 preceding and current row) and
count(Issue) over (
partition by DMDUNIT order by "Start Date"
rows between 2 preceding and current row) = 3
then 'Yes' else 'No' end as "3InARow"
from Merged;
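To make the window logic above easier to reason about, here is a rough Python sketch of the same test for a single DMDUNIT whose rows are already sorted by start date. The function name `three_in_a_row` and the list input are assumptions for illustration. Note that, like the SQL, it flags any three identical non-NULL labels in a row, so three consecutive 'No issue' rows would also come out as 'Yes', and a mix of over- and under-forecasts would not.

```python
from typing import List, Optional


def three_in_a_row(issues: List[Optional[str]]) -> List[str]:
    """Flag rows where the current and two preceding labels are present and identical."""
    flags: List[str] = []
    for i in range(len(issues)):
        # Equivalent of ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
        window = issues[max(0, i - 2): i + 1]
        # COUNT(Issue) ignores NULLs, so drop None before counting
        present = [label for label in window if label is not None]
        same = len(present) == 3 and min(present) == max(present)
        flags.append("Yes" if same else "No")
    return flags


labels = ["Overforecasted", "Overforecasted", "Overforecasted",
          "Underforecasted", None]
print(three_in_a_row(labels))
```

If only real issues should count, one option is to filter on the label as well; that choice depends on what "issue for 3 weeks in a row" is meant to include.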
If that doesn't work then try gaps and islands:
with (<copied from above...>), Transitions as (
select *,
case when indicator <> lag(indicator)
over (partition by DMDUNIT order by "Start Date")
then 1 end as tripped
from Merged cross apply (
select case when Issue in ('Overforecasted', 'Underforecasted')
then 1 else 0 end as indicator) v
), Bundles as (
select *, sum(tripped) over (partition by DMDUNIT order by "Start Date") as grp
from Transitions
), Streaks as (
select *, count(*) over (partition by DMDUNIT, grp) as cnt
from Bundles
)
select *, case when cnt >= 3 then 'Yes' else 'No' end as InStreak, cnt as StreakLength
from Streaks;
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=9fcbab1d93b7297aebc340111aa3a448
The fiddle
I've created a working test case which demonstrates how to identify consecutive weeks of over or under forecast items.
All the detail about how a weekly value is calculated or how you identify items is immaterial to the basic question. I've ignored issues related to missing data for a week. That can be easily included and just obscures the fundamental (and only) question asked.
For this example, just assume we have only the weeks of interest and all weeks for all items exist.
The SQL, which also contains the data:
WITH forecast (item, prediction) AS (
SELECT 1, 1000 UNION
SELECT 2, 500
)
, actuals (weekno, item, actual) AS (
SELECT 1, 1, 500 UNION
SELECT 2, 1, 600 UNION
SELECT 3, 1, 1600 UNION
SELECT 4, 1, 1600 UNION
SELECT 5, 1, 600 UNION
SELECT 6, 1, 1100 UNION
SELECT 7, 1, 1200 UNION
SELECT 8, 1, 1150 UNION
SELECT 9, 1, 601 UNION
SELECT 10, 1, 602 UNION
SELECT 11, 1, 603 UNION
SELECT 1, 2, 1500 UNION
SELECT 2, 2, 600 UNION
SELECT 3, 2, 550 UNION
SELECT 4, 2, 500 UNION
SELECT 5, 2, 600 UNION
SELECT 6, 2, 491 UNION
SELECT 7, 2, 492 UNION
SELECT 8, 2, 493 UNION
SELECT 9, 2, 494 UNION
SELECT 10, 2, 620
)
, step1 AS (
SELECT a.*
, f.prediction
, CASE WHEN actual < prediction THEN -1
WHEN actual > prediction THEN +1
ELSE 0
END AS side
FROM forecast AS f
JOIN actuals AS a
ON f.item = a.item
)
, step2 AS (
SELECT *
, CASE WHEN side = LAG(side) OVER (PARTITION BY item ORDER BY weekno) THEN 0 ELSE 1 END AS edge
FROM step1
)
, step3 AS (
SELECT *
, SUM(edge) OVER (PARTITION BY item ORDER BY weekno) AS xgroup
FROM step2
)
, step4 AS (
SELECT *
, COUNT(*) OVER (PARTITION BY item, xgroup) AS xcount
FROM step3
)
SELECT *
FROM step4
WHERE xcount >= 3
ORDER BY item, weekno
;
The result:
weekno  item  actual  prediction  side  edge  xgroup  xcount
------  ----  ------  ----------  ----  ----  ------  ------
     6     1    1100        1000     1     1       4       3
     7     1    1200        1000     1     0       4       3
     8     1    1150        1000     1     0       4       3
     9     1     601        1000    -1     1       5       3
    10     1     602        1000    -1     0       5       3
    11     1     603        1000    -1     0       5       3
     1     2    1500         500     1     1       1       3
     2     2     600         500     1     0       1       3
     3     2     550         500     1     0       1       3
     6     2     491         500    -1     1       4       4
     7     2     492         500    -1     0       4       4
     8     2     493         500    -1     0       4       4
     9     2     494         500    -1     0       4       4
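The edge / running-sum / count steps are the usual "gaps and islands" pattern. As a cross-check, here is a small Python sketch that finds the same runs with `itertools.groupby`, using item 1 from the sample data (prediction 1000). The function name `streaks` and the 0-based start index are assumptions for illustration.

```python
from itertools import groupby
from typing import List, Tuple


def streaks(sides: List[int], min_len: int = 3) -> List[Tuple[int, int, int]]:
    """Return (start_index, length, side) for each run of equal values of at least min_len."""
    found: List[Tuple[int, int, int]] = []
    index = 0
    for side, run in groupby(sides):
        length = len(list(run))
        if length >= min_len:
            found.append((index, length, side))
        index += length
    return found


# Item 1, weeks 1..11, compared against its prediction of 1000
actuals = [500, 600, 1600, 1600, 600, 1100, 1200, 1150, 601, 602, 603]
# +1 when above the prediction, -1 when below, 0 when equal
sides = [(a > 1000) - (a < 1000) for a in actuals]
print(streaks(sides))
```

Index 5 corresponds to week 6 and index 8 to week 9, so the two runs are weeks 6 to 8 (over) and weeks 9 to 11 (under), the same rows the query returns for item 1.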

Oracle SQL recursive adding values

I have the following data in the table
Period Total_amount R_total
01/01/20 2 2
01/02/20 5 null
01/03/20 3 null
01/04/20 8 null
01/05/20 31 null
Based on the above data I would like to have the following situation.
Period Total_amount R_total
01/01/20 2 2
01/02/20 5 3
01/03/20 3 0
01/04/20 8 8
01/05/20 31 23
Additional data
01/06/20 21 0 (previously it would be -2)
01/07/20 25 25
01/08/20 29 4
Pattern to the additional data is:
if total_amount < previous(r_total) then 0
Based on the filled data, we can spot the pattern is:
R_total = total_amount - previous(R_total)
Could you please help me out with this issue?
As Gordon Linoff suspected, it is possible to solve this problem with analytic functions. The benefit is that the query will likely be much faster. The price to pay for that benefit is that you need to do a bit of math beforehand (before ever thinking about "programming" and "computers").
A bit of elementary arithmetic shows that R_TOTAL is an alternating sum of TOTAL_AMOUNT. This can be arranged easily by using ROW_NUMBER() (to get the signs) and then an analytic SUM(), as shown below.
Table setup:
create table sample_data (period, total_amount) as
select to_date('01/01/20', 'mm/dd/rr'), 2 from dual union all
select to_date('01/02/20', 'mm/dd/rr'), 5 from dual union all
select to_date('01/03/20', 'mm/dd/rr'), 3 from dual union all
select to_date('01/04/20', 'mm/dd/rr'), 8 from dual union all
select to_date('01/05/20', 'mm/dd/rr'), 31 from dual
;
Query and result:
with
prep (period, total_amount, sgn) as (
select period, total_amount,
case mod(row_number() over (order by period), 2) when 0 then 1 else -1 end
from sample_data
)
select period, total_amount,
sgn * sum(sgn * total_amount) over (order by period) as r_total
from prep
;
PERIOD TOTAL_AMOUNT R_TOTAL
-------- ------------ ----------
01/01/20 2 2
01/02/20 5 3
01/03/20 3 0
01/04/20 8 8
01/05/20 31 23
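The "elementary arithmetic" can be spelled out: unrolling R(n) = T(n) - R(n-1) gives R(n) = T(n) - T(n-1) + T(n-2) - ..., an alternating sum. The Python sketch below checks, on the five sample rows, that the sign trick used in the query gives the same numbers as the step-by-step recurrence. The function names are assumptions for illustration, and the sketch does not apply the "floor at 0" rule from the question's additional data.

```python
from itertools import accumulate
from typing import List


def by_recurrence(amounts: List[int]) -> List[int]:
    """R(n) = T(n) - R(n-1), starting from R(0) = 0."""
    totals: List[int] = []
    prev = 0
    for amount in amounts:
        prev = amount - prev
        totals.append(prev)
    return totals


def by_alternating_sum(amounts: List[int]) -> List[int]:
    """Same values via sgn * running SUM(sgn * amount), as in the analytic query."""
    # Row numbers start at 1; even rows get +1 and odd rows get -1
    signs = [1 if n % 2 == 0 else -1 for n in range(1, len(amounts) + 1)]
    running = accumulate(s * a for s, a in zip(signs, amounts))
    return [s * r for s, r in zip(signs, running)]


amounts = [2, 5, 3, 8, 31]
print(by_recurrence(amounts))
print(by_alternating_sum(amounts))
```

Both functions return 2, 3, 0, 8, 23 for the sample, which is the R_TOTAL column shown above.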
This may be possible with window functions, but the simplest method is probably a recursive CTE:
with t as (
select t.*, row_number() over (order by period) as seqnum
from yourtable t
),
cte(period, total_amount, r_amount, seqnum) as (
select period, total_amount, r_total, seqnum
from t
where seqnum = 1
union all
select t.period, t.total_amount, t.total_amount - cte.r_amount, t.seqnum
from cte join
t
on t.seqnum = cte.seqnum + 1
)
select *
from cte;
This question explicitly talks about "recursively" adding values. If you want to solve this using another mechanism, you might explain the logic in detail and ask if there is a non-recursive CTE solution.
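The question's additional rows add a second rule: when total_amount is smaller than the previous R_total, the result is 0 instead of a negative number. As far as I can tell, neither query above applies that rule as written. Here is a minimal Python sketch of the recurrence with the floor, using all eight rows from the question; the function name `r_totals` is an assumption for illustration. In the recursive CTE, the equivalent change would probably be wrapping the subtraction in GREATEST(..., 0).

```python
from typing import List


def r_totals(amounts: List[int]) -> List[int]:
    """R(n) = max(T(n) - R(n-1), 0), starting from R(0) = 0."""
    totals: List[int] = []
    prev = 0
    for amount in amounts:
        # Floor at zero, per "if total_amount < previous(r_total) then 0"
        prev = max(amount - prev, 0)
        totals.append(prev)
    return totals


# 01/01/20 .. 01/08/20 from the question, including the additional data
amounts = [2, 5, 3, 8, 31, 21, 25, 29]
print(r_totals(amounts))
```

For these rows the sketch yields 2, 3, 0, 8, 23, 0, 25, 4, matching the values listed in the question. Because of the floor, the alternating-sum shortcut no longer applies once a value has been clamped, which is one reason a row-by-row approach may be needed here.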

SQL (Vertica) - Calculate number of users who returned to the app at least x days in the past 7 days

Suppose I have my table like:
uid day_used_app
--- -------------
1 2012-04-28
1 2012-04-29
1 2012-04-30
2 2012-04-29
2 2012-04-30
2 2012-05-01
2 2012-05-21
2 2012-05-22
Suppose I want the number of unique users who returned to the app at least 2 different days in the last 7 days (from 2012-05-03).
For example, this query retrieves the number of users who used the application on at least 2 different days in the past 7 days:
select count(distinct case when num_different_days_on_app >= 2
then uid else null end) as users_return_2_or_more_days
from (
select uid,
count(distinct day_used_app) as num_different_days_on_app
from table
where day_used_app between current_date() - 7 and current_date()
group by 1
)
This gives me:
users_return_2_or_more_days
---------------------------
2
The question I have is:
What if I want to do this for every day up to now, so that my output looks like the table below, where the second field is the number of unique users who returned on 2 or more different days within the week before the date in the first field?
date users_return_2_or_more_days
-------- ---------------------------
2012-04-28 2
2012-04-29 2
2012-04-30 3
2012-05-01 4
2012-05-02 4
2012-05-03 3
Would this help?
WITH
-- your original input, don't use in "real" query ...
input(uid,day_used_app) AS (
SELECT 1,DATE '2012-04-28'
UNION ALL SELECT 1,DATE '2012-04-29'
UNION ALL SELECT 1,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-04-29'
UNION ALL SELECT 2,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-05-01'
UNION ALL SELECT 2,DATE '2012-05-21'
UNION ALL SELECT 2,DATE '2012-05-22'
)
-- end of input, start "real" query here, replace ',' with 'WITH'
,
one_week_b4 AS (
SELECT
uid
, day_used_app
, day_used_app -7 AS day_used_1week_b4
FROM input
)
SELECT
one_week_b4.uid
, one_week_b4.day_used_app
, count(*) AS users_return_2_or_more_days
FROM one_week_b4
JOIN input
ON input.day_used_app BETWEEN one_week_b4.day_used_1week_b4 AND one_week_b4.day_used_app
GROUP BY
one_week_b4.uid
, one_week_b4.day_used_app
HAVING count(*) >= 2
ORDER BY 1;
Output is:
uid|day_used_app|users_return_2_or_more_days
1|2012-04-29 | 3
1|2012-04-30 | 5
2|2012-04-29 | 3
2|2012-04-30 | 5
2|2012-05-01 | 6
2|2012-05-22 | 2
Does that help your needs?
Marco the Sane ...
SELECT DISTINCT
t1.day_used_app,
(
SELECT SUM(CASE WHEN t.num_visits >= 2 THEN 1 ELSE 0 END)
FROM
(
SELECT uid,
COUNT(DISTINCT day_used_app) AS num_visits
FROM table
WHERE day_used_app BETWEEN t1.day_used_app - 7 AND t1.day_used_app
GROUP BY uid
) t
) AS users_return_2_or_more_days
FROM table t1
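To sanity-check whichever query you end up with, here is a small Python sketch of the intended calculation: for each report date, count the users with at least 2 distinct usage days between that date minus 7 and that date, both ends included as with BETWEEN. The names `returning_users` and `daily_report` are assumptions for illustration. It uses the eight sample rows; the numbers in the question's desired table do not appear to be derived from those rows, so no attempt is made to reproduce them.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, List, Set, Tuple

Usage = List[Tuple[int, date]]


def returning_users(usage: Usage, as_of: date,
                    window_days: int = 7, min_days: int = 2) -> int:
    """Users with at least min_days distinct days in [as_of - window_days, as_of]."""
    start = as_of - timedelta(days=window_days)
    days_by_user: Dict[int, Set[date]] = defaultdict(set)
    for uid, day in usage:
        if start <= day <= as_of:
            days_by_user[uid].add(day)
    return sum(1 for days in days_by_user.values() if len(days) >= min_days)


def daily_report(usage: Usage, first_day: date, last_day: date) -> List[Tuple[date, int]]:
    """One (date, count) row per calendar day in the requested range."""
    span = (last_day - first_day).days
    return [(first_day + timedelta(days=n),
             returning_users(usage, first_day + timedelta(days=n)))
            for n in range(span + 1)]


usage = [
    (1, date(2012, 4, 28)), (1, date(2012, 4, 29)), (1, date(2012, 4, 30)),
    (2, date(2012, 4, 29)), (2, date(2012, 4, 30)), (2, date(2012, 5, 1)),
    (2, date(2012, 5, 21)), (2, date(2012, 5, 22)),
]
for row in daily_report(usage, date(2012, 4, 28), date(2012, 5, 3)):
    print(row)
```

With 2012-05-03 as the report date this gives 2, the same figure as the single-day query in the question.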

Oracle SQL Trending MTD Data

I am trying to solve a trending problem at work very similar to the below example. I think I have a method but don't know how to do it in SQL.
The input data is:
MTD LOC_ID RAINED
1-Apr-16 1 Y
1-Apr-16 2 N
1-May-16 1 N
1-May-16 2 N
1-Jun-16 1 N
1-Jun-16 2 N
1-Jul-16 1 Y
1-Jul-16 2 N
1-Aug-16 1 N
1-Aug-16 2 Y
The desired output is:
MTD LOC_ID RAINED TRENDS
1-Apr-16 1 Y New
1-May-16 1 N No Rain
1-Jun-16 1 N No Rain
1-Jul-16 1 Y Carryover
1-Aug-16 1 N No Rain
1-Apr-16 2 N No Rain
1-May-16 2 N No Rain
1-Jun-16 2 N No Rain
1-Jul-16 2 N No Rain
1-Aug-16 2 Y New
I'm trying to produce the output from the input by trending on MTD without depending on it. This way, when new months are added to the input, the output changes without editing the query.
The logic for TRENDS will occur on each unique LOC_ID. Trends will have three values: "New" in the first month RAINED is "Y", "Carryover" in any following months where RAINED is "Y", and "No Rain" in any months where RAINED is "N".
I'd like to automate this problem by introducing an intermediate step with a listagg. For example, for LOC_ID = "1":
MTD LOC_ID RAINED PREV_RAINED
1-Apr-16 1 Y (null) / 0 / (I don't care)
1-May-16 1 N Y
1-Jun-16 1 N Y;N
1-Jul-16 1 Y Y;N;N
1-Aug-16 1 N Y;N;N;Y
This way, to produce "TRENDS" in the output, I can say:
case when RAINED = 'Y' then
case when not regexp_like(PREV_RAINED, 'Y', 'i') then
'New'
else
'Carryover'
end
else
'No Rain'
end as TRENDS
My problem is that I'm not sure how to produce PREV_RAINED for each unique LOC_ID. I have a feeling it needs to combine LAG() statements and partition by LOC_ID order by MTD, but the number of lags I need to do depends on each month.
Is there an easy way to produce PREV_RAINED or a simpler way to solve my overall problem while preserving automation each month?
Thanks for reading all of this! :)
The SQL below has two parts in its inner query:
(i) ROW_NUMBER() numbers the rows within each (loc_id, rained) partition, ordered by mtd.
(ii) COUNT(*) gives the number of rows in each (loc_id, rained) partition.
With the row number in hand, a CASE expression derives the trend you asked for: the first 'Y' row for a location is 'New', and any later 'Y' row is 'Carryover'.
SELECT mtd,
loc_id,
rained,
CASE WHEN rained = 'N' THEN 'No Rain'
WHEN rained = 'Y' AND rn = 1 THEN 'New'
ELSE 'Carryover'
END AS Trends
FROM
(
SELECT mtd,
loc_id,
rained,
ROW_NUMBER() OVER ( PARTITION BY loc_id,rained ORDER BY mtd ) AS rn,
COUNT(*) OVER ( PARTITION BY loc_id,rained ) AS count_locid_rained
FROM INPUT
ORDER BY loc_id,mtd,rained,rn
) X;
Here is a solution for older versions. The WITH clause is for input data; the solution starts right after the WITH clause.
I'll work on a MATCH_RECOGNIZE solution next, I may add it to this answer.
with
input_data ( mtd, loc_id, rained ) as (
select to_date('1-Apr-16', 'dd-Mon-rr'), 1, 'Y' from dual union all
select to_date('1-Apr-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-May-16', 'dd-Mon-rr'), 1, 'N' from dual union all
select to_date('1-May-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-Jun-16', 'dd-Mon-rr'), 1, 'N' from dual union all
select to_date('1-Jun-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-Jul-16', 'dd-Mon-rr'), 1, 'Y' from dual union all
select to_date('1-Jul-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-Aug-16', 'dd-Mon-rr'), 1, 'N' from dual union all
select to_date('1-Aug-16', 'dd-Mon-rr'), 2, 'Y' from dual
)
select mtd, loc_id, rained,
case rained when 'N' then 'No Rain'
else case when rn = 1 then 'New'
else 'Carryover' end
end as trends
from ( select mtd, loc_id, rained,
row_number() over (partition by loc_id, rained order by mtd) rn
from input_data
)
order by loc_id, mtd
;
Output
MTD LOC_ID RAINED TRENDS
------------------- ---------- ------ ---------
01/04/2016 00:00:00 1 Y New
01/05/2016 00:00:00 1 N No Rain
01/06/2016 00:00:00 1 N No Rain
01/07/2016 00:00:00 1 Y Carryover
01/08/2016 00:00:00 1 N No Rain
01/04/2016 00:00:00 2 N No Rain
01/05/2016 00:00:00 2 N No Rain
01/06/2016 00:00:00 2 N No Rain
01/07/2016 00:00:00 2 N No Rain
01/08/2016 00:00:00 2 Y New
10 rows selected
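The ROW_NUMBER() trick works because, within one location, the first 'Y' row in date order is by definition the first month it rained. The same rule can be sketched in Python as a single pass that remembers which locations have already had rain; the function name `trends` and the tuple layout are assumptions for illustration.

```python
from datetime import date
from typing import List, Set, Tuple

Row = Tuple[date, int, str]


def trends(rows: List[Row]) -> List[Tuple[date, int, str, str]]:
    """Label each row 'New', 'Carryover' or 'No Rain', ordered by loc_id then mtd."""
    seen_rain: Set[int] = set()
    labelled = []
    for mtd, loc_id, rained in sorted(rows, key=lambda r: (r[1], r[0])):
        if rained != "Y":
            trend = "No Rain"
        elif loc_id in seen_rain:
            trend = "Carryover"
        else:
            trend = "New"
            seen_rain.add(loc_id)
        labelled.append((mtd, loc_id, rained, trend))
    return labelled


# Input data from the question (April to August 2016, two locations)
months = [date(2016, m, 1) for m in range(4, 9)]
loc1 = ["Y", "N", "N", "Y", "N"]
loc2 = ["N", "N", "N", "N", "Y"]
rows = [(m, 1, r) for m, r in zip(months, loc1)] + \
       [(m, 2, r) for m, r in zip(months, loc2)]
for row in trends(rows):
    print(row)
```

New months appended to the input are picked up without changing the code, which is the automation property the question asks for.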
Solution using MATCH_RECOGNIZE (requires Oracle 12c or later). Test the different solutions on your dataset; I am told that MATCH_RECOGNIZE may be significantly faster than other solutions, but this depends on many factors.
select loc_id, mtd, rained, trends
from input_data
match_recognize (
partition by loc_id, rained
order by mtd
measures mtd as mtd,
case when rained = 'N' then 'No Rain'
else case when match_number() = 1 then 'New' else 'Carryover' end
end as trends
pattern (a)
define a as 0 = 0
)
order by loc_id, mtd;

SQL - Count number of changes in an ordered list

Say I've got a table with two columns (date and price). If I select over a range of dates, then is there a way to count the number of price changes over time?
For instance:
Date | Price
22-Oct-11 | 3.20
23-Oct-11 | 3.40
24-Oct-11 | 3.40
25-Oct-11 | 3.50
26-Oct-11 | 3.40
27-Oct-11 | 3.20
28-Oct-11 | 3.20
In this case, I would like it to return a count of 4 price changes.
Thanks in advance.
You can use the analytic functions LEAD and LAG to access the prior and next rows of a result set, and then use that to see whether the price changed.
with t as (
select date '2011-10-22' dt, 3.2 price from dual union all
select date '2011-10-23', 3.4 from dual union all
select date '2011-10-24', 3.4 from dual union all
select date '2011-10-25', 3.5 from dual union all
select date '2011-10-26', 3.4 from dual union all
select date '2011-10-27', 3.2 from dual union all
select date '2011-10-28', 3.2 from dual
)
select sum(is_change)
from (
select dt,
price,
lag(price) over (order by dt) prior_price,
(case when lag(price) over (order by dt) != price
then 1
else 0
end) is_change
from t);
SUM(IS_CHANGE)
--------------
4
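The LAG comparison amounts to pairing each row with the one before it and counting the pairs that differ; the first row has no predecessor, so it never counts as a change. A short Python sketch of that idea, with the function name `count_changes` assumed for illustration and the prices assumed to be already sorted by date:

```python
from typing import List


def count_changes(prices: List[float]) -> int:
    """Count positions where the price differs from the previous row's price."""
    return sum(1 for prev, cur in zip(prices, prices[1:]) if prev != cur)


# 22-Oct-11 .. 28-Oct-11 from the question
prices = [3.20, 3.40, 3.40, 3.50, 3.40, 3.20, 3.20]
print(count_changes(prices))  # 4
```

Note that this counts transitions, not distinct prices: 3.40 appears before and after 3.50, and both moves count.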
Try this
select count(*)
from
(select date,price from table where date between X and Y
group by date,price )
Depending on the Oracle version, use either analytic functions (see the answer from Justin Cave) or this:
SELECT
SUM (CASE WHEN PREVPRICE != PRICE THEN 1 ELSE 0 END) CNTCHANGES
FROM
(
SELECT
C.DATE,
C.PRICE,
MAX ( D.PRICE ) PREVPRICE
FROM
(
SELECT
A.Date,
A.Price,
(SELECT MAX (B.DATE) FROM MyTable B WHERE B.DATE < A.DATE) PrevDate
FROM MyTable A
WHERE A.DATE BETWEEN YourStartDate AND YourEndDate
) C
INNER JOIN MyTable D ON D.DATE = C.PREVDATE
GROUP BY C.DATE, C.PRICE
)