LEAD and LAG functions omitting 0 - SQL

I want to skip 0 values when applying the LEAD and LAG functions.
For the highlighted row, prev should be 116.69635888009 and next should be 108.324381114468.
I am using the query below:
select
    snapshot_date,
    assetname,
    prev,
    monthly_avg_kw,
    next,
    (prev + next) / 2 as avg
FROM (
    select
        snapshot_date,
        assetname,
        monthly_avg_kw,
        LAG(monthly_avg_kw) OVER (PARTITION BY assetname ORDER BY snapshot_date ASC) as prev,
        LEAD(monthly_avg_kw) OVER (PARTITION BY assetname ORDER BY snapshot_date ASC) as next
    from 'TABLE'
)
where assetname = 'MI6.UPS-2A-2'

Using MySQL, I was able to create an example that produces the desired behavior.
Adding (b>0) to the ORDER BY causes the previous value to be taken only from rows where b > 0. For the row where b is 0 (which sorts first), this lag is NULL, so COALESCE falls back to the previous value based on the ORDER BY a alone.
CREATE TABLE test (
    a INTEGER,
    b INTEGER);

INSERT INTO test VALUES
    (1,1),
    (2,2),
    (3,0),
    (4,4),
    (5,5),
    (6,6);

SELECT
    a, b,
    COALESCE(lag(b) over (order by (b>0), a), lag(b) over (order by a)) as BB
FROM test
ORDER BY a;
output:

a | b | BB
-: | -: | -:
1 | 1 | 0
2 | 2 | 1
3 | 0 | 2
4 | 4 | 2
5 | 5 | 4
6 | 6 | 5
see: DBFIDDLE
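Applied back to the original query, the same trick looks roughly like the sketch below. It is untested and only a sketch: the table and column names are taken from the question, the boolean sort key (monthly_avg_kw > 0) assumes a dialect like MySQL that can order by a boolean expression (SQL Server or Oracle would need a CASE expression instead), and the COALESCE fallback mirrors the MySQL example above.

select
    snapshot_date,
    assetname,
    prev,
    monthly_avg_kw,
    next,
    (prev + next) / 2 as avg
from (
    select
        snapshot_date,
        assetname,
        monthly_avg_kw,
        -- ordering by (monthly_avg_kw > 0) puts the zero rows at the front of each
        -- partition, so for all but the first non-zero row LAG returns the previous
        -- non-zero value, and LEAD returns the next non-zero value (or NULL at the end);
        -- COALESCE falls back to the plain snapshot_date ordering where the trick yields NULL
        coalesce(
            lag(monthly_avg_kw) over (partition by assetname
                                      order by (monthly_avg_kw > 0), snapshot_date),
            lag(monthly_avg_kw) over (partition by assetname order by snapshot_date)
        ) as prev,
        coalesce(
            lead(monthly_avg_kw) over (partition by assetname
                                       order by (monthly_avg_kw > 0), snapshot_date),
            lead(monthly_avg_kw) over (partition by assetname order by snapshot_date)
        ) as next
    from 'TABLE'
)
where assetname = 'MI6.UPS-2A-2'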

Related

(SQL) Per ID, starting from the first row, return all successive rows with a value N greater than the prior returned row

I have the following example dataset:
ID | Value | Row index
-- | ----: | --------:
a  |     4 |  1
a  |     7 |  2
a  |    12 |  3
a  |    12 |  4
a  |    13 |  5
b  |     1 |  6
b  |     2 |  7
b  |     3 |  8
b  |     4 |  9
b  |     5 | 10

(Row index is for reference purposes only and does not need to exist in the final output.)
I would like to write a SQL script that, per ID and ordered ascending by [Value], starts from the first row and then returns each successive row whose Value is N or more greater than the previously returned row. For N = 3, the final table should look like the following:
ID | Value | Row index
-- | ----: | --------:
a  |     4 | 1
a  |     7 | 2
a  |    12 | 3
b  |     1 | 6
b  |     4 | 9
Can this script be written in a vectorised manner? Or must a loop be utilised? Any advice would be greatly appreciated. Thanks!
SQL tables represent unordered sets. There is no definition of "previous" value, unless you have a column that specifies the ordering. With such a column, you can use lag():
select t.*
from (select t.*,
             lag(value) over (partition by id order by <ordering column>) as prev_value
      from t
     ) t
where prev_value is null or prev_value <= value - 3;
EDIT:
I think I misunderstood what you want to do. You seem to want to start with the first row for each id, then get the next row whose value is 3 or more higher, then hold onto that value and get the next row that is 3 or more higher than that, and so on.
You can do this in SQL using a recursive CTE:
with ts as (
      select distinct t.id, t.value,
             dense_rank() over (partition by id order by value) as seqnum
      from t
     ),
     cte as (
      select id, value, value as grp_value, 1 as within_seqnum, seqnum
      from ts
      where seqnum = 1
      union all
      select ts.id, ts.value,
             (case when ts.value >= cte.grp_value + 3 then ts.value else cte.grp_value end),
             (case when ts.value >= cte.grp_value + 3 then 1 else cte.within_seqnum + 1 end),
             ts.seqnum
      from cte join
           ts
           on ts.id = cte.id and ts.seqnum = cte.seqnum + 1
     )
select *
from cte
where within_seqnum = 1
order by id, value;
Here is a db<>fiddle.
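To sanity-check the recursive CTE against the sample data, a minimal setup along these lines can be used (a sketch, assuming a SQL Server-style dialect where a recursive CTE needs no RECURSIVE keyword; the table name t matches the query above, and the reference-only row index column is omitted):

create table t (id varchar(10), value int);

insert into t (id, value) values
    ('a', 4), ('a', 7), ('a', 12), ('a', 12), ('a', 13),
    ('b', 1), ('b', 2), ('b', 3), ('b', 4), ('b', 5);

-- running the recursive CTE from the answer above against this table should
-- return (a,4), (a,7), (a,12), (b,1), (b,4), matching the expected output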

Can I count number of SUBSEQUENT rows with values larger than current row?

Row | Input | Output | Output Explanation
--: | ----: | -----: | ------------------
  1 | 14.93 |      6 | 6 because input value on rows 2 to 7 are smaller than row 1
  2 |  9.74 |      0 | 0 because input value on row 3 is larger than row 2
  3 | 12.89 |      0 | 0 because input value on row 4 is larger than row 3
  4 | 13.09 |      2 | 2 because input value on rows 5 to 6 are smaller than row 4
  5 |  7.84 |      0 | 0 because input value on row 6 is larger than row 5
  6 | 12.81 |      0 | 0 because input value on row 7 is larger than row 6
  7 | 13.15 |      0 | 0 because input value on row 8 is larger than row 7
  8 | 18.15 |      0 | 0 because input value in row 8 is last in series
Please can you help me define the SQL Server code for the logic in the table?
I have tried a number of different approaches including recursive CTEs, CAST, LEAD… OVER..., etc. My SQL skills are not up to this challenge, which seems to be easy to describe in words, but difficult to code!
Please note that the logic in the last row is different from the rest.
MAX output value should be 244.
declare @t table
(
    Row int,
    Input decimal(5,2)
);

insert into @t(Row, Input)
values
    (1, 14.93),
    (2, 9.74),
    (3, 12.89),
    (4, 13.09),
    (5, 7.84),
    (6, 12.81),
    (7, 13.15),
    (8, 18.15);

select *,
       case
           when lead(a.Input) over(order by a.Row) < a.Input then
           (
               select count(*) - count(xyz)
               from
               (
                   select case when b.Input < a.Input then null else b.Input end as xyz
                   from @t as b
                   where b.Row > a.Row
               ) as c
           )
           else 0
       end as Output
from @t as a;
I don't think this can easily be done with window functions. We need to iterate for each original row, while keeping track of the original value.
I would use a recursive query here:
with
    data as (
        select t.*, row_number() over(order by row) rn
        from mytable t
    ),
    cte as (
        select row, rn, input, 0 as output from data
        union all
        select c.row, d.rn, c.input, c.output + 1
        from cte c
        inner join data d on d.rn = c.rn + 1 and d.input < c.input
    )
select input, max(output) as output
from cte
group by row, input
order by row
For each row, the logic is to iteratively check the following rows. If the following value is smaller than the one on the original row, we increment the output counter; if it is not, the recursion stops for that row. Then all that is left to do is keep the greatest counter per original row.
Demo on DB Fiddle:
input | output
----: | -----:
14.93 | 6
9.74 | 0
12.89 | 0
13.09 | 2
7.84 | 0
12.81 | 0
13.15 | 0
18.15 | 0
You can do this with apply:
with t as (
      select t.*, row_number() over (order by row) as seqnum,
             1 + count(*) over () as cnt
      from mytable t
     )
select t.*, coalesce(coalesce(t2.min_seqnum, t.cnt) - t.seqnum - 1, 0) as output
from t outer apply
     (select min(t2.seqnum) as min_seqnum
      from t t2
      where t2.row > t.row and t2.input > t.input
     ) t2
order by row;
The idea is to find the next row that is bigger than the current row. The slight complication (why cnt is needed) is in case there is no larger row.
Here is a db<>fiddle.
You can use sub-query as follows:
WITH CTE AS
    (SELECT T.*,
            ROW_NUMBER() OVER (ORDER BY ROW) AS RN
     FROM YOUR_TABLE T)
SELECT C.ROW, C.INPUT,
       COALESCE((SELECT MIN(CC.RN) - C.RN - 1
                 FROM CTE CC
                 WHERE CC.INPUT > C.INPUT AND CC.RN > C.RN), 0) AS OUTPUT
FROM CTE C;
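As a quick check, the same idea can be wired directly to the sample rows from the first code block (a sketch, assuming SQL Server; YOUR_TABLE is simply replaced by the @t table variable):

declare @t table (Row int, Input decimal(5,2));

insert into @t(Row, Input) values
    (1, 14.93), (2, 9.74), (3, 12.89), (4, 13.09),
    (5, 7.84), (6, 12.81), (7, 13.15), (8, 18.15);

with cte as
    (select t.*,
            row_number() over (order by Row) as rn
     from @t t)
select c.Row, c.Input,
       -- distance to the next larger value minus one; 0 when no later row is larger
       coalesce((select min(cc.rn) - c.rn - 1
                 from cte cc
                 where cc.Input > c.Input and cc.rn > c.rn), 0) as Output
from cte c
order by c.Row;
-- expected Output column: 6, 0, 0, 2, 0, 0, 0, 0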

Query to group based on the sorted table result

Below is my table
a 1
a 2
a 1
b 1
a 2
a 2
b 3
b 2
a 1
My Expected output is
a 4
b 1
a 4
b 5
a 1
I want consecutive rows with the same value in the first column to be grouped together and their values summed.
If your dbms supports window functions, you can use the row_number difference to assign the same group to consecutive values (which are the same) in one column. After assigning the groups, it is easy to sum the values for each group.
select col1, sum(col2)
from (select t.*,
             row_number() over(order by someid)
             - row_number() over(partition by col1 order by someid) as grp
      from tablename t
     ) x
group by col1, grp
Replace tablename, col1,col2,someid with the appropriate column names. someid should be the column to be ordered by.
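For example, a self-contained sketch against the sample data (the someid ordering column is an assumption here, since the question's table shows no explicit ordering column; it assumes a database with window functions and multi-row INSERT ... VALUES, e.g. SQL Server, MySQL 8+, or Postgres):

create table tablename (someid int, col1 varchar(1), col2 int);

insert into tablename (someid, col1, col2) values
    (1, 'a', 1), (2, 'a', 2), (3, 'a', 1),
    (4, 'b', 1),
    (5, 'a', 2), (6, 'a', 2),
    (7, 'b', 3), (8, 'b', 2),
    (9, 'a', 1);

-- the overall row_number minus the per-col1 row_number is constant inside each
-- consecutive run of the same col1 value, so grouping by (col1, grp) sums each run:
-- a 4, b 1, a 4, b 5, a 1
select col1, sum(col2) as total
from (select t.*,
             row_number() over (order by someid)
             - row_number() over (partition by col1 order by someid) as grp
      from tablename t
     ) x
group by col1, grp
order by min(x.someid);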

oracle dates group

How can I get an optimized query for this?
date_one | date_two
------------------------
01.02.1999 | 31.05.2003
01.01.2004 | 01.01.2010
02.01.2010 | 10.10.2011
11.10.2011 | (null)
I need to get this
date_one | date_two | group
------------------------------------
01.02.1999 | 31.05.2003 | 1
01.01.2004 | 01.01.2010 | 2
02.01.2010 | 10.10.2011 | 2
11.10.2011 | (null) | 2
The group number is assigned as follows. Order the rows by date_one ascending. First row gets group = 1. Then for each row if date_one is the date immediately following date_two of the previous row, the group number stays the same as in the previous row, otherwise it increases by one.
You can do this using a left join and a cumulative sum:
select t.*,
       sum(case when tprev.date_one is null then 1 else 0 end)
           over (order by t.date_one) as grp
from t left join
     t tprev
     on t.date_one = tprev.date_two + 1;
The idea is to find where the gaps begin (using the left join) and then do a cumulative sum of such beginnings to define the group.
If you want to be more inscrutable, you could write this as:
select t.*,
       count(*) over (order by t.date_one)
       - count(tprev.date_one) over (order by t.date_one) as grp
from t left join
     t tprev
     on t.date_one = tprev.date_two + 1;
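A quick way to verify this against the sample rows (a sketch, assuming Oracle as in the question; the inline t data mirrors the transcript in the next answer):

with t (date_one, date_two) as (
    select to_date('01.02.1999','dd.mm.yyyy'), to_date('31.05.2003','dd.mm.yyyy') from dual union all
    select to_date('01.01.2004','dd.mm.yyyy'), to_date('01.01.2010','dd.mm.yyyy') from dual union all
    select to_date('02.01.2010','dd.mm.yyyy'), to_date('10.10.2011','dd.mm.yyyy') from dual union all
    select to_date('11.10.2011','dd.mm.yyyy'), null from dual
)
select t.*,
       -- a row whose date_one has no matching previous date_two starts a new group;
       -- the running sum of those starts is the group number (expected: 1, 2, 2, 2)
       sum(case when tprev.date_one is null then 1 else 0 end)
           over (order by t.date_one) as grp
from t
left join t tprev
       on t.date_one = tprev.date_two + 1
order by t.date_one;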
One way is using window functions:
select
    date_one,
    date_two,
    sum(x) over (order by date_one) grp
from (
    select
        t.*,
        case when lag(date_two) over (order by date_one) + 1 = date_one
             then 0 else 1 end x
    from t
);
It uses the analytic function lag to fetch date_two from the previous row and checks whether it is contiguous with date_one of the current row (in increasing order of date_one).
How it works:
lag(date_two) over (order by date_one)
(In the below explanation, when I say first, next, previous or last row, it's based on increasing order of date_one with null values at the end)
The above produces NULL for the first row, as there is no row before it to take date_two from, and the previous row's date_two for each subsequent row.
case when lag(date_two) over (order by date_one) + 1 = date_one
     then 0
     else 1 end
Since lag produces NULL for the very first row (and NULL = anything never evaluates to true), the output of the case expression is 1 for that row.
For the remaining rows, the same check produces a new column x in the query output, which is 1 when the previous row's date_two is not contiguous with the current row's date_one.
Finally, a running sum over x gives the required group values. See the value of x below:
with t (date_one, date_two) as (
    select to_date('01.02.1999','dd.mm.yyyy'), to_date('31.05.2003','dd.mm.yyyy') from dual union all
    select to_date('01.01.2004','dd.mm.yyyy'), to_date('01.01.2010','dd.mm.yyyy') from dual union all
    select to_date('02.01.2010','dd.mm.yyyy'), to_date('10.10.2011','dd.mm.yyyy') from dual union all
    select to_date('11.10.2011','dd.mm.yyyy'), null from dual
)
select
    date_one,
    date_two,
    x,
    sum(x) over (order by date_one) grp
from (
    select
        t.*,
        case when lag(date_two) over (order by date_one) + 1 = date_one
             then 0 else 1 end x
    from t
);

DATE_ONE  DATE_TWO           X        GRP
--------- --------- ---------- ----------
01-FEB-99 31-MAY-03          1          1
01-JAN-04 01-JAN-10          1          2
02-JAN-10 10-OCT-11          0          2
11-OCT-11                    0          2

SQL Local Minima and Maxima

I have this data:
row_id type value
1 a 1
2 a 2
3 a 3
4 a 5 --note that type a, value 4 is missing
5 a 6
6 a 7
7 b 1
8 b 2
9 b 3
10 b 4
11 b 5 --note that type b is missing no values from 1 to 5
12 c 1
13 c 3 --note that type c, value 2 is missing
I want to find the minimum and maximum values for each consecutive "run" within each type. That is, I want to return
row_id type group_num min_value max_value
1 a 1 1 3
2 a 2 5 7
3 b 1 1 5
4 c 1 1 1
5 c 2 3 3
I am a fairly experienced SQL user, but I've never solved this problem. Obviously I know how to get the overall minimum and maximum for each type, using GROUP, MIN, and MAX, but I'm really at a loss for these local minima and maxima. I haven't found anything on other questions that answers my question.
I'm using PLSQL Developer with Oracle 11g. Thanks!
This is a gaps-and-islands problem. You can use an analytic function trick to find the chains of contiguous values for each type:
select type,
       min(value) as min_value,
       max(value) as max_value
from (
    select type, value,
           dense_rank() over (partition by type order by value)
           - dense_rank() over (partition by null order by value) as chain
    from your_table
)
group by type, chain
order by type, min(value);
The inner query uses the difference between the ranking of the values within the type and within the entire result set to create the 'chain' number. The outer query just uses that for the grouping.
SQL Fiddle including the result of the inner query.
This is one way to achieve the result you require:
with step_1 as (
    select w.type,
           w.value,
           w.value - row_number() over (partition by w.type order by w.row_id) as grp
    from window_test w
), step_2 as (
    select x.type,
           x.value,
           dense_rank() over (partition by x.type order by x.grp) as grp
    from step_1 x
)
select rank() over (order by y.type, y.grp) as row_id,
       y.type,
       y.grp as group_num,
       min(y.value) as min_val,
       max(y.value) as max_val
from step_2 y
group by y.type, y.grp
order by 1;
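For reference, a setup sketch for trying both answers on the sample data (assuming Oracle 11g as stated in the question; the table name window_test is taken from the second answer, and your_table in the first answer would point at the same data):

create table window_test (
    row_id integer,
    type   varchar2(1),
    value  integer
);

insert into window_test (row_id, type, value)
    select  1, 'a', 1 from dual union all
    select  2, 'a', 2 from dual union all
    select  3, 'a', 3 from dual union all
    select  4, 'a', 5 from dual union all   -- type a, value 4 is missing
    select  5, 'a', 6 from dual union all
    select  6, 'a', 7 from dual union all
    select  7, 'b', 1 from dual union all
    select  8, 'b', 2 from dual union all
    select  9, 'b', 3 from dual union all
    select 10, 'b', 4 from dual union all
    select 11, 'b', 5 from dual union all
    select 12, 'c', 1 from dual union all
    select 13, 'c', 3 from dual;            -- type c, value 2 is missing

commit;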