How to get change points in an Oracle SELECT query?

How can I select change points from this data set?
1 0
2 0
3 0
4 100
5 100
6 100
7 100
8 0
9 0
10 0
11 100
12 100
13 0
14 0
15 0
I want this result
4 7 100
11 12 100

This query based on analytic functions lag() and lead() gives expected output:
select id, nid, point
from (
select id, point, p1, lead(id) over (order by id) nid
from (
select id, point,
decode(lag(point) over (order by id), point, 0, 1) p1,
decode(lead(point) over (order by id), point, 0, 2) p2
from test)
where p1<>0 or p2<>0)
where p1=1 and point<>0
SQLFiddle
Edit: You may want to change line 3 in case there is only one row for a change point:
...
select id, point, p1,
case when p1=1 and p2=2 then id else lead(id) over (order by id) end nid
...

This is simple to do with the ROW_NUMBER analytic function together with MIN and MAX.
This is a frequently asked question about finding intervals (series) of consecutive values while skipping the gaps. I like the name Aketi Jyuuzou gave to this approach: the Tabibitosan method.
For example,
SQL> SELECT MIN(a),
            MAX(a),
            b
     FROM
     ( SELECT a, b, a - ROW_NUMBER() OVER (ORDER BY a) AS rn FROM t WHERE b <> 0
     )
     GROUP BY rn,
              b
     ORDER BY MIN(a);
MIN(A) MAX(A) B
---------- ---------- ----------
4 7 100
11 12 100
SQL>
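To see why this works: within a run of consecutive a values, a and ROW_NUMBER() both grow by 1, so their difference stays constant and can serve as a group key. Here is a minimal sketch that runs the same query against the question's data. It uses Python's sqlite3 module as a stand-in for Oracle (my assumption, just to make it easy to run; it needs SQLite 3.25 or later for window functions). The table name t and columns a, b follow the answer above.

```python
import sqlite3

# Sample data from the question: (a, b) = (id, point)
rows = [(1, 0), (2, 0), (3, 0), (4, 100), (5, 100), (6, 100), (7, 100),
        (8, 0), (9, 0), (10, 0), (11, 100), (12, 100),
        (13, 0), (14, 0), (15, 0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# After filtering out b = 0, consecutive a values share the same
# (a - row_number) difference, so grouping by it yields one row per run.
query = """
SELECT MIN(a), MAX(a), b
FROM (
    SELECT a, b,
           a - ROW_NUMBER() OVER (ORDER BY a) AS grp
    FROM t
    WHERE b <> 0
)
GROUP BY grp, b
ORDER BY MIN(a)
"""
islands = conn.execute(query).fetchall()
conn.close()

print(islands)
```

For the filtered rows 4, 5, 6, 7, 11, 12 the row numbers are 1 to 6, so the differences are 3, 3, 3, 3, 5, 5, which gives the two groups.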

Related

Oracle SQL: I want to select the top 3 records [duplicate]

This question already has answers here:
How do I limit the number of rows returned by an Oracle query after ordering?
I want to select the top 3 records, ordered by 'cnt' descending.
These are the top 4 rows:
a b c cnt
99 YC 市購件異常 3
99 LY 漏油 2
99 QT16 其他異常 2
99 JGSH 機構損壞 1
Then I run
select * from () where rownum <= 3 order by cnt desc
and get this data:
99 YC 市購件異常 3
99 LY 漏油 2
99 JGSH 機構損壞 1
but I want to get:
99 YC 市購件異常 3
99 LY 漏油 2
99 QT16 其他異常 2
Try this:
SELECT T.a, T.b, T.c, T.cnt
FROM
(
SELECT TEST_TBL.*, RANK() OVER (PARTITION BY a ORDER BY cnt DESC) RNK
FROM TEST_TBL
) T
WHERE T.RNK <= 3
It looks like you want to keep "duplicates" (in the cnt column) in the result.
In that case, I'd say the row_number analytic function is what helps:
Sample data:
SQL> with test (a, b, cnt) as
2 (select 99, 'yc' , 3 from dual union all
3 select 99, 'ly' , 2 from dual union all
4 select 99, 'qt16', 2 from dual union all
5 select 99, 'jgsh', 1 from dual union all
6 --
7 select 99, 'abc' , 2 from dual --> yet another row with CNT = 2
8 ),
Query begins here: first rank rows (line #11), and then return the top 3 (line #15):
9 temp as
10 (select a, b, cnt,
11 row_number() over (partition by a order by cnt desc) rnk
12 from test
13 )
14 select * from temp
15 where rnk <= 3;
A B CNT RNK
---------- ---- ---------- ----------
99 yc 3 1
99 ly 2 2
99 abc 2 3
SQL>
If you use the rank analytic function instead (as Hana suggested), you might get more than the desired 3 rows (see the rnk column's values). It depends on the data: rank works with the data you posted, but as soon as more rows share the same cnt value, it no longer returns exactly 3 rows:
<snip>
9 temp as
10 (select a, b, cnt,
11 rank() over (partition by a order by cnt desc) rnk
12 from test
13 )
14 select * from temp
15 where rnk <= 3;
A B CNT RNK
---------- ---- ---------- ----------
99 yc 3 1
99 ly 2 2
99 abc 2 2
99 qt16 2 2
SQL>
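To compare the ranking functions side by side, here is a small sketch on the same sample rows. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption for easy testing; SQLite 3.25 or later is needed). The extra tie-breaker column b in the ROW_NUMBER ordering is also my addition, so that the order among equal cnt values is repeatable. DENSE_RANK is included only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INTEGER, b TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(99, "yc", 3), (99, "ly", 2), (99, "qt16", 2),
     (99, "jgsh", 1), (99, "abc", 2)],
)

# ROW_NUMBER: always 1, 2, 3, ... (ties are split).
# RANK: ties share a rank and the following ranks are skipped.
# DENSE_RANK: ties share a rank and no ranks are skipped.
query = """
SELECT b, cnt,
       ROW_NUMBER() OVER (PARTITION BY a ORDER BY cnt DESC, b) AS rn,
       RANK()       OVER (PARTITION BY a ORDER BY cnt DESC)    AS rnk,
       DENSE_RANK() OVER (PARTITION BY a ORDER BY cnt DESC)    AS drnk
FROM test
ORDER BY cnt DESC, b
"""
ranked = conn.execute(query).fetchall()
conn.close()

# Rows that survive a "<= 3" filter on each ranking column.
top3_row_number = [r for r in ranked if r[2] <= 3]
top3_rank = [r for r in ranked if r[3] <= 3]

for row in ranked:
    print(row)
```

With three rows tied at cnt = 2, filtering on rn keeps exactly 3 rows, while filtering on rnk keeps 4.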

Is it possible to use an aggregate function over partition by as a case condition in SQL?

The problem is to calculate the median from a table that has two columns: one specifies a number and the other specifies the frequency of that number.
For example:
Table "Numbers":
Num Freq
1 3
2 3
The median needs to be found for the flattened array with values:
1,1,1,2,2,2
Query:
with ct1 as
(select num,frequency, sum(frequency) over(order by num) as sf from numbers o)
select case when count(num) over(order by num) = 1 then num
when count(num) over (order by num) > 1 then sum(num)/2 end median
from ct1 b where sf <= (select max(sf)/2 from ct1) or (sf-frequency) <= (select max(sf)/2 from ct1)
Is it not possible to use count(num) over(order by num) as the condition in the case statement?
Find the relevant row (or 2 rows) based on the accumulated frequencies, and take the average of num.
The example and Fiddle also show the computations leading to the result.
If you already know that num is unique, rowid can be removed from the ORDER BY clauses.
with
t1 as
(
select t.*
,nvl(sum(freq) over (order by num,rowid rows between unbounded preceding and 1 preceding),0) as freq_acc_sum_1
,sum(freq) over (order by num, rowid) as freq_acc_sum_2
,sum(freq) over () as freq_sum
from t
)
select t1.*
,case
when freq_sum/2 between freq_acc_sum_1 and freq_acc_sum_2
then 'V'
end as relevant_record
from t1
order by num, rowid
Fiddle
Example:
ID NUM FREQ FREQ_ACC_SUM_1 FREQ_ACC_SUM_2 FREQ_SUM RELEVANT_RECORD
7 8 1 0 1 18
5 10 1 1 2 18
1 29 3 2 5 18
6 31 1 5 6 18
3 33 2 6 8 18
4 41 1 8 9 18 V
9 49 2 9 11 18 V
2 52 1 11 12 18
8 56 3 12 15 18
10 92 3 15 18 18
MEDIAN
45
Fiddle for 1M records
You can find the one (or two) middle value(s) and then average:
SELECT AVG(num) AS median
FROM (
SELECT num,
freq,
SUM(freq) OVER (ORDER BY num) AS cum_freq,
(SUM(freq) OVER () + 1)/2 AS median_freq
FROM table_name
)
WHERE cum_freq - freq < median_freq
AND median_freq < cum_freq + 1
Or, expand the values using a LATERAL join to a hierarchical query and then use the MEDIAN function:
SELECT MEDIAN(num) AS median
FROM table_name t
CROSS JOIN LATERAL (
SELECT LEVEL
FROM DUAL
WHERE freq > 0
CONNECT BY LEVEL <= freq
)
Which, for the sample data:
CREATE TABLE table_name (Num, Freq) AS
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 3 FROM DUAL;
Outputs:
MEDIAN
1.5
(Note: For your sample data, there are 6 items, an even number, so the MEDIAN will be halfway between the values of the 3rd and 4th items; so halfway between 1 and 2 = 1.5.)
db<>fiddle here
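As a cross-check of the cumulative-frequency query above, here is a sketch that runs it through Python's sqlite3 module and compares the result with the median of the expanded list. SQLite is only a stand-in for Oracle here (my assumption). One difference matters: SQLite does integer division on integers, so the sketch divides by 2.0, whereas Oracle's /2 already yields a fraction. The helper names median_from_freq and median_by_expansion are made up for this example.

```python
import sqlite3
from statistics import median

# Same logic as the first query in the answer; "/ 2.0" avoids
# SQLite's integer division (not needed in Oracle).
QUERY = """
SELECT AVG(num) AS median
FROM (
    SELECT num,
           freq,
           SUM(freq) OVER (ORDER BY num) AS cum_freq,
           (SUM(freq) OVER () + 1) / 2.0 AS median_freq
    FROM table_name
)
WHERE cum_freq - freq < median_freq
  AND median_freq < cum_freq + 1
"""


def median_from_freq(pairs):
    """Median of (num, freq) pairs using cumulative frequencies in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (num INTEGER, freq INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?, ?)", pairs)
    (result,) = conn.execute(QUERY).fetchone()
    conn.close()
    return result


def median_by_expansion(pairs):
    """Reference: repeat each num freq times, then take the plain median."""
    return median([num for num, freq in pairs for _ in range(freq)])


question_data = [(1, 3), (2, 3)]
example_data = [(8, 1), (10, 1), (29, 3), (31, 1), (33, 2),
                (41, 1), (49, 2), (52, 1), (56, 3), (92, 3)]

print(median_from_freq(question_data), median_by_expansion(question_data))
print(median_from_freq(example_data), median_by_expansion(example_data))
```

The second data set is the 10-row example from the other answer, where the two middle items are 41 and 49.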

PSQL, adding a "step increasing" column

I have these values in a table column (select a from tab):
a
1
2
3
4
5
6
7
15
16
18
Using a variable = 3, how can I create a column b, starting with min(a), with the following values?
a b
1 1
2 1
3 1
4 4
5 4
6 4
7 7
15 15
17 15
18 18
Something like: for each a (in order), keep the current value of b for a span of at most 3, otherwise reset it.
Thanks,
AAWNSD
I think you want window functions and groups of three based on arithmetic on a:
select a,
min(a) over (partition by ceiling(a / 3.0)) as b
from tab;
Here is a db<>fiddle.
Hmmm . . . I realize that the above returns "16" for the last row rather than 18. My above interpretation may not be correct. You may be saying that you want groups -- once they start -- to never exceed the group starting value plus 2.
If so, one approach is a recursive CTE:
with recursive tt as (
select a, row_number() over (order by a) as seqnum
from tab
),
cte as (
select a, seqnum, a as grp
from tt
where seqnum = 1
union all
select tt.a, tt.seqnum,
(case when tt.a <= grp + 2 then grp else tt.a end)
from cte join
tt
on tt.seqnum = cte.seqnum + 1
)
select *
from cte;
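For what it's worth, here is a sketch of the second interpretation (a group keeps its starting value until a reaches start + 3). It runs the recursive CTE through Python's sqlite3 module, which is an assumption on my side as a stand-in for PostgreSQL, and checks it against a plain loop. The :step parameter and the helper step_groups are made up for this example; the input is the column a from the question.

```python
import sqlite3

values = [1, 2, 3, 4, 5, 6, 7, 15, 16, 18]
STEP = 3  # the "variable = 3" from the question

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (a INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?)", [(v,) for v in values])

# Walk the rows in order; carry the group start forward until
# a row falls outside [start, start + step), then start a new group.
query = """
WITH RECURSIVE tt AS (
    SELECT a, ROW_NUMBER() OVER (ORDER BY a) AS seqnum
    FROM tab
),
cte AS (
    SELECT a, seqnum, a AS grp
    FROM tt
    WHERE seqnum = 1
    UNION ALL
    SELECT tt.a, tt.seqnum,
           CASE WHEN tt.a < cte.grp + :step THEN cte.grp ELSE tt.a END
    FROM cte
    JOIN tt ON tt.seqnum = cte.seqnum + 1
)
SELECT a, grp FROM cte ORDER BY a
"""
from_sql = conn.execute(query, {"step": STEP}).fetchall()
conn.close()


def step_groups(sorted_values, step):
    """Plain-Python version of the same rule, for comparison."""
    result, start = [], None
    for a in sorted_values:
        if start is None or a >= start + step:
            start = a
        result.append((a, start))
    return result


from_loop = step_groups(sorted(values), STEP)
print(from_sql)
```

The group value depends on the previous row's group, which is why a single window function is not enough and some form of iteration (here the recursive CTE) is needed.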

What is the most efficient SQL query to find the max N values for every entity in a table?

I wrote these 2 queries: the first one keeps duplicates and the second one drops them.
Does anyone know a more efficient way to achieve this?
The queries are for MSSQL and return the top 3 values.
1-
SELECT TMP.entity_id, TMP.value
FROM(
SELECT TAB.entity_id, LEAD(TAB.entity_id, 3, 0) OVER(ORDER BY TAB.entity_id, TAB.value) AS next_id, TAB.value
FROM mytable TAB
) TMP
WHERE TMP.entity_id <> TMP.next_id
2-
SELECT TMP.entity_id, TMP.value
FROM(
SELECT TMX.entity_id, LEAD(TMX.entity_id, 3, 0) OVER(ORDER BY TMX.entity_id, TMX.value) AS next_id, TMX.value
FROM(
SELECT TAB.entity_id, LEAD(TAB.entity_id, 1, 0) OVER(ORDER BY TAB.entity_id, TAB.value) AS next_id, TAB.value, LEAD(TAB.value, 1, 0) OVER(ORDER BY TAB.entity_id, TAB.value) AS next_value
FROM mytable TAB
) TMX
WHERE TMX.entity_id <> TMX.next_id OR TMX.value <> TMX.next_value
) TMP
WHERE TMP.entity_id <> TMP.next_id
Example:
Table:
entity_id value
--------- -----
1 9
1 11
1 12
1 3
2 25
2 25
2 5
2 37
3 24
3 9
3 2
3 15
Result Query 1 (25 appears twice for entity_id 2):
entity_id value
--------- -----
1 9
1 11
1 12
2 25
2 25
2 37
3 9
3 15
3 24
Result Query 2 (25 appears only once for entity_id 2):
entity_id value
--------- -----
1 9
1 11
1 12
2 5
2 25
2 37
3 9
3 15
3 24
You can use ROW_NUMBER, which keeps duplicates, as follows:
select entity_id, value from
(select t.*, row_number() over (partition by entity_id order by value desc) as rn
from your_Table t) x where rn <= 3
You can use DENSE_RANK together with DISTINCT to remove the duplicates as follows:
select distinct entity_id, value from
(select t.*, dense_rank() over (partition by entity_id order by value desc) as rn
from your_Table t) x where rn <= 3
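Here is a quick sketch that checks both variants against the example table from the question. It uses Python's sqlite3 module as a stand-in for MSSQL (my assumption, to keep it runnable anywhere; SQLite 3.25 or later). For the duplicate-free variant it uses DENSE_RANK, because that numbers distinct values 1, 2, 3 without gaps, so the three highest distinct values survive the filter. The final ORDER BY is added only to make the output comparable with the question's result listings.

```python
import sqlite3

rows = [(1, 9), (1, 11), (1, 12), (1, 3),
        (2, 25), (2, 25), (2, 5), (2, 37),
        (3, 24), (3, 9), (3, 2), (3, 15)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (entity_id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

# Variant 1: top 3 rows per entity, duplicates kept.
keep_dups_sql = """
SELECT entity_id, value
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY entity_id
                              ORDER BY value DESC) AS rn
    FROM mytable t
) x
WHERE rn <= 3
ORDER BY entity_id, value
"""

# Variant 2: top 3 distinct values per entity, duplicates dropped.
drop_dups_sql = """
SELECT DISTINCT entity_id, value
FROM (
    SELECT t.*,
           DENSE_RANK() OVER (PARTITION BY entity_id
                              ORDER BY value DESC) AS rn
    FROM mytable t
) x
WHERE rn <= 3
ORDER BY entity_id, value
"""

with_dups = conn.execute(keep_dups_sql).fetchall()
without_dups = conn.execute(drop_dups_sql).fetchall()
conn.close()

print(with_dups)
print(without_dups)
```

For entity_id 2 the first variant keeps both rows with 25, while the second returns 5, 25 and 37.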

Can I start a new group when value changes from 0 to 1?

Can I somehow assign a new group to a row when a value in a column changes in T-SQL?
I would be grateful for a solution that works for an unlimited number of repeating values, without CTEs and functions. I made a solution that works for up to 100 consecutive identical numbers (with
coalesce(lag() over(), lag() over(), lag() over())), but it is too bulky,
and I cannot make it work for an unlimited number of consecutive identical numbers.
Data
id somevalue
1 0
2 1
3 1
4 0
5 0
6 1
7 1
8 1
9 0
10 0
11 1
12 0
13 1
14 1
15 0
16 0
Expected
id somevalue group
1 0 1
2 1 2
3 1 2
4 0 3
5 0 3
6 1 4
7 1 4
8 1 4
9 0 5
10 0 5
11 1 6
12 0 7
13 1 8
14 1 8
15 0 9
16 0 9
If you just want a group identifier, you can use:
select t.*,
min(id) over (partition by somevalue, seqnum - seqnum_1) as grp
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by somevalue order by id) as seqnum_1
from t
) t;
If you want them enumerated . . . well, you can enumerate the id above using dense_rank(). Or you can use lag() and a cumulative sum:
select t.*,
sum(case when somevalue = prev_sv then 0 else 1 end) over (order by id) as grp
from (select t.*,
lag(somevalue) over (order by id) as prev_sv
from t
) t;
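The lag() plus cumulative sum idea can be checked against the question's data with a short sketch. It runs through Python's sqlite3 module as a stand-in for SQL Server (an assumption for easy testing; SQLite 3.25 or later). The table name t and the derived-table alias s are placeholders.

```python
import sqlite3

# somevalue column from the question, for id = 1..16
somevalues = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rows = list(enumerate(somevalues, start=1))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, somevalue INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Step 1 (inner query): fetch the previous row's value with LAG.
# Step 2 (outer query): flag each change with 1 and take a running sum.
# The first row has prev_sv = NULL, so the comparison is not true and
# the ELSE branch gives 1, which starts the numbering at group 1.
query = """
SELECT id, somevalue,
       SUM(CASE WHEN somevalue = prev_sv THEN 0 ELSE 1 END)
           OVER (ORDER BY id) AS grp
FROM (
    SELECT id, somevalue,
           LAG(somevalue) OVER (ORDER BY id) AS prev_sv
    FROM t
) s
ORDER BY id
"""
groups = [grp for _, _, grp in conn.execute(query)]
conn.close()

print(groups)
```

This works for runs of any length, because each row only looks one row back and the running sum does the rest.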
Here's a different approach:
First I created a view to provide the group increment on each row:
create view increments as
select
n2.id,n2.somevalue,
case when n1.somevalue=n2.somevalue then 0 else 1 end as increment
from
(select 0 as id,1 as somevalue union all select * from mytable) n1
join mytable n2
on n2.id = n1.id+1
Then I used this view to produce the group values as cumulative sums of the increments:
select id, somevalue,
(select sum(increment) from increments i1 where i1.id <= i2.id) as grp
from increments i2