How to count consecutive duplicates in a table?

I have the following question:
I want to find the consecutive duplicates in this table:
SLNO NAME PG
1 A1 NO
2 A2 YES
3 A3 NO
4 A4 YES
6 A5 YES
7 A6 YES
8 A7 YES
9 A8 YES
10 A9 YES
11 A10 NO
12 A11 YES
13 A12 NO
14 A14 NO
Only the value of the PG column matters. I need the output to be 6, which is the length of the longest run of consecutive duplicates (the YES rows from SLNO 4 to SLNO 10).
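As a cross-check outside the database, here is a minimal Python sketch of what "longest run of consecutive duplicates" means for the sample data. The function name `longest_run` and the tuple layout are assumptions made for illustration, not part of the question.

```python
from itertools import groupby

# Rows from the question as (slno, name, pg); SLNO 5 is missing, as in the sample.
rows = [
    (1, "A1", "NO"), (2, "A2", "YES"), (3, "A3", "NO"), (4, "A4", "YES"),
    (6, "A5", "YES"), (7, "A6", "YES"), (8, "A7", "YES"), (9, "A8", "YES"),
    (10, "A9", "YES"), (11, "A10", "NO"), (12, "A11", "YES"),
    (13, "A12", "NO"), (14, "A14", "NO"),
]


def longest_run(rows):
    """Return (pg_value, length) of the longest run of equal PG values in SLNO order."""
    ordered = sorted(rows, key=lambda r: r[0])
    # groupby only merges adjacent rows, which is exactly "consecutive duplicates".
    runs = [(pg, len(list(group))) for pg, group in groupby(ordered, key=lambda r: r[2])]
    return max(runs, key=lambda run: run[1])


print(longest_run(rows))  # ('YES', 6)
```

Any SQL answer below should agree with this value for the sample data.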

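It can be done with a method related to Tabibitosan: flag every row where PG differs from the previous row, then take a running sum of the flags. Run this to understand it: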
with a as(
select 1 slno, 'A' pg from dual union all
select 2 slno, 'A' pg from dual union all
select 3 slno, 'B' pg from dual union all
select 4 slno, 'A' pg from dual union all
select 5 slno, 'A' pg from dual union all
select 6 slno, 'A' pg from dual
)
select slno, pg, newgrp, sum(newgrp) over (order by slno) grp
from(
select slno,
pg,
case when pg <> nvl(lag(pg) over (order by slno),1) then 1 else 0 end newgrp
from a
);
NEWGRP is 1 when a new group starts; GRP is the running sum of NEWGRP, so all rows of the same run share one GRP value.
Result:
SLNO PG NEWGRP GRP
1 A 1 1
2 A 0 1
3 B 1 2
4 A 1 3
5 A 0 3
6 A 0 3
Now, just use a group by with count, to find the group with maximum number of occurrences:
with a as(
select 1 slno, 'A' pg from dual union all
select 2 slno, 'A' pg from dual union all
select 3 slno, 'B' pg from dual union all
select 4 slno, 'A' pg from dual union all
select 5 slno, 'A' pg from dual union all
select 6 slno, 'A' pg from dual
),
b as(
select slno, pg, newgrp, sum(newgrp) over (order by slno) grp
from(
select slno, pg, case when pg <> nvl(lag(pg) over (order by slno),1) then 1 else 0 end newgrp
from a
)
)
select max(cnt)
from (
select grp, count(*) cnt
from b
group by grp
);
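The same two steps (flag a change, then take a running sum) can be traced in a short Python sketch. This only mirrors the logic of the query above on its six sample rows; `label_groups` is a hypothetical helper name.

```python
from collections import Counter
from itertools import accumulate

# (slno, pg) rows from the small example above.
data = [(1, "A"), (2, "A"), (3, "B"), (4, "A"), (5, "A"), (6, "A")]


def label_groups(data):
    """Return (newgrp, grp) lists, mirroring the LAG flag and the running SUM."""
    pgs = [pg for _, pg in sorted(data)]
    # newgrp is 1 when pg differs from the previous row; the first row always starts a group.
    newgrp = [1 if i == 0 or pg != pgs[i - 1] else 0 for i, pg in enumerate(pgs)]
    # grp is the running sum of newgrp, like sum(newgrp) over (order by slno).
    grp = list(accumulate(newgrp))
    return newgrp, grp


newgrp, grp = label_groups(data)
max_count = max(Counter(grp).values())  # like max(count(*)) per grp

print(newgrp)     # [1, 0, 1, 1, 0, 0]
print(grp)        # [1, 1, 2, 3, 3, 3]
print(max_count)  # 3
```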

with test as (
select 1 slno,'A1' name ,'NO' pg from dual union all
select 2,'A2','YES' from dual union all
select 3,'A3','NO' from dual union all
select 4,'A4','YES' from dual union all
select 6,'A5','YES' from dual union all
select 7,'A6','YES' from dual union all
select 8,'A7','YES' from dual union all
select 9,'A8','YES' from dual union all
select 10,'A9','YES' from dual union all
select 11,'A10','NO' from dual union all
select 12,'A11','YES' from dual union all
select 13,'A12','NO' from dual union all
select 14,'A14','NO' from dual),
consecutive as (select row_number() over(order by slno) rr, x.*
from test x)
select x.* from Consecutive x
left join Consecutive y on x.rr = y.rr+1 and x.pg = y.pg
where y.rr is not null
order by x.slno
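You can control the output with the condition in the WHERE clause:
where y.rr is not null returns the rows that repeat the PG value of the previous row (the duplicates);
where y.rr is null returns the rows that start a new run (the "distinct" values).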

Just for completeness, here's the actual Tabibitosan method:
with sample_data as (select 1 slno, 'A1' name, 'NO' pg from dual union all
select 2 slno, 'A2' name, 'YES' pg from dual union all
select 3 slno, 'A3' name, 'NO' pg from dual union all
select 4 slno, 'A4' name, 'YES' pg from dual union all
select 6 slno, 'A5' name, 'YES' pg from dual union all
select 7 slno, 'A6' name, 'YES' pg from dual union all
select 8 slno, 'A7' name, 'YES' pg from dual union all
select 9 slno, 'A8' name, 'YES' pg from dual union all
select 10 slno, 'A9' name, 'YES' pg from dual union all
select 11 slno, 'A10' name, 'NO' pg from dual union all
select 12 slno, 'A11' name, 'YES' pg from dual union all
select 13 slno, 'A12' name, 'NO' pg from dual union all
select 14 slno, 'A14' name, 'NO' pg from dual)
-- end of mimicking a table called "sample_data" containing your data; see SQL below:
select max(cnt) max_pg_in_queue
from (select count(*) cnt
from (select slno,
name,
pg,
row_number() over (order by slno)
- row_number() over (partition by pg
order by slno) grp
from sample_data)
where pg = 'YES'
group by grp);
MAX_PG_IN_QUEUE
---------------
6
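To see why the difference of the two ROW_NUMBER values identifies a run, here is a small Python sketch of the same arithmetic. It is an illustration under the assumption that rows are ordered by SLNO; `tabibitosan_groups` is a made-up name.

```python
from collections import Counter, defaultdict

# (slno, pg) pairs from the question.
rows = [
    (1, "NO"), (2, "YES"), (3, "NO"), (4, "YES"), (6, "YES"), (7, "YES"),
    (8, "YES"), (9, "YES"), (10, "YES"), (11, "NO"), (12, "YES"),
    (13, "NO"), (14, "NO"),
]


def tabibitosan_groups(rows):
    """Label each row with (overall row number) - (row number within its PG value)."""
    seen = defaultdict(int)
    labelled = []
    for overall_rn, (slno, pg) in enumerate(sorted(rows), start=1):
        seen[pg] += 1  # row_number() over (partition by pg order by slno)
        # Within one run both counters advance together, so the difference stays constant.
        labelled.append((slno, pg, overall_rn - seen[pg]))
    return labelled


labelled = tabibitosan_groups(rows)
run_lengths = Counter((pg, grp) for _, pg, grp in labelled)
max_yes = max(n for (pg, _), n in run_lengths.items() if pg == "YES")
print(max_yes)  # 6
```

The label itself has no meaning; it only needs to be the same within a run and different between runs of the same PG value.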

SELECT MAX(consecutives) -- Block 1
FROM (
SELECT t1.pg, t1.slno, COUNT(*) AS consecutives -- Block 2
FROM test t1 INNER JOIN test t2 ON t1.pg = t2.pg
WHERE t1.slno <= t2.slno
AND NOT EXISTS (
SELECT * -- Block 3
FROM test t3
WHERE t3.slno > t1.slno
AND t3.slno < t2.slno
AND t3.pg != t1.pg
)
GROUP BY t1.pg, t1.slno
);
The query calculates the result as follows:
1. Extract all pairs of records with the same PG value that have no record with a different PG value between them (blocks 2 and 3).
2. Group them by PG value and starting SLNO value; this counts the consecutive values for every [PG, (starting) SLNO] pair (block 2).
3. Extract the maximum value from step 2 (block 1).
Note that the query could be simplified if the SLNO column contained consecutive values, but that does not seem to be your case (in your example the record with SLNO = 5 is missing).

Only requiring a single aggregation query and no joins (the rest of the calculation can be done with ROW_NUMBER, LAG and LAST_VALUE):
SELECT MAX( num_before_in_queue ) AS max_sequential_in_queue
FROM (
SELECT rn - LAST_VALUE( has_changed ) IGNORE NULLS OVER ( ORDER BY rn ) + 1
AS num_before_in_queue
FROM (
SELECT pg,
ROW_NUMBER() OVER ( ORDER BY slno ) AS rn,
-- has_changed holds the row number where a new run starts, otherwise NULL
CASE pg WHEN LAG( pg ) OVER ( ORDER BY slno )
THEN NULL
ELSE ROW_NUMBER() OVER ( ORDER BY slno )
END AS has_changed
FROM table_name
)
WHERE pg = 'YES'
);

Try to use row_number()
select
SLNO,
Name,
PG,
row_number() over (partition by PG order by SLNO) as "Consecutive"
from
<table>
order by
SLNO,
NAME,
PG
This should work with minor tweaking.
--EDIT--
Sorry, partition by PG.
The partitioning tells row_number when to start a new sequence.

Related

create date range from day based data

I have the following source data:
id date value
1 01.08.22 a
1 02.08.22 a
1 03.08.22 a
1 04.08.22 b
1 05.08.22 b
1 06.08.22 a
1 07.08.22 a
2 01.08.22 a
2 02.08.22 a
2 03.08.22 c
2 04.08.22 a
2 05.08.22 a
and I would like to have the following output:
id date_from date_until value
1 01.08.22 03.08.22 a
1 04.08.22 05.08.22 b
1 06.08.22 07.08.22 a
2 01.08.22 02.08.22 a
2 03.08.22 03.08.22 c
2 04.08.22 05.08.22 a
Is this possible with Oracle SQL? Which functions do I need for this?

Based on the link provided by @astentx, try this solution:
SELECT
id, MIN("date") AS date_from, MAX("date") AS date_until, MAX(value) AS value
FROM (
SELECT
t1.*,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY "date") -
ROW_NUMBER() OVER(PARTITION BY id, value ORDER BY "date") AS rn
FROM yourtable t1
)
GROUP BY id, rn
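For comparison, a minimal Python sketch that produces the requested date ranges from the question's data. Like the query above, it assumes one row per day and does not check for gaps between dates; `date_ranges` is a hypothetical helper name.

```python
from datetime import date
from itertools import groupby

# (id, date, value) rows from the question.
rows = [
    (1, date(2022, 8, 1), "a"), (1, date(2022, 8, 2), "a"), (1, date(2022, 8, 3), "a"),
    (1, date(2022, 8, 4), "b"), (1, date(2022, 8, 5), "b"),
    (1, date(2022, 8, 6), "a"), (1, date(2022, 8, 7), "a"),
    (2, date(2022, 8, 1), "a"), (2, date(2022, 8, 2), "a"),
    (2, date(2022, 8, 3), "c"),
    (2, date(2022, 8, 4), "a"), (2, date(2022, 8, 5), "a"),
]


def date_ranges(rows):
    """Collapse adjacent rows with the same (id, value) into (id, date_from, date_until, value)."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    result = []
    # groupby merges only adjacent rows, so a value that reappears later starts a new range.
    for (id_, value), group in groupby(ordered, key=lambda r: (r[0], r[2])):
        dates = [r[1] for r in group]
        result.append((id_, dates[0], dates[-1], value))
    return result


for row in date_ranges(rows):
    print(row)
```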

WITH CTE (id, dateD,valueD)
AS
(
SELECT 1, TO_DATE('01.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 1, TO_DATE('02.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 1, TO_DATE('03.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 1, TO_DATE('04.08.22','DD.MM.YY'), 'b' FROM DUAL UNION ALL
SELECT 1, TO_DATE('05.08.22','DD.MM.YY'), 'b' FROM DUAL UNION ALL
SELECT 2, TO_DATE('01.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 2, TO_DATE('02.08.22','DD.MM.YY'), 'a' FROM DUAL UNION ALL
SELECT 2, TO_DATE('03.08.22','DD.MM.YY'), 'c' FROM DUAL
)
-- Note: grouping by ID and value alone merges separate runs of the same value
-- (e.g. 01.08-03.08 and 06.08-07.08 for id 1); the sample above omits those rows.
SELECT C.ID, C.VALUED, MIN(C.DATED) AS MIN_DATE, MAX(C.DATED) AS MAX_DATE
FROM CTE C
GROUP BY C.ID, C.VALUED
ORDER BY C.ID
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=47c87d60445ce262cd371177e31d5d63

How to find the maximum occurrence of a string in Oracle SQL Developer

I have 2 columns in a table. The data looks like this:
Folio_no | Flag
1145 R
201 S
1145 FR
300 E
1145 R
201 E
201 S
Expected Output:
Folio_No | Flag
1145 R
201 S
300 E
The output should give the folio_no along with the flag which occurred the maximum number of times for that particular folio number.
I tried the query below, but it throws an error:
select folio_no, max(count(flag)) from table group by folio_no;

We can use an aggregation:
WITH cte AS (
SELECT Folio_No, Flag, COUNT(*) AS cnt
FROM yourTable
GROUP BY Folio_No, Flag
),
cte2 AS (
SELECT t.*, RANK() OVER (PARTITION BY Folio_No ORDER BY cnt DESC, Flag) rnk
FROM cte t
)
SELECT Folio_No, Flag
FROM cte2
WHERE rnk = 1;
Note that I assume that, should two flags within a given folio number be tied for the maximum frequency, you want to report the alphabetically earlier flag.

If you want the flag(s) that have the maximum occurrence for each folio then you can use:
SELECT Folio_No, Flag
FROM (
SELECT Folio_No,
Flag,
RANK() OVER (PARTITION BY Folio_No ORDER BY COUNT(*) DESC) AS rnk
FROM table_name
GROUP BY Folio_No, Flag
)
WHERE rnk = 1;
Which, for the sample data:
CREATE TABLE table_name (folio_no, flag) AS
SELECT 1145, 'R' FROM DUAL UNION ALL
SELECT 201, 'S' FROM DUAL UNION ALL
SELECT 1145, 'FR' FROM DUAL UNION ALL
SELECT 300, 'E' FROM DUAL UNION ALL
SELECT 1145, 'R' FROM DUAL UNION ALL
SELECT 201, 'E' FROM DUAL UNION ALL
SELECT 201, 'S' FROM DUAL UNION ALL
SELECT 201, 'S' FROM DUAL UNION ALL
SELECT 1, 'A' FROM DUAL UNION ALL
SELECT 1, 'A' FROM DUAL UNION ALL
SELECT 1, 'B' FROM DUAL UNION ALL
SELECT 1, 'B' FROM DUAL UNION ALL
SELECT 1, 'C' FROM DUAL UNION ALL
SELECT 1, 'D' FROM DUAL;
Outputs:
FOLIO_NO FLAG
1 A
1 B
201 S
300 E
1145 R
If you want only a single flag with the maximum occurrence for each folio and, if there are ties, the first flag alphabetically in each folio, then:
SELECT Folio_No, Flag
FROM (
SELECT Folio_No,
Flag,
ROW_NUMBER() OVER (PARTITION BY Folio_No ORDER BY COUNT(*) DESC, flag) AS rn
FROM table_name
GROUP BY Folio_No, Flag
)
WHERE rn = 1;
Which, for the sample data outputs:
FOLIO_NO FLAG
1 A
201 S
300 E
1145 R
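The difference between keeping ties (RANK) and keeping one flag per folio (ROW_NUMBER with the flag as tie-breaker) can be sketched in Python as follows. This is an illustration only; `top_flags` and its `keep_ties` parameter are hypothetical names.

```python
from collections import Counter, defaultdict

# (folio_no, flag) rows from the question.
rows = [
    (1145, "R"), (201, "S"), (1145, "FR"), (300, "E"),
    (1145, "R"), (201, "E"), (201, "S"),
]


def top_flags(rows, keep_ties=False):
    """Return the most frequent flag per folio; with keep_ties, a sorted list of all tied flags."""
    counts = defaultdict(Counter)
    for folio, flag in rows:
        counts[folio][flag] += 1
    result = {}
    for folio, flag_counts in counts.items():
        best = max(flag_counts.values())
        winners = sorted(flag for flag, n in flag_counts.items() if n == best)
        # keep_ties=True behaves like RANK() = 1, keep_ties=False like ROW_NUMBER() = 1.
        result[folio] = winners if keep_ties else winners[0]
    return result


print(top_flags(rows))  # {1145: 'R', 201: 'S', 300: 'E'}
```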

ORACLE SQL | If a column contains a value, then it will exclude a different value from the same column

I have this query that returns the data below it
select LISTAGG(d.DOCUMENT_TYPE_CD, ',') WITHIN GROUP (ORDER BY D.DOCUMENT_TYPE_CD) as value
from test_table d;
VALUE
---------
CI,ECI,POA
Now I'm trying to add a condition so that whenever the 'ECI' value is present, 'CI' is excluded from the result, like this:
VALUE
---------
ECI,POA
I tried using a CASE expression in the WHERE condition, but it raised an error:
select LISTAGG(d.DOCUMENT_TYPE_CD, ',')
WITHIN GROUP (ORDER BY D.DOCUMENT_TYPE_CD) as value
from test_table d
where CASE d.DOCUMENT_TYPE_CD
WHEN 'ECI' THEN d.DOCUMENT_TYPE_CD <> 'CI'
END;
ORA-00905: missing keyword
00905. 00000 - "missing keyword"
*Cause:
*Action:
Error at Line: 7 Column: 36
Is there any other way I could resolve this?

See if this helps; read comments within code.
with
test (id, document_type_cd) as
-- sample data
(select 1, 'ECI' from dual union all
select 1, 'CI' from dual union all
select 1, 'POA' from dual union all
--
select 2, 'CI' from dual union all
select 2, 'POA' from dual union all
--
select 3, 'XYZ' from dual union all
select 3, 'ABC' from dual
),
temp as
-- see whether CI and ECI exist per each ID
(select id,
sum(case when document_type_cd = 'CI' then 1 else 0 end) sum_ci,
sum(case when document_type_cd = 'ECI' then 1 else 0 end) sum_eci
from test
group by id
),
excl as
-- exclude CI rows if ECI exist for that ID
(select a.id,
a.document_type_cd
from test a join temp b on a.id = b.id
where a.document_type_cd <> case when b.sum_ci > 0 and b.sum_eci > 0 then 'CI'
else '-1'
end
)
-- finally:
select e.id,
listagg(e.document_type_cd, ',') within group (order by e.document_type_cd) result
from excl e
group by e.id;
ID RESULT
---------- --------------------
1 ECI,POA
2 CI,POA
3 ABC,XYZ

Something like this:
select LISTAGG(d.DOCUMENT_TYPE_CD, ',')
WITHIN GROUP (ORDER BY D.DOCUMENT_TYPE_CD) as value
from test_table d,
(select sum (case when DOCUMENT_TYPE_CD = 'ECI' then 1 else 0 end) C
from test_table) A
where d.DOCUMENT_TYPE_CD <> case when A.c > 0 then 'CI' when A.c = 0 then ' ' end;

You may identify the presence of both the values with two conditional aggregations in the same group by and then replace CI inside the result of listagg in one pass.
with a(id, cd) as (
select 1, 'ABC' from dual union all
select 1, 'ECI' from dual union all
select 1, 'CI' from dual union all
select 1, 'POA' from dual union all
select 2, 'XYZ' from dual union all
select 2, 'ECI' from dual union all
select 2, 'CI' from dual union all
select 2, 'POA' from dual union all
select 3, 'CI' from dual union all
select 3, 'POA' from dual union all
select 4, 'ABC' from dual union all
select 4, 'DEF' from dual
)
select
id,
ltrim(
/*Added comma in case CI will be at the beginning*/
replace(
',' || listagg(cd, ',') within group (order by cd asc),
decode(
/*If both are present, then replace CI. If not, then do not replace anything*/
max(decode(cd, 'CI', 1))*max(decode(cd, 'ECI', 1)),
1,
',CI,'
),
','
),
','
) as res
from a
group by id
ID | RES
-: | :----------
1 | ABC,ECI,POA
2 | ECI,POA,XYZ
3 | CI,POA
4 | ABC,DEF

Instead of using GROUP BY, you can also use window (aka analytic) functions to check the presence of ECI per group (test data shamelessly stolen from @littlefoot):
with
test (id, document_type_cd) as
-- sample data
(select 1, 'ECI' from dual union all
select 1, 'CI' from dual union all
select 1, 'POA' from dual union all
--
select 2, 'CI' from dual union all
select 2, 'POA' from dual union all
--
select 3, 'XYZ' from dual union all
select 3, 'ABC' from dual
),
temp as
(select id,
document_type_cd,
sum(case when document_type_cd = 'ECI' then 1 else 0 end) over (partition by id) as sum_eci
from test
)
select a.id,
listagg(a.document_type_cd, ',') within group (order by a.document_type_cd) result
from temp a
where a.document_type_cd != 'CI' or sum_eci = 0
group by a.id;
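The rule all of these answers implement (drop 'CI' from a group only when 'ECI' is present in the same group, then concatenate in sorted order) can be stated compactly in Python. This is a sketch for checking expected results; `aggregate_codes` is a hypothetical name.

```python
from collections import defaultdict

# (id, document_type_cd) rows from the sample data used above.
rows = [
    (1, "ECI"), (1, "CI"), (1, "POA"),
    (2, "CI"), (2, "POA"),
    (3, "XYZ"), (3, "ABC"),
]


def aggregate_codes(rows):
    """Return {id: comma-separated codes}, excluding 'CI' wherever 'ECI' exists for that id."""
    by_id = defaultdict(list)
    for id_, code in rows:
        by_id[id_].append(code)
    result = {}
    for id_, codes in by_id.items():
        if "ECI" in codes:
            codes = [code for code in codes if code != "CI"]
        # sorted + join plays the role of LISTAGG ... WITHIN GROUP (ORDER BY ...).
        result[id_] = ",".join(sorted(codes))
    return result


print(aggregate_codes(rows))  # {1: 'ECI,POA', 2: 'CI,POA', 3: 'ABC,XYZ'}
```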

Oracle SQL row concatenation by periods: maximum period

I have the below table:
LAUFD ID NEXDT ORDER_ROW
20140305 C1 20140310 14
20140226 C1 20140305 13
20131125 C1 20131126 12
20131021 C1 20131022 11
20130821 C1 20130828 10
20130814 C1 20130821 9
20130807 C1 20130814 8
20130731 C1 20130807 7
20130724 C1 20130731 6
20130710 C1 20130724 5
20130708 C1 20130709 4
20130624 C1 20130707 3
20130603 C1 20130608 2
20130527 C1 20130603 1
I would like to have the below output:
ID START END
C1 20140226 20140310
The logic is: with the rows of each ID ordered by ORDER_ROW, a row continues the current period if the LAUFD of the next row is equal to its NEXDT or at most 2 days later. If not, the period ends and an entry is generated in the output table with the start (earliest LAUFD) and end (latest NEXDT) of that period.
Basically, it's the same question as in Oracle SQL row concatenation by periods but I'd like just the latest period as an output.

Looks like this is what you need:
with t (LAUFD, ID, NEXDT, ORDER_ROW) as (
select 20140305,'C1', 20140310, 14 from dual union all
select 20140226,'C1', 20140305, 13 from dual union all
select 20131125,'C1', 20131126, 12 from dual union all
select 20131021,'C1', 20131022, 11 from dual union all
select 20130821,'C1', 20130828, 10 from dual union all
select 20130814,'C1', 20130821, 9 from dual union all
select 20130807,'C1', 20130814, 8 from dual union all
select 20130731,'C1', 20130807, 7 from dual union all
select 20130724,'C1', 20130731, 6 from dual union all
select 20130710,'C1', 20130724, 5 from dual union all
select 20130708,'C1', 20130709, 4 from dual union all
select 20130624,'C1', 20130707, 3 from dual union all
select 20130603,'C1', 20130608, 2 from dual union all
select 20130527,'C1', 20130603, 1 from dual
)
,t1 as (select id, order_row, to_date(laufd,'yyyymmdd') as laufd_dt, to_date(nexdt,'yyyymmdd') as nexdt_dt from t)
select *
from t1
match_recognize (
partition by id
order by order_row desc
measures
min(x.laufd_dt) as dt_start,
max(a.nexdt_dt) as dt_end,
x.laufd_dt-next(x.nexdt_dt) as dates_diff
one row per match
pattern(a x+ y* z*)
define
x as x.order_row=prev(order_row)-1 and prev(laufd_dt)-nexdt_dt<=3
,y as x.order_row=prev(order_row)-1
);

For just the latest period, you could use the previous solution. But instead, look for the last "break" and then only use the rows from that break onward:
select id, min(laufd), max(nexdt),
row_number() over (partition by id order by min(laufd)) as period
from (select t.*,
sum(case when prev_nexdt >= laufd - interval '2' day then 0 else 1 end) over
(partition by id order by order_row) as grp,
sum(case when prev_nexdt >= laufd - interval '2' day then 0 else 1 end) over (partition by id) as num_grps
from (select t.id, t.order_row, -- any other columns you need
to_date(laufd, 'YYYYMMDD') as laufd,
to_date(nexdt, 'YYYYMMDD') as nexdt,
lag(to_date(nexdt, 'YYYYMMDD')) over (partition by id order by order_row) as prev_nexdt
from t
) t
) t
where num_grps = grp
group by id;
This is basically the same logic. It just keeps the last group, whose running break count equals the total number of breaks.
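To check the expected result for the sample data, here is a minimal Python sketch that walks backwards from the newest row until it hits a break. It assumes all rows belong to one ID and a maximum gap of 2 days as stated in the question; `latest_period` and `to_date` are hypothetical helper names.

```python
from datetime import date, datetime, timedelta

# (laufd, id, nexdt, order_row) rows from the question.
rows = [
    (20140305, "C1", 20140310, 14), (20140226, "C1", 20140305, 13),
    (20131125, "C1", 20131126, 12), (20131021, "C1", 20131022, 11),
    (20130821, "C1", 20130828, 10), (20130814, "C1", 20130821, 9),
    (20130807, "C1", 20130814, 8), (20130731, "C1", 20130807, 7),
    (20130724, "C1", 20130731, 6), (20130710, "C1", 20130724, 5),
    (20130708, "C1", 20130709, 4), (20130624, "C1", 20130707, 3),
    (20130603, "C1", 20130608, 2), (20130527, "C1", 20130603, 1),
]


def to_date(yyyymmdd):
    """Convert a number such as 20140305 to a date."""
    return datetime.strptime(str(yyyymmdd), "%Y%m%d").date()


def latest_period(rows, max_gap_days=2):
    """Return (start, end) of the most recent period for a single ID."""
    ordered = sorted(rows, key=lambda r: r[3], reverse=True)
    newest = ordered[0]
    start, end = to_date(newest[0]), to_date(newest[2])
    newer_laufd = start
    for laufd, _id, nexdt, _order_row in ordered[1:]:
        # A break: the older row ends more than max_gap_days before the newer row starts.
        if newer_laufd - to_date(nexdt) > timedelta(days=max_gap_days):
            break
        newer_laufd = to_date(laufd)
        start = min(start, newer_laufd)
        end = max(end, to_date(nexdt))
    return start, end


print(latest_period(rows))  # (datetime.date(2014, 2, 26), datetime.date(2014, 3, 10))
```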

Get distinct rows based on priority?

I have a table as below. I am using Oracle 10g.
TableA
------
id status
---------------
1 R
1 S
1 W
2 R
I need to get distinct ids along with their status. If I query for distinct ids and their status, I get all 4 rows,
but I should get only 2, one per id.
Here id 1 has 3 distinct statuses, and I should get only one row for it, based on priority:
first priority goes to 'S', second priority to 'W' and third priority to 'R'.
In my case I should get two records as below.
id status
--------------
1 S
2 R
How can I do that? Please help me.
Thanks!

select
id,
max(status) keep (dense_rank first order by instr('SWR', status)) as status
from TableA
group by id
order by 1

select id , status from (
select TableA.*, ROW_NUMBER()
OVER (PARTITION BY TableA.id ORDER BY DECODE(
TableA.status,
'S',1,
'W',2,
'R',3,
4)) AS row_no
FROM TableA)
where row_no = 1

This is the first thing I would do, but there may be a better way.
Select id, case when status=3 then 'S'
when status=2 then 'W'
when status=1 then 'R' end as status
from(
select id, max(case when status='S' then 3
when status='W' then 2
when status='R' then 1
end) status
from tableA
group by id
);

To get it done you can write a similar query:
-- sample of data from your question
with t1(id , status) as (
select 1, 'R' from dual union all
select 1, 'S' from dual union all
select 1, 'W' from dual union all
select 2, 'R' from dual
)
select id -- actual query
, status
from ( select id
, status
, row_number() over(partition by id
order by case
when upper(status) = 'S'
then 1
when upper(status) = 'W'
then 2
when upper(status) = 'R'
then 3
end
) as rn
from t1
) q
where q.rn = 1
;
ID STATUS
---------- ------
1 S
2 R

select id,status from
(select id,status,decode(status,'S',1,'W',2,'R',3) st from TableA) where (id,st) in
(select id,min(st) from (select id,status,decode(status,'S',1,'W',2,'R',3) st from TableA) group by id)

Something like this???
with xx as(
select 1 id, 'R' status from dual UNION ALL
select 1, 'S' from dual UNION ALL
select 1, 'W' from dual UNION ALL
select 2, 'R' from dual
)
select
id,
DECODE(
MIN(
DECODE(status,'S',1,'W',2,'R',3)
),
1,'S',2,'W',3,'R') "status"
from xx
group by id;
ID s
---------- -
1 S
2 R
Here, the logic is quite simple:
use DECODE to assign a priority to each status, take the MIN (i.e. the one with the highest priority), and DECODE it back to get its status.
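The "map status to a priority, take the minimum, map back" idea looks like this in a minimal Python sketch. The `PRIORITY` mapping reflects the order given in the question (S, then W, then R); `pick_status` is a hypothetical name.

```python
# Lower number means higher priority, as in DECODE(status,'S',1,'W',2,'R',3).
PRIORITY = {"S": 1, "W": 2, "R": 3}

# (id, status) rows from the question.
rows = [(1, "R"), (1, "S"), (1, "W"), (2, "R")]


def pick_status(rows, priority=PRIORITY):
    """Return {id: status}, keeping the highest-priority status for each id."""
    best = {}
    for id_, status in rows:
        # Unknown statuses sort after all known ones (an assumption, not in the question).
        rank = priority.get(status, len(priority) + 1)
        if id_ not in best or rank < best[id_][0]:
            best[id_] = (rank, status)
    return {id_: status for id_, (_, status) in sorted(best.items())}


print(pick_status(rows))  # {1: 'S', 2: 'R'}
```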

Using MOD() example with added values:
SELECT id, val, distinct_val
FROM
(
SELECT id, val
, ROW_NUMBER() OVER (ORDER BY id) row_seq
, MOD(ROW_NUMBER() OVER (ORDER BY id), 2) even_row
, (CASE WHEN id = MOD(ROW_NUMBER() OVER (ORDER BY id), 2) THEN NULL ELSE val END) distinct_val
FROM
(
SELECT 1 id, 'R' val FROM dual
UNION
SELECT 1 id, 'S' val FROM dual
UNION
SELECT 1 id, 'W' val FROM dual
UNION
SELECT 2 id, 'R' val FROM dual
UNION -- comment below for orig data
SELECT 3 id, 'K' val FROM dual
UNION
SELECT 4 id, 'G' val FROM dual
UNION
SELECT 1 id, 'W' val FROM dual
))
WHERE distinct_val IS NOT NULL
/
ID VAL DISTINCT_VAL
--------------------------
1 S S
2 R R
3 K K
4 G G
