How to count each item in Oracle? - sql

Help, this is my query
SELECT distinct id, trim(regexp_substr(str, '[^,]+', 1, level)) str
FROM (SELECT id, REGEXP_REPLACE(to_char(advantages1), '"|\[|\]', '') str
FROM feedbacks) t
CONNECT BY instr(str, ',', 1, level - 1) > 0
order by str
and here's the result:
result
So, i want to make another column to count the result of each number in str. For example
STR TOTAL
110 1
111 2
112 2
113 4
114 1
How can i do that?

If I follow you correctly, you can use aggregation:
select trim(regexp_substr(str, '[^,]+', 1, level)) str , count(distinct id) total
from (select id, regexp_replace(to_char(advantages1), '"|\[|\]', '') str from feedbacks) t
connect by instr(str, ',', 1, level - 1) > 0
group by trim(regexp_substr(str, '[^,]+', 1, level))
order by str

If you just want to add a column to current SELECT-list, you'd need a COUNT() OVER() analytic function, but values of this column are repeated for each str value.
As looking at the expected result set, you want to distinctly count the result of the current query. The straightforward method is nesting this query :
SELECT str, COUNT(*) AS total
FROM ( <your_current_query_without_order_by_part> )
GROUP BY str
ORDER BY TO_NUMBER(str)

Related

Convert a sequence of 0s and 1s to a print-style page list

I need to convert a string of 0s and 1s into a sequence of integers representing the 1s, similar to a page selection sequence in a print dialog.
e.g. '0011001110101' -> '3-4,7-9,11,13'
Is it possible to do this in a single SQL select (in Oracle 11g)?
I can get an individual list of the page numbers with the following:
with data as (
select 'K1' KEY, '0011001110101' VAL from dual
union select 'K2', '0101000110' from dual
union select 'K3', '011100011010' from dual
)
select
KEY,
listagg(ords.column_value, ',') within group (
order by ords.column_value
) PAGES
from
data
cross join (
table(cast(multiset(
select level
from dual
connect by level <= length(VAL)
) as sys.OdciNumberList)) ords
)
where
substr(VAL, ords.column_value, 1) = '1'
group by
KEY
But that doesn't do the grouping (e.g. returns "3,4,7,8,9,11,13" for the first value).
If I could assign a group number every time the value changes then I could use analytic functions to get the min and max for each group. I.e. if I could generate the following then I'd be set:
Key Page Val Group
K1 1 0 1
K1 2 0 1
K1 3 1 2
K1 4 1 2
K1 5 0 3
K1 6 0 3
K1 7 1 4
K1 8 1 4
K1 9 1 4
K1 10 0 5
K1 11 1 6
K1 12 0 7
K1 13 1 8
But I'm stuck on that.
Anyone have any ideas, or another approach to get this?
first of all let's level it:
select regexp_instr('0011001110101', '1+', 1, LEVEL) istr,
regexp_substr('0011001110101', '1+', 1, LEVEL) strlen
FROM dual
CONNECT BY regexp_substr('0011001110101', '1+', 1, LEVEL) is not null
then the rest is easy with listagg :
with data as
(
select 'K1' KEY, '0011001110101' VAL from dual
union select 'K2', '0101000110' from dual
union select 'K3', '011100011010' from dual
)
SELECT key,
(SELECT listagg(CASE
WHEN length(regexp_substr(val, '1+', 1, LEVEL)) = 1 THEN
to_char(regexp_instr(val, '1+', 1, LEVEL))
ELSE
regexp_instr(val, '1+', 1, LEVEL) || '-' ||
to_char(regexp_instr(val, '1+', 1, LEVEL) +
length(regexp_substr(val, '1+', 1, LEVEL)) - 1)
END,
' ,') within GROUP(ORDER BY regexp_instr(val, '1+', 1, LEVEL))
from dual
CONNECT BY regexp_substr(data.val, '1+', 1, LEVEL) IS NOT NULL) val
FROM data
Using a recursive sub-query factoring clause without regular expressions:
Oracle Setup:
CREATE TABLE data ( key, val ) AS
SELECT 'K1', '0011001110101' FROM DUAL UNION ALL
SELECT 'K2', '0101000110' FROM DUAL UNION ALL
SELECT 'K3', '011100011010' FROM DUAL UNION ALL
SELECT 'K4', '000000000000' FROM DUAL UNION ALL
SELECT 'K5', '000000000001' FROM DUAL;
Query:
WITH ranges ( key, val, pos, rng ) AS (
SELECT key,
val,
INSTR( val, '1', 1 ), -- Position of the first 1
NULL
FROM data
UNION ALL
SELECT key,
val,
INSTR( val, '1', INSTR( val, '0', pos ) ), -- Position of the next 1
rng || ',' || CASE
WHEN pos = LENGTH( val ) -- Single 1 at end-of-string
OR pos = INSTR( val, '0', pos ) - 1 -- 1 immediately followed by 0
THEN TO_CHAR( pos )
WHEN INSTR( val, '0', pos ) = 0 -- Multiple 1s until end-of-string
THEN pos || '-' || LENGTH( val )
ELSE pos || '-' || ( INSTR( val, '0', pos ) - 1 ) -- Normal range
END
FROM ranges
WHERE pos > 0
)
SELECT KEY,
VAL,
SUBSTR( rng, 2 ) AS rng -- Strip the leading comma
FROM ranges
WHERE pos = 0 OR val IS NULL
ORDER BY KEY;
Output
KEY VAL RNG
--- ------------- -------------
K1 0011001110101 3-4,7-9,11,13
K2 0101000110 2,4,8-9
K3 011100011010 2-4,8-9,11
K4 000000000000
K5 000000000001 12
Here is a slightly more efficient version of Isalamon's solution (using a hierarchical query). It is slightly more efficient because I use a single hierarchical query instead of multiple ones (in correlated subqueries), and I calculate the length of each sequence of 1's just once, in the inner query. (In fact it is calculated only once anyway, but the function call itself has some overhead.)
This version also treats inputs like '00000' and NULL correctly. Isalamon's solution doesn't, and MT0's solution does not return a row when the input value is NULL. It is not clear if NULL is even possible in the input data, and if it is, what the desired result is; I assumed a row should be returned, with the page_list NULL as well.
Optimizer cost for this version is 17, vs. 18 for Isalamon's solution and 33 for MT0's. However, optimizer cost doesn't take into account the significantly slower processing of regular expressions compared to standard string functions; if speed of execution is important, MT0's solution should definitely be tried since it may prove faster.
with data ( key, val ) as (
select 'K1', '0011001110101' from dual union all
select 'K2', '0101000110' from dual union all
select 'K3', '011100011010' from dual union all
select 'K4', '000000000000' from dual union all
select 'K5', '000000000001' from dual union all
select 'K6', null from dual union all
select 'K7', '1111111' from dual union all
select 'K8', '1' from dual
)
-- End of test data (not part of the solution); SQL query begins below this line.
select key, val,
listagg(case when len = 1 then to_char(s_pos)
when len > 1 then to_char(s_pos) || '-' || to_char(s_pos + len - 1)
end, ',') within group (order by lvl) as page_list
from ( select key, level as lvl, val,
regexp_instr(val, '1+', 1, level) as s_pos,
length(regexp_substr(val, '1+', 1, level)) as len
from data
connect by regexp_substr(val, '1+', 1, level) is not null
and prior key = key
and prior sys_guid() is not null
)
group by key, val
order by key
;
Output:
KEY VAL PAGE_LIST
--- ------------- -------------
K1 0011001110101 3-4,7-9,11,13
K2 0101000110 2,4,8-9
K3 011100011010 2-4,8-9,11
K4 000000000000
K5 000000000001 12
K6
K7 1111111 1-7
K8 1 1

How to remove duplicates from space separated list by Oracle regexp_replace? [duplicate]

This question already has answers here:
How to remove duplicates from comma separated list by regexp_replace in Oracle?
(2 answers)
Closed 4 years ago.
I have a list called 'A B A A C D'. My expected result is 'A B C D'. So far from web I have found out
regexp_replace(l_user ,'([^,]+)(,[ ]*\1)+', '\1');
Expression. But this is for , separated list. What is the modification need to be done in order to make it space separated list. no need to consider the order.
If I understand well you don't simply need to replace ',' with a space, but also to remove duplicates in a smarter way.
If I modify that expression to work with space instead of ',', I get
select regexp_replace('A B A A C D' ,'([^ ]+)( [ ]*\1)+', '\1') from dual
which gives 'A B A C D', not what you need.
A way to get your needed result could be the following, a bit more complicated:
with string(s) as ( select 'A B A A C D' from dual)
select listagg(case when rn = 1 then str end, ' ') within group (order by lev)
from (
select str, row_number() over (partition by str order by 1) rn, lev
from (
SELECT trim(regexp_substr(s, '[^ ]+', 1, level)) str,
level as lev
FROM string
CONNECT BY instr(s, ' ', 1, level - 1) > 0
)
)
My main problem here is that I'm not able to build a regexp that checks for non adjacent duplicates, so I need to split the string, check for duplicates and then aggregate again the non duplicated values, keeping the order.
If you don't mind the order of the tokens in the result string, this can be simplified:
with string(s) as ( select 'A B A A C D' from dual)
select listagg(str, ' ') within group (order by 1)
from (
SELECT distinct trim(regexp_substr(s, '[^ ]+', 1, level)) as str
FROM string
CONNECT BY instr(s, ' ', 1, level - 1) > 0
)
Assuming you want to keep the component strings in the order of their first occurrence (and not, say, reorder them alphabetically - your example is poorly chosen in this regard, because both lead to the same result), the problem is more complicated, because you must keep track of order too. Then for each letter you must keep just the first occurrence - here is where row_number() helps.
with
inputs ( str ) as ( select 'A B A A C D' from dual)
-- end test data; solution begins below this line
select listagg(token, ' ') within group (order by id) as new_str
from (
select level as id, regexp_substr(str, '[^ ]+', 1, level) as token,
row_number() over (
partition by regexp_substr(str, '[^ ]+', 1, level)
order by level ) as rn
from inputs
connect by regexp_substr(str, '[^ ]+', 1, level) is not null
)
where rn = 1
;
Xquery?
select xmlquery('string-join(distinct-values(ora:tokenize(.," ")), " ")' passing 'A B A A C D' returning content) result from dual

Remove duplicate values from comma separated string in Oracle

I need your help with the regexp_replace function. I have a table which has a column for concatenated string values which contain duplicates. How do I eliminate them?
Example:
Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha
I need the output to be
Ian,Beatty,Larry,Neesha
The duplicates are random and not in any particular order.
Update--
Here's how my table looks
ID Name1 Name2 Name3
1 a b c
1 c d a
2 d e a
2 c d b
I need one row per ID having distinct name1,name2,name3 in one row as a comma separated string.
ID Name
1 a,c,b,d,c
2 d,c,e,a,b
I have tried using listagg with distinct but I'm not able to remove the duplicates.
The easiest option I would go with -
SELECT ID, LISTAGG(NAME_LIST, ',')
FROM (SELECT ID, NAME1 NAME_LIST FROM DATA UNION
SELECT ID, NAME2 FROM DATA UNION
SELECT ID, NAME3 FROM DATA
)
GROUP BY ID;
Demo.
So, try this out...
([^,]+),(?=.*[A-Za-z],[] ]*\1)
I don't think you can do it just with regexp_replace if the repeated values are not next to each other. One approach is to split the values up, eliminate the duplicates, and then put them back together.
The common method to tokenize a delimited string is with regexp_substr and a connect by clause. Using a bind variable with your string to make the code a bit clearer:
var value varchar2(100);
exec :value := 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha';
select regexp_substr(:value, '[^,]+', 1, level) as value
from dual
connect by regexp_substr(:value, '[^,]+', 1, level) is not null;
VALUE
------------------------------
Ian
Beatty
Larry
Neesha
Beatty
Neesha
Ian
Neesha
You can use that as a subquery (or CTE), get the distinct values from it, then reassemble it with listagg:
select listagg(value, ',') within group (order by value) as value
from (
select distinct value from (
select regexp_substr(:value, '[^,]+', 1, level) as value
from dual
connect by regexp_substr(:value, '[^,]+', 1, level) is not null
)
);
VALUE
------------------------------
Beatty,Ian,Larry,Neesha
It's a bit more complicated if you're looking at multiple rows in a table as that confused the connect-by syntax, but you can use a non-determinisitic reference to avoid loops:
with t42 (id, value) as (
select 1, 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha' from dual
union all select 2, 'Mary,Joe,Mary,Frank,Joe' from dual
)
select id, listagg(value, ',') within group (order by value) as value
from (
select distinct id, value from (
select id, regexp_substr(value, '[^,]+', 1, level) as value
from t42
connect by regexp_substr(value, '[^,]+', 1, level) is not null
and id = prior id
and prior dbms_random.value is not null
)
)
group by id;
ID VALUE
---------- ------------------------------
1 Beatty,Ian,Larry,Neesha
2 Frank,Joe,Mary
Of course this wouldn't be necessary if you were storing relational data properly; having a delimited string in a column is not a good idea.
There is a way to find duplicates in this case, but it is a problem to remove them if there are more than one duplicated name within a string per id. Here is code that can deal with one duplicate per id.
Sample data:
WITH
tbl AS
(
Select 1 "ID", 'a' "NAME_1", 'b' "NAME_2", 'c' "NAME_3" From Dual Union All
Select 1 "ID", 'c' "NAME_1", 'd' "NAME_2", 'a' "NAME_3" From Dual Union All
Select 2 "ID", 'd' "NAME_1", 'e' "NAME_2", 'a' "NAME_3" From Dual Union All
Select 2 "ID", 'c' "NAME_1", 'd' "NAME_2", 'b' "NAME_3" From Dual
),
lists AS
(
Select 1 "ID", 'a,c,b,d,c' "NAME" From Dual Union All
Select 2 "ID", 'd,c,e,a,b' "NAME" From Dual
),
Creating CTE that compares your LISTAGG sttring with original data finding duplicate values:
grid AS
(
Select DISTINCT l.ID, l.NAME,
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_1 || ',', '')) ) / Length(t.NAME_1 || ',') > 1 THEN NAME_1 END "NAME_1",
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_2 || ',', '')) ) / Length(t.NAME_2 || ',') > 1 THEN NAME_2 END "NAME_2",
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_3 || ',', '')) ) / Length(t.NAME_3 || ',') > 1 THEN NAME_3 END "NAME_3"
From
lists l
Inner Join
tbl t ON(t.ID = l.ID)
)
ID NAME NAME_1 NAME_2 NAME_3
---------- --------- ------ ------ ------
2 d,c,e,a,b
1 a,c,b,d,c c
1 a,c,b,d,c c
Main SQL, using Union, builds new string (removing second appearance) where the duplicate was found and then puts that new string after comparison with the old one.
SELECT DISTINCT l.ID, Nvl(g.NAME, l.NAME) NAME
FROM
lists l
LEFT JOIN
(
SELECT ID, CASE WHEN NAME_1 Is Not Null
THEN REPLACE(NAME, NAME, COALESCE( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_1, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_1, 1, 2) + Length(NAME_1)), ',,', ','), NULL ) )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
UNION ALL
SELECT ID, CASE WHEN NAME_2 Is Not Null
THEN REPLACE(NAME, NAME, COALESCE( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_2, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_2, 1, 2) + Length(NAME_2)), ',,', ','), NULL ) )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
UNION ALL
SELECT ID, CASE WHEN NAME_3 Is Not Null
THEN REPLACE(NAME, NAME, COALESCE( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_3, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_3, 1, 2) + Length(NAME_3)), ',,', ','), NULL ) )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
) g ON(g.ID = l.ID And Length(g.NAME) < Length(l.NAME))
R e s u l t :
ID NAME
---------- -------------
2 d,c,e,a,b
1 a,c,b,d
For multiple occurences within a string or for multiplicated different names there should be done some recursions or multiplied nestings to get it done...

Sum the numbers in a string in Oracle

Below is the interview question, can some please help me resolve it?
select 'a1b2c3d4e5f6g7' from dual;
Output is sum of given integer number(1+2+3+4+5+6+7)=28.
Any help?
Use a Regex to keep only the numbers,then connect by to add each number
With T
as (select regexp_replace('a1b2c3d4e5f6g7', '[A-Za-z]') as col from dual)
select sum(val)
From
(
select substr(col,level,1) val from t connect by level <= length(col)
)
FIDDLE
Since it is only 1 digit numbers you can use SUBSTR() to extract every other character:
SQL Fiddle
Oracle 11g R2 Schema Setup:
Query 1:
WITH data ( value ) AS (
select 'a1b2c3d4e5f6g7' from dual
)
SELECT SUM( TO_NUMBER( SUBSTR( value, 2*LEVEL, 1 ) ) ) AS total
FROM data
CONNECT BY 2 * LEVEL <= LENGTH( value )
Results:
| TOTAL |
|-------|
| 28 |
However, if you have two digit numbers then you can do:
Query 2:
WITH data ( value ) AS (
select 'a1b2c3d4e5f6g7h8i9j10' from dual
)
SELECT SUM( TO_NUMBER( REGEXP_SUBSTR( value, '\d+', 1, LEVEL ) ) ) AS total
FROM data
CONNECT BY LEVEL <= REGEXP_COUNT( value, '\d+' )
Results:
| TOTAL |
|-------|
| 55 |
You can use regexp_substr to extract exactly the numbers, then just sum them:
with t as (select 'a1b2c3d4e5f6g7' expr from dual)
select sum(regexp_substr(t.expr, '[0-9]+',1, level)) as col
from dual
connect by level < regexp_instr(t.expr, '[0-9]+',1, level);
example:
select sum(regexp_substr('a1b2c3d4e5f6g7r22g4', '[0-9]+',1, level)) as col
from dual
connect by level < regexp_instr('a1b2c3d4e5f6g7r22g4', '[0-9]+',1, level);
Result:
54
This solution works with numbers with more than 1 digit and it doesn't matter how many characters are between the numbers:
with t as (select 'a1b2c3d4e5f6g7' as str from dual)
select sum(to_number(regexp_substr(str,'[0-9]+',1,level)))
from t
connect by regexp_substr(str,'[0-9]+',1,level) is not null

Extract words from a comma separated string in oracle

Suppose I have string
Str = 'Aaa,Bbb,Abb,Ccc'
I want to separate the above str in two parts as follows
Str1 = 'Aaa,Abb'
Str2 = 'Bbb,Ccc'
That is any word in str starting with A should go in str1 rest all in str2.
How can I achieve this using Oracle queries?
That is any word in str starting with A should go in str1 rest all in str2.
To achieve it in pure SQL, I will use the following:
REGEXP_SUBSTR
LISTAGG
SUBSTR
INLINE VIEW
So, first I will split the comma delimited string using the techniques as demonstrated here Split single comma delimited string into rows.
And then, I will aggregate them using LISTAGG in an order.
For example,
SQL> WITH
2 t1 AS (
3 SELECT 'Aaa,Bbb,Abb,Ccc' str FROM dual
4 ),
5 t2 AS (
6 SELECT trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
7 FROM t1
8 CONNECT BY LEVEL <= regexp_count(str, ',')+1
9 ORDER BY str
10 )
11 SELECT
12 (SELECT listagg(str, ',') WITHIN GROUP(
13 ORDER BY NULL) str1
14 FROM t2
15 WHERE SUBSTR(str, 1, 1)='A'
16 ) str1,
17 (SELECT listagg(str, ',') WITHIN GROUP(
18 ORDER BY NULL) str
19 FROM t2
20 WHERE SUBSTR(str, 1, 1)<>'A'
21 ) str2
22 FROM dual
23 /
STR1 STR2
---------- ----------
Aaa,Abb Bbb,Ccc
SQL>
The WITH clause is just for demonstration purpose, in your real scenario, remove the with clause and use you table name directly. Though it looks neat using the WITH clause.
Use regext expression and ListAg function.
NOTE: LISTAGG function is available since Oracle 11g!
select listagg(s.name, ',') within group (order by name)
from (select regexp_substr('Aaa,Bbb,Abb,Ccc,Add,Ddd','[^,]+', 1, level) name from dual
connect by regexp_substr('Aaa,Bbb,Abb,Ccc,Add,Ddd', '[^,]+', 1, level) is not null) s
group by decode(substr(name,1,1),'A', 1, 0);
This query gives you the desired output in two different rows:
with temp as (select trim (both ',' from 'Aaa,Bbb,Abb,Ccc') as str from dual),
base_table as
( select trim (regexp_substr (t.str,
'[^' || ',' || ']+',
1,
level))
str
from temp t
connect by instr (str,
',',
1,
level - 1) > 0),
ult_table as
(select str,
case upper (substr (str, 1, 1)) when 'A' then 1 else 2 end
as l
from base_table)
select listagg (case when l = 1 then str else null end, ',')
within group (order by str)
str1,
listagg (case when l = 2 then str else null end, ',')
within group (order by str)
str2
from ult_table;
Output
L STR
---------- --------------------------------------------------------------------------------
1 Aaa,Abb
2 Bbb,Ccc