Remove duplicate values from comma separated string in Oracle - sql

I need your help with the regexp_replace function. I have a table which has a column for concatenated string values which contain duplicates. How do I eliminate them?
Example:
Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha
I need the output to be
Ian,Beatty,Larry,Neesha
The duplicates are random and not in any particular order.
Update--
Here's how my table looks
ID Name1 Name2 Name3
1 a b c
1 c d a
2 d e a
2 c d b
I need one row per ID having distinct name1,name2,name3 in one row as a comma separated string.
ID Name
1 a,c,b,d,c
2 d,c,e,a,b
I have tried using listagg with distinct but I'm not able to remove the duplicates.

The easiest option I would go with -
SELECT ID, LISTAGG(NAME_LIST, ',')
FROM (SELECT ID, NAME1 NAME_LIST FROM DATA UNION
SELECT ID, NAME2 FROM DATA UNION
SELECT ID, NAME3 FROM DATA
)
GROUP BY ID;
Demo.

So, try this out...
([^,]+),(?=.*[A-Za-z],[] ]*\1)

I don't think you can do it just with regexp_replace if the repeated values are not next to each other. One approach is to split the values up, eliminate the duplicates, and then put them back together.
The common method to tokenize a delimited string is with regexp_substr and a connect by clause. Using a bind variable with your string to make the code a bit clearer:
var value varchar2(100);
exec :value := 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha';
select regexp_substr(:value, '[^,]+', 1, level) as value
from dual
connect by regexp_substr(:value, '[^,]+', 1, level) is not null;
VALUE
------------------------------
Ian
Beatty
Larry
Neesha
Beatty
Neesha
Ian
Neesha
You can use that as a subquery (or CTE), get the distinct values from it, then reassemble it with listagg:
select listagg(value, ',') within group (order by value) as value
from (
select distinct value from (
select regexp_substr(:value, '[^,]+', 1, level) as value
from dual
connect by regexp_substr(:value, '[^,]+', 1, level) is not null
)
);
VALUE
------------------------------
Beatty,Ian,Larry,Neesha
It's a bit more complicated if you're looking at multiple rows in a table as that confused the connect-by syntax, but you can use a non-determinisitic reference to avoid loops:
with t42 (id, value) as (
select 1, 'Ian,Beatty,Larry,Neesha,Beatty,Neesha,Ian,Neesha' from dual
union all select 2, 'Mary,Joe,Mary,Frank,Joe' from dual
)
select id, listagg(value, ',') within group (order by value) as value
from (
select distinct id, value from (
select id, regexp_substr(value, '[^,]+', 1, level) as value
from t42
connect by regexp_substr(value, '[^,]+', 1, level) is not null
and id = prior id
and prior dbms_random.value is not null
)
)
group by id;
ID VALUE
---------- ------------------------------
1 Beatty,Ian,Larry,Neesha
2 Frank,Joe,Mary
Of course this wouldn't be necessary if you were storing relational data properly; having a delimited string in a column is not a good idea.

There is a way to find duplicates in this case, but it is a problem to remove them if there are more than one duplicated name within a string per id. Here is code that can deal with one duplicate per id.
Sample data:
WITH
tbl AS
(
Select 1 "ID", 'a' "NAME_1", 'b' "NAME_2", 'c' "NAME_3" From Dual Union All
Select 1 "ID", 'c' "NAME_1", 'd' "NAME_2", 'a' "NAME_3" From Dual Union All
Select 2 "ID", 'd' "NAME_1", 'e' "NAME_2", 'a' "NAME_3" From Dual Union All
Select 2 "ID", 'c' "NAME_1", 'd' "NAME_2", 'b' "NAME_3" From Dual
),
lists AS
(
Select 1 "ID", 'a,c,b,d,c' "NAME" From Dual Union All
Select 2 "ID", 'd,c,e,a,b' "NAME" From Dual
),
Creating CTE that compares your LISTAGG sttring with original data finding duplicate values:
grid AS
(
Select DISTINCT l.ID, l.NAME,
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_1 || ',', '')) ) / Length(t.NAME_1 || ',') > 1 THEN NAME_1 END "NAME_1",
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_2 || ',', '')) ) / Length(t.NAME_2 || ',') > 1 THEN NAME_2 END "NAME_2",
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_3 || ',', '')) ) / Length(t.NAME_3 || ',') > 1 THEN NAME_3 END "NAME_3"
From
lists l
Inner Join
tbl t ON(t.ID = l.ID)
)
ID NAME NAME_1 NAME_2 NAME_3
---------- --------- ------ ------ ------
2 d,c,e,a,b
1 a,c,b,d,c c
1 a,c,b,d,c c
Main SQL, using Union, builds new string (removing second appearance) where the duplicate was found and then puts that new string after comparison with the old one.
SELECT DISTINCT l.ID, Nvl(g.NAME, l.NAME) NAME
FROM
lists l
LEFT JOIN
(
SELECT ID, CASE WHEN NAME_1 Is Not Null
THEN REPLACE(NAME, NAME, COALESCE( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_1, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_1, 1, 2) + Length(NAME_1)), ',,', ','), NULL ) )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
UNION ALL
SELECT ID, CASE WHEN NAME_2 Is Not Null
THEN REPLACE(NAME, NAME, COALESCE( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_2, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_2, 1, 2) + Length(NAME_2)), ',,', ','), NULL ) )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
UNION ALL
SELECT ID, CASE WHEN NAME_3 Is Not Null
THEN REPLACE(NAME, NAME, COALESCE( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_3, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_3, 1, 2) + Length(NAME_3)), ',,', ','), NULL ) )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
) g ON(g.ID = l.ID And Length(g.NAME) < Length(l.NAME))
R e s u l t :
ID NAME
---------- -------------
2 d,c,e,a,b
1 a,c,b,d
For multiple occurences within a string or for multiplicated different names there should be done some recursions or multiplied nestings to get it done...

Related

Is it possible to replace values in row to values from the same table by the reference in Oracle SQL

There is a table that contains values that are used in formulas. There are simple variables, that do not contain any expression, and also there are some variables that combined from simple variables into formula. I need to figure out if is it possible to do a SELECT query to get a readable formula based on aliases it contains. Each of these aliases could be used in other formulas.
Let's say that there are two tables:
ITEM TABLE
ID
Name
FORMULA_ID
1
Item name 1
f_3
2
Item name 2
f_26
FORMULA TABLE
ID
EXPRESSION
ALIASE
NAME
f_1
null
var_100
Ticket
f_2
null
var_200
Amount
f_3
var_100 * var_200
var_300
Some description
So is there any chance to query, with result like:
ITEM_NAME
READABLE_EXPRESSION
Item name 1
Ticket * Amount
Try this:
with items(ID,Name,Formula_Id) AS (
select 1, 'Item name 1', 'f_3' from dual union all
select 2, 'Item name 2', 'f_26' from dual
),
formulas (ID, EXPRESSION, ALIAS, NAME) as (
select 'f_1', null, 'var_100', 'Ticket' from dual union all
select 'f_2', null, 'var_200', 'Amount' from dual union all
select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' from dual
),
rnformulas (id, EXPRESSION, ALIAS, NAME, rn) as (
select fm.*, row_number() over(order by id) as rn from formulas fm
),
recsubstitute( lvl, item_id, rn, expression ) as (
select 1, it.id, 0, fm.expression
from items it
join rnformulas fm on it.formula_id = fm.id
union all
select lvl+1, item_id, fm.rn, replace(r.expression, fm.alias, fm.name)
from recsubstitute r
join rnformulas fm on instr(r.expression, fm.alias) > 0 and fm.rn > r.rn
)
select item_id, expression from (
select item_id, expression, row_number() over(partition by item_id order by lvl desc, rn asc) as rn
from recsubstitute
)
where rn = 1
;
ITEM_ID EXPRESSION
---------- ------------------------------------------------------------
1 Ticket * Amount
Note that it's far to be bullet proof against all situations, especially recursion in the aliases.
Some improvement with another set of data:
with items(ID,Name,Formula_Id) AS (
select 1, 'Item name 1', 'f_3' from dual union all
select 2, 'Item name 2', 'f_4' from dual
),
formulas (ID, EXPRESSION, ALIAS, NAME) as (
select 'f_1', null, 'var_100', 'Ticket' from dual union all
select 'f_2', null, 'var_200', 'Amount' from dual union all
select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' from dual union all
select 'f_4', 'var_300', null, 'Other description' from dual
),
rnformulas (id, EXPRESSION, ALIAS, NAME, rn) as (
select fm.*, row_number() over(order by id) as rn from formulas fm
),
recsubstitute( lvl, item_id, rn, expression ) as (
select 1, it.id, 0, fm.expression
from items it
join rnformulas fm on it.formula_id = fm.id
union all
select lvl+1, item_id, fm.rn, replace(r.expression, fm.alias, nvl(fm.expression,fm.name))
from recsubstitute r
join rnformulas fm on instr(r.expression, fm.alias) > 0
)
select item_id, expression from (
select item_id, expression, row_number() over(partition by item_id order by lvl desc, rn asc) as rn
from recsubstitute
)
where rn = 1
;
1 Ticket * Amount
2 Ticket * Amount
Split the space-delimited formulas into rows. Join the expression parts to the aliases and replace the alias with the name. Join this to the item_table using LISTAGG to concatenate the rows back into a single column.
WITH formula_split AS (
SELECT DISTINCT ft.id
,level lvl
,regexp_substr(ft.expression,'[^ ]+',1,level) expression_part
FROM formula_table ft
CONNECT BY ( ft.id = ft.id
AND level <= length(ft.expression) - length(replace(ft.expression,' ')) + 1 ) START WITH ft.expression IS NOT NULL
),readable_tbl AS (
SELECT ft.id
,ft.lvl
,replace(ft.expression_part,ftn1.aliase,ftn1.name) readable_expression
FROM formula_split ft
LEFT JOIN formula_table ftn1 ON ( ft.expression_part = ftn1.aliase )
)
SELECT it.name item_name
,LISTAGG(readable_expression,' ') WITHIN GROUP(ORDER BY lvl) readable_expression
FROM item_table it
JOIN readable_tbl rt ON ( it.formula_id = rt.id )
GROUP BY it.name
With sample data create CTE (calc_data) for modeling
WITH
items (ITEM_ID, ITEM_NAME, FORMULA_ID) AS
(
Select 1, 'Item name 1', 'f_3' From Dual Union All
Select 2, 'Item name 2', 'f_26' From Dual
),
formulas (FORMULA_ID, EXPRESSION, ALIAS, ELEMENT_NAME) AS
(
Select 'f_1', null, 'var_100', 'Ticket' From Dual Union All
Select 'f_2', null, 'var_200', 'Amount' From Dual Union All
Select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' From Dual
),
calc_data AS
( SELECT e.ITEM_NAME, e.FORMULA_ID, e.FORMULA, e.X, e.OPERAND, e.Y,
ROW_NUMBER() OVER(Partition By e.ITEM_NAME Order By e.FORMULA_ID) "RN", f.ELEMENT_NAME
FROM( Select CAST('.' as VARCHAR2(32)) "FORMULA", i.ITEM_NAME, f.FORMULA_ID,
SubStr(Replace(f.EXPRESSION, ' ', ''), 1, InStr(Replace(f.EXPRESSION, ' ', ''), '*') - 1) "X",
CASE
WHEN InStr(f.EXPRESSION, '+') > 0 THEN '+'
WHEN InStr(f.EXPRESSION, '-') > 0 THEN '-'
WHEN InStr(f.EXPRESSION, '*') > 0 THEN '*'
WHEN InStr(f.EXPRESSION, '/') > 0 THEN '/'
END "OPERAND",
--
SubStr(Replace(f.EXPRESSION, ' ', ''), InStr(Replace(f.EXPRESSION, ' ', ''), '*') + 1) "Y"
From formulas f
Inner Join items i ON(f.FORMULA_ID = i.FORMULA_ID)
) e
Inner Join formulas f ON(f.FORMULA_ID <> e.FORMULA_ID)
)
Main SQL with MODEL clause
SELECT ITEM_NAME, FORMULA
FROM ( SELECT *
FROM calc_data
MODEL
PARTITION BY (ITEM_NAME)
DIMENSION BY (RN)
MEASURES (X, OPERAND, Y, FORMULA, ELEMENT_NAME)
RULES ( FORMULA[1] = ELEMENT_NAME[1] || ' ' || OPERAND[1] || ' ' || ELEMENT_NAME[2] )
)
WHERE RN = 1
R e s u l t :
ITEM_NAME
FORMULA
Item name 1
Amount * Ticket
Just as an option...
The same result without any analytic functions, pseudo columns, unions, etc... - just selecting over and over and over. Not readable, though...
Select
i.ITEM_NAME,
REPLACE( REPLACE( (Select EXPRESSION From formulas Where FORMULA_ID = f.FORMULA_ID),
(Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID),
(Select ELEMENT_NAME From formulas Where FORMULA_ID <> f.FORMULA_ID And ALIAS = (Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) )
) ||
REPLACE( (Select EXPRESSION From formulas Where FORMULA_ID = f.FORMULA_ID),
(Select Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID),
(Select ELEMENT_NAME From formulas Where FORMULA_ID <> f.FORMULA_ID And ALIAS = (Select Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) )
),
(SELECT Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID ) || (Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) ||
SubStr(f.EXPRESSION, InStr(f.EXPRESSION, ' ', 1, 1), (InStr(f.EXPRESSION, ' ', 1, 2) - InStr(f.EXPRESSION, ' ', 1, 1)) + 1 ), ''
) "FORMULA"
From
formulas f
Left Join
items i ON(i.FORMULA_ID = f.FORMULA_ID)
Where i.ITEM_NAME Is Not Null
Thank you all for your answers!
I've decided to create a pl/sql function, just to modify a formula to readable row. So the function just looks for variables using regex, and uses indexes to replace every variable with a name.
CREATE OR REPLACE FUNCTION READABLE_EXPRESSION(inExpression IN VARCHAR2)
RETURN VARCHAR2
IS
matchesCount INTEGER;
toReplace VARCHAR2(32767);
readableExpression VARCHAR2(32767);
selectString VARCHAR2(32767);
BEGIN
matchesCount := REGEXP_COUNT(inExpression, '(var_)(.*?)');
IF matchesCount IS NOT NULL AND matchesCount > 0 THEN
readableExpression := inExpression;
FOR i in 1..matchesCount
LOOP
toReplace := substr(inExpression, REGEXP_INSTR(inExpression, '(var_)(.*?)', 1, i, 0),
REGEXP_INSTR(inExpression, '(var_)(.*?)', 1, i, 1) -
REGEXP_INSTR(inExpression, '(var_)(.*?)', 1, i, 0)
);
SELECT DISTINCT F.NAME
INTO selectString
FROM FORMULA F
WHERE F.ALIASE = toReplace FETCH FIRST 1 ROW ONLY;
readableExpression := REPLACE(readableExpression,
toReplace,
selectString
);
end loop;
end if;
return readableExpression;
END;
So such function returns 1 result row with replaced values for 1 input row with FORMULA. All you need to do is join the ITEM and FORMULA tables in the SELECT.
SELECT item.name, READABLE_EXPRESSION(formula.expression)
FROM item
JOIN formula ON item.formula_id = formula.id;
Please note that the tables are fictitious so as not to reveal the actual data structure, so there might be some inaccuracies. But the general idea should be clear.

Update based on the comma seperated value

I have the below Table with two columns both columns are VARCHAR2(100).
PARAM_NAME PARAM_VALUE
PlanName,DemandMonth EUMOCP,01-2022
PlanName,DemandMonth EUMOCP,02-2022
PlanName,DemandMonth EUMOCP,03-2022
PlanName,DemandMonth EUMOCP,04-2021
How can we write a update on the table so that it only updates the corresponding value.
For example:
Update DemandMonth from 01-2022 to 04-2022.
Provided it only updates the columns based on the first column
For instance,
Column A Column B
1,2 3,4
based on 1 we can update 3 as it is before ',' similarly based on 2 we can update 4.
What we want to achieve is the first it identifies where is 'DemandMonth' and then accordingly update the second column. Also if possible can we write it for 4 or 5 comma seperated values?
Don't store values in delimited strings.
Change your table so the values are:
CREATE TABLE params ( id, param_name, param_value ) AS
SELECT 1, 'PlanName', 'EUMOCP' FROM DUAL UNION ALL
SELECT 1, 'DemandMonth', '01-2022' FROM DUAL UNION ALL
SELECT 2, 'PlanName', 'EUMOCP' FROM DUAL UNION ALL
SELECT 2, 'DemandMonth', '02-2022' FROM DUAL UNION ALL
SELECT 3, 'PlanName', 'EUMOCP' FROM DUAL UNION ALL
SELECT 3, 'DemandMonth', '03-2022' FROM DUAL UNION ALL
SELECT 4, 'PlanName', 'EUMOCP' FROM DUAL UNION ALL
SELECT 4, 'DemandMonth', '04-2021' FROM DUAL;
Then all you need to do to update the value is:
UPDATE params
SET param_value = '04-2022'
WHERE param_name = 'DemandMonth'
AND param_value = '01-2022';
There is no worrying about where in the delimited string the value is and it is all simple.
You should not do this and should refactor your table to not use delimited strings... however, you can use:
MERGE INTO params dst
USING (
WITH items ( rid, param_names, param_values, name, value, lvl, max_lvl ) AS (
SELECT ROWID,
param_name,
param_value,
REGEXP_SUBSTR( param_name, '[^,]+', 1, 1 ),
REGEXP_SUBSTR( param_value, '[^,]+', 1, 1 ),
1,
REGEXP_COUNT( param_value, '[^,]+' )
FROM params
UNION ALL
SELECT rid,
param_names,
param_values,
REGEXP_SUBSTR( param_names, '[^,]+', 1, lvl + 1 ),
REGEXP_SUBSTR( param_values, '[^,]+', 1, lvl + 1 ),
lvl + 1,
max_lvl
FROM items
WHERE lvl < max_lvl
)
SELECT rid,
LISTAGG(
CASE
WHEN name = 'DemandMonth' AND value = '01-2022'
THEN '04-2022'
ELSE value
END,
','
) WITHIN GROUP ( ORDER BY lvl ) AS param_value
FROM items
GROUP BY rid
HAVING COUNT(
CASE
WHEN name = 'DemandMonth' AND value = '01-2022'
THEN 1
END
) > 0
) src
ON ( dst.ROWID = src.rid )
WHEN MATCHED THEN
UPDATE SET param_value = src.param_value;
Which, for the sample data:
CREATE TABLE params ( param_name, param_value ) AS
SELECT 'PlanName,DemandMonth', 'EUMOCP,01-2022' FROM DUAL UNION ALL
SELECT 'PlanName,DemandMonth', 'EUMOCP,02-2022' FROM DUAL UNION ALL
SELECT 'PlanName,DemandMonth', 'EUMOCP,03-2022' FROM DUAL UNION ALL
SELECT 'PlanName,DemandMonth', 'EUMOCP,04-2021' FROM DUAL;
Then:
SELECT * FROM params;
Outputs:
PARAM_NAME
PARAM_VALUE
PlanName,DemandMonth
EUMOCP,04-2022
PlanName,DemandMonth
EUMOCP,02-2022
PlanName,DemandMonth
EUMOCP,03-2022
PlanName,DemandMonth
EUMOCP,04-2021
db<>fiddle here

Convert a sequence of 0s and 1s to a print-style page list

I need to convert a string of 0s and 1s into a sequence of integers representing the 1s, similar to a page selection sequence in a print dialog.
e.g. '0011001110101' -> '3-4,7-9,11,13'
Is it possible to do this in a single SQL select (in Oracle 11g)?
I can get an individual list of the page numbers with the following:
with data as (
select 'K1' KEY, '0011001110101' VAL from dual
union select 'K2', '0101000110' from dual
union select 'K3', '011100011010' from dual
)
select
KEY,
listagg(ords.column_value, ',') within group (
order by ords.column_value
) PAGES
from
data
cross join (
table(cast(multiset(
select level
from dual
connect by level <= length(VAL)
) as sys.OdciNumberList)) ords
)
where
substr(VAL, ords.column_value, 1) = '1'
group by
KEY
But that doesn't do the grouping (e.g. returns "3,4,7,8,9,11,13" for the first value).
If I could assign a group number every time the value changes then I could use analytic functions to get the min and max for each group. I.e. if I could generate the following then I'd be set:
Key Page Val Group
K1 1 0 1
K1 2 0 1
K1 3 1 2
K1 4 1 2
K1 5 0 3
K1 6 0 3
K1 7 1 4
K1 8 1 4
K1 9 1 4
K1 10 0 5
K1 11 1 6
K1 12 0 7
K1 13 1 8
But I'm stuck on that.
Anyone have any ideas, or another approach to get this?
first of all let's level it:
select regexp_instr('0011001110101', '1+', 1, LEVEL) istr,
regexp_substr('0011001110101', '1+', 1, LEVEL) strlen
FROM dual
CONNECT BY regexp_substr('0011001110101', '1+', 1, LEVEL) is not null
then the rest is easy with listagg :
with data as
(
select 'K1' KEY, '0011001110101' VAL from dual
union select 'K2', '0101000110' from dual
union select 'K3', '011100011010' from dual
)
SELECT key,
(SELECT listagg(CASE
WHEN length(regexp_substr(val, '1+', 1, LEVEL)) = 1 THEN
to_char(regexp_instr(val, '1+', 1, LEVEL))
ELSE
regexp_instr(val, '1+', 1, LEVEL) || '-' ||
to_char(regexp_instr(val, '1+', 1, LEVEL) +
length(regexp_substr(val, '1+', 1, LEVEL)) - 1)
END,
' ,') within GROUP(ORDER BY regexp_instr(val, '1+', 1, LEVEL))
from dual
CONNECT BY regexp_substr(data.val, '1+', 1, LEVEL) IS NOT NULL) val
FROM data
Using a recursive sub-query factoring clause without regular expressions:
Oracle Setup:
CREATE TABLE data ( key, val ) AS
SELECT 'K1', '0011001110101' FROM DUAL UNION ALL
SELECT 'K2', '0101000110' FROM DUAL UNION ALL
SELECT 'K3', '011100011010' FROM DUAL UNION ALL
SELECT 'K4', '000000000000' FROM DUAL UNION ALL
SELECT 'K5', '000000000001' FROM DUAL;
Query:
WITH ranges ( key, val, pos, rng ) AS (
SELECT key,
val,
INSTR( val, '1', 1 ), -- Position of the first 1
NULL
FROM data
UNION ALL
SELECT key,
val,
INSTR( val, '1', INSTR( val, '0', pos ) ), -- Position of the next 1
rng || ',' || CASE
WHEN pos = LENGTH( val ) -- Single 1 at end-of-string
OR pos = INSTR( val, '0', pos ) - 1 -- 1 immediately followed by 0
THEN TO_CHAR( pos )
WHEN INSTR( val, '0', pos ) = 0 -- Multiple 1s until end-of-string
THEN pos || '-' || LENGTH( val )
ELSE pos || '-' || ( INSTR( val, '0', pos ) - 1 ) -- Normal range
END
FROM ranges
WHERE pos > 0
)
SELECT KEY,
VAL,
SUBSTR( rng, 2 ) AS rng -- Strip the leading comma
FROM ranges
WHERE pos = 0 OR val IS NULL
ORDER BY KEY;
Output
KEY VAL RNG
--- ------------- -------------
K1 0011001110101 3-4,7-9,11,13
K2 0101000110 2,4,8-9
K3 011100011010 2-4,8-9,11
K4 000000000000
K5 000000000001 12
Here is a slightly more efficient version of Isalamon's solution (using a hierarchical query). It is slightly more efficient because I use a single hierarchical query instead of multiple ones (in correlated subqueries), and I calculate the length of each sequence of 1's just once, in the inner query. (In fact it is calculated only once anyway, but the function call itself has some overhead.)
This version also treats inputs like '00000' and NULL correctly. Isalamon's solution doesn't, and MT0's solution does not return a row when the input value is NULL. It is not clear if NULL is even possible in the input data, and if it is, what the desired result is; I assumed a row should be returned, with the page_list NULL as well.
Optimizer cost for this version is 17, vs. 18 for Isalamon's solution and 33 for MT0's. However, optimizer cost doesn't take into account the significantly slower processing of regular expressions compared to standard string functions; if speed of execution is important, MT0's solution should definitely be tried since it may prove faster.
with data ( key, val ) as (
select 'K1', '0011001110101' from dual union all
select 'K2', '0101000110' from dual union all
select 'K3', '011100011010' from dual union all
select 'K4', '000000000000' from dual union all
select 'K5', '000000000001' from dual union all
select 'K6', null from dual union all
select 'K7', '1111111' from dual union all
select 'K8', '1' from dual
)
-- End of test data (not part of the solution); SQL query begins below this line.
select key, val,
listagg(case when len = 1 then to_char(s_pos)
when len > 1 then to_char(s_pos) || '-' || to_char(s_pos + len - 1)
end, ',') within group (order by lvl) as page_list
from ( select key, level as lvl, val,
regexp_instr(val, '1+', 1, level) as s_pos,
length(regexp_substr(val, '1+', 1, level)) as len
from data
connect by regexp_substr(val, '1+', 1, level) is not null
and prior key = key
and prior sys_guid() is not null
)
group by key, val
order by key
;
Output:
KEY VAL PAGE_LIST
--- ------------- -------------
K1 0011001110101 3-4,7-9,11,13
K2 0101000110 2,4,8-9
K3 011100011010 2-4,8-9,11
K4 000000000000
K5 000000000001 12
K6
K7 1111111 1-7
K8 1 1

get count of words in column sql

after the following queries
SELECT * FROM table;
SELECT REGEXP_REPLACE(description || '!', '[^[:punct:]]')
FROM table;
SELECT REGEXP_REPLACE ( description, '[' || REGEXP_REPLACE ( description || '!', '[^[:punct:]]') || ']') test
FROM table;
SELECT REGEXP_REPLACE(UPPER(TEST), ' ', '#') test
FROM (SELECT REGEXP_REPLACE (description, '[' || REGEXP_REPLACE (description || '!', '[^[:punct:]]') || ']') test
FROM table);
I have a column in an oracle sql looking like:
TEST
---------------------------------------------
SPOKE#WITH#MR#SMITHS#ASSISTANT
EMAILED#FOR#VISIT
SCHEDULING#OFFICE#LM#FOR#VISIT
LM#FOR#VISIT
LM#FOR#VISIT
PHONE#CALL
---------------------------------------------
all of the words are separated by #'s. I would like to get counts of the occurrences of words, for example:
word | count
------------
LM | 3
FOR | 4
VISIT| 4
PHONE| 1
etc etc. I'm new to oracle sql and am only familiar with rudimentary mysql commands. any help or pointers to tutorials would also be helpful. thank you.
edit: there are approximately 1500 rows with about 250 unique responses that i'm trying to account for
WITH mydata AS
( SELECT 'SPOKE#WITH#MR#SMITHS#ASSISTANT' AS str FROM dual
UNION ALL
SELECT 'EMAILED#FOR#VISIT' FROM dual
UNION ALL
SELECT 'SCHEDULING#OFFICE#LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'PHONE#CALL' FROM dual
),
splitted_words AS
(
SELECT REGEXP_SUBSTR(str,'[^#]+', 1, level) AS word
FROM mydata
CONNECT BY level <= LENGTH(regexp_replace(str,'[^#]')) + 1
AND PRIOR str = str
AND PRIOR sys_guid() IS NOT NULL
)
SELECT word,
COUNT(1)
FROM splitted_words
GROUP BY word;
If your table is YOUR_TABLE and column is YOUR_COLUMN
WITH splitted_words AS
(
SELECT REGEXP_SUBSTR(YOUR_COLUMN,'[^#]+', 1, level) AS word
FROM YOUR_TABLE
CONNECT BY level <= LENGTH(regexp_replace(YOUR_COLUMN,'[^#]')) + 1
AND PRIOR YOUR_COLUMN = YOUR_COLUMN
AND PRIOR sys_guid() IS NOT NULL
)
SELECT word,
COUNT(1)
FROM splitted_words
GROUP BY word;

Concatenate results from a SQL query in Oracle

I have data like this in a table
NAME PRICE
A 2
B 3
C 5
D 9
E 5
I want to display all the values in one row; for instance:
A,2|B,3|C,5|D,9|E,5|
How would I go about making a query that will give me a string like this in Oracle? I don't need it to be programmed into something; I just want a way to get that line to appear in the results so I can copy it over and paste it in a word document.
My Oracle version is 10.2.0.5.
-- Oracle 10g --
SELECT deptno, WM_CONCAT(ename) AS employees
FROM scott.emp
GROUP BY deptno;
Output:
10 CLARK,MILLER,KING
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD
I know this is a little late but try this:
SELECT LISTAGG(CONCAT(CONCAT(NAME,','),PRICE),'|') WITHIN GROUP (ORDER BY NAME) AS CONCATDATA
FROM your_table
Usually when I need something like that quickly and I want to stay on SQL without using PL/SQL, I use something similar to the hack below:
select sys_connect_by_path(col, ', ') as concat
from
(
select 'E' as col, 1 as seq from dual
union
select 'F', 2 from dual
union
select 'G', 3 from dual
)
where seq = 3
start with seq = 1
connect by prior seq+1 = seq
It's a hierarchical query which uses the "sys_connect_by_path" special function, which is designed to get the "path" from a parent to a child.
What we are doing is simulating that the record with seq=1 is the parent of the record with seq=2 and so fourth, and then getting the full path of the last child (in this case, record with seq = 3), which will effectively be a concatenation of all the "col" columns
Adapted to your case:
select sys_connect_by_path(to_clob(col), '|') as concat
from
(
select name || ',' || price as col, rownum as seq, max(rownum) over (partition by 1) as max_seq
from
(
/* Simulating your table */
select 'A' as name, 2 as price from dual
union
select 'B' as name, 3 as price from dual
union
select 'C' as name, 5 as price from dual
union
select 'D' as name, 9 as price from dual
union
select 'E' as name, 5 as price from dual
)
)
where seq = max_seq
start with seq = 1
connect by prior seq+1 = seq
Result is: |A,2|B,3|C,5|D,9|E,5
As you're in Oracle 10g you can't use the excellent listagg(). However, there are numerous other string aggregation techniques.
There's no particular need for all the complicated stuff. Assuming the following table
create table a ( NAME varchar2(1), PRICE number);
insert all
into a values ('A', 2)
into a values ('B', 3)
into a values ('C', 5)
into a values ('D', 9)
into a values ('E', 5)
select * from dual
The unsupported function wm_concat should be sufficient:
select replace(replace(wm_concat (name || '#' || price), ',', '|'), '#', ',')
from a;
REPLACE(REPLACE(WM_CONCAT(NAME||'#'||PRICE),',','|'),'#',',')
--------------------------------------------------------------------------------
A,2|B,3|C,5|D,9|E,5
But, you could also alter Tom Kyte's stragg, also in the above link, to do it without the replace functions.
Here is another approach, using model clause:
-- sample of data from your question
with t1(NAME1, PRICE) as(
select 'A', 2 from dual union all
select 'B', 3 from dual union all
select 'C', 5 from dual union all
select 'D', 9 from dual union all
select 'E', 5 from dual
) -- the query
select Res
from (select name1
, price
, rn
, res
from t1
model
dimension by (row_number() over(order by name1) rn)
measures (name1, price, cast(null as varchar2(101)) as res)
(res[rn] order by rn desc = name1[cv()] || ',' || price[cv()] || '|' || res[cv() + 1])
)
where rn = 1
Result:
RES
----------------------
A,2|B,3|C,5|D,9|E,5|
SQLFiddle Example
Something like the following, which is grossly inefficient and untested.
create function foo returning varchar2 as
(
declare bar varchar2(8000) --arbitrary number
CURSOR cur IS
SELECT name,price
from my_table
LOOP
FETCH cur INTO r;
EXIT WHEN cur%NOTFOUND;
bar:= r.name|| ',' ||r.price || '|'
END LOOP;
dbms_output.put_line(bar);
return bar
)
Managed to get till here using xmlagg: using oracle 11G from sql fiddle.
Data Table:
COL1 COL2 COL3
1 0 0
1 1 1
2 0 0
3 0 0
3 1 0
SELECT
RTRIM(REPLACE(REPLACE(
XMLAgg(XMLElement("x", col1,',', col2, col3)
ORDER BY col1), '<x>'), '</x>', '|')) AS COLS
FROM ab
;
Results:
COLS
1,00| 3,00| 2,00| 1,11| 3,10|
* SQLFIDDLE DEMO
Reference to read on XMLAGG