Sum the numbers in a string in Oracle - sql

Below is an interview question. Can someone please help me solve it?
select 'a1b2c3d4e5f6g7' from dual;
The expected output is the sum of the digits in the string: 1+2+3+4+5+6+7 = 28.
Any help?

Use a regex to strip the letters so that only the digits remain, then use CONNECT BY to split them into rows and sum them:
With T
as (select regexp_replace('a1b2c3d4e5f6g7', '[A-Za-z]') as col from dual)
select sum(val)
From
(
select substr(col,level,1) val from t connect by level <= length(col)
)

Since these are all single-digit numbers, you can use SUBSTR() to extract every other character:
Query 1:
WITH data ( value ) AS (
select 'a1b2c3d4e5f6g7' from dual
)
SELECT SUM( TO_NUMBER( SUBSTR( value, 2*LEVEL, 1 ) ) ) AS total
FROM data
CONNECT BY 2 * LEVEL <= LENGTH( value )
Results:
| TOTAL |
|-------|
| 28 |
However, if you have two digit numbers then you can do:
Query 2:
WITH data ( value ) AS (
select 'a1b2c3d4e5f6g7h8i9j10' from dual
)
SELECT SUM( TO_NUMBER( REGEXP_SUBSTR( value, '\d+', 1, LEVEL ) ) ) AS total
FROM data
CONNECT BY LEVEL <= REGEXP_COUNT( value, '\d+' )
Results:
| TOTAL |
|-------|
| 55 |

You can use regexp_substr to extract exactly the numbers, then just sum them:
with t as (select 'a1b2c3d4e5f6g7' expr from dual)
select sum(regexp_substr(t.expr, '[0-9]+',1, level)) as col
from t
connect by level < regexp_instr(t.expr, '[0-9]+',1, level);
example:
select sum(regexp_substr('a1b2c3d4e5f6g7r22g4', '[0-9]+',1, level)) as col
from dual
connect by level < regexp_instr('a1b2c3d4e5f6g7r22g4', '[0-9]+',1, level);
Result:
54

This solution works with numbers with more than 1 digit and it doesn't matter how many characters are between the numbers:
with t as (select 'a1b2c3d4e5f6g7' as str from dual)
select sum(to_number(regexp_substr(str,'[0-9]+',1,level)))
from t
connect by regexp_substr(str,'[0-9]+',1,level) is not null
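As a quick cross-check of the totals quoted in the answers above, the same extraction can be sketched outside the database. This is only a minimal Python sketch and not part of any answer; the function name sum_numbers is made up for illustration. It mirrors the REGEXP_SUBSTR approach: pull out every run of digits and add them up.

```python
import re


def sum_numbers(text: str) -> int:
    """Sum every run of consecutive digits found in text."""
    # r"\d+" plays the same role as '[0-9]+' in the REGEXP_SUBSTR answers.
    return sum(int(run) for run in re.findall(r"\d+", text))


if __name__ == "__main__":
    print(sum_numbers("a1b2c3d4e5f6g7"))         # 28
    print(sum_numbers("a1b2c3d4e5f6g7h8i9j10"))  # 55
    print(sum_numbers("a1b2c3d4e5f6g7r22g4"))    # 54
```

A string without any digits gives 0 here, whereas SUM over no rows in Oracle would give NULL.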

Related

Oracle SQL find missing sequence in varchar2 field

I am new to Oracle and to this forum. I have searched and found answers on how to do this with a column of just numbers, but my column has text at the beginning followed by a sequenced number.
I have a table with a varchar2 column named myid. Each value is some text followed by a number; the numbers are in sequence and are always 6 digits with leading zeros.
Hello_002190
Hello_002188
Bye_000187
Bye_000185
Bye_000184
Get_008133
Get_008131
Gone_001112
Gone_001110
Gone_001109
I need an Oracle SQL script that will show me all the missing rows.
The result for the above should be:
Hello_002189
Bye_000186
Get_008132
Gone_001111
Thanks in advance for the help
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( value ) AS
SELECT 'Hello_002190' FROM DUAL UNION ALL
SELECT 'Hello_002188' FROM DUAL UNION ALL
SELECT 'Bye_000187' FROM DUAL UNION ALL
SELECT 'Bye_000185' FROM DUAL UNION ALL
SELECT 'Bye_000184' FROM DUAL UNION ALL
SELECT 'Get_008133' FROM DUAL UNION ALL
SELECT 'Get_008131' FROM DUAL UNION ALL
SELECT 'Gone_001112' FROM DUAL UNION ALL
SELECT 'Gone_001110' FROM DUAL UNION ALL
SELECT 'Gone_001109' FROM DUAL;
Query 1:
WITH data ( prefix, suffix ) AS (
SELECT SUBSTR( value, 1, INSTR( value, '_' ) ),
TO_NUMBER( SUBSTR( value, INSTR( value, '_' ) + 1 ) )
FROM table_name
),
bounds ( prefix, min_suffix, max_suffix ) AS (
SELECT prefix, MIN( suffix ), MAX( suffix )
FROM data
GROUP BY prefix
)
SELECT prefix || TO_CHAR( column_value, 'FM000000' ) AS missing_value
FROM bounds b
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT b.min_suffix + LEVEL - 1
FROM DUAL
CONNECT BY b.min_suffix + LEVEL - 1 <= b.max_suffix
) AS SYS.ODCINUMBERLIST
)
)
MINUS
SELECT value FROM table_name
Results:
| MISSING_VALUE |
|---------------|
| Bye_000186 |
| Get_008132 |
| Gone_001111 |
| Hello_002189 |
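To make the logic of the query easier to follow, here is a rough Python sketch of the same idea: split each value at the first underscore, find the minimum and maximum suffix per prefix, generate every number in between, and keep the ones that are not present. The function name missing_ids is hypothetical, and the sketch assumes every value contains an underscore followed by digits only.

```python
from collections import defaultdict


def missing_ids(values):
    """Return the IDs missing between the lowest and highest suffix of each prefix."""
    suffixes = defaultdict(set)
    for value in values:
        # Split at the first underscore, like INSTR( value, '_' ) in the query.
        prefix, _, number = value.partition("_")
        suffixes[prefix].add(int(number))

    missing = []
    for prefix, present in suffixes.items():
        # Walk the full range min..max and keep the numbers that are absent.
        for n in range(min(present), max(present) + 1):
            if n not in present:
                # Six digits with leading zeros, like the 'FM000000' format mask.
                missing.append(f"{prefix}_{n:06d}")
    return sorted(missing)


if __name__ == "__main__":
    sample = [
        "Hello_002190", "Hello_002188", "Bye_000187", "Bye_000185",
        "Bye_000184", "Get_008133", "Get_008131", "Gone_001112",
        "Gone_001110", "Gone_001109",
    ]
    print(missing_ids(sample))
```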

to find minimum missing number in oracle

I want to find the minimum missing number in a column named s_no of a table named test_table in Oracle. I wrote the following code:
select
min_s_no-1+level missing_number
from (
select min(s_no) min_s_no, max(s_no) max_s_no
from test_table
) connect by level <= max_s_no-min_s_no+1
minus
select s_no from test_table
;
It gives me all the missing numbers as a result, but I want to select only the minimum one. Can anyone help me, please?
Thanks in advance.
Using the analytic function LEAD you can get the number from the next row in ascending order. Comparing this value with the original number increased by 1 gives you the missing values (wherever the two numbers do not match).
Getting the first missing value in ascending order is then the same as selecting the MIN value:
select
num,
lead(num) over (order by num) num_lead,
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
order by num;
       NUM   NUM_LEAD MISSING_NUM
---------- ---------- -----------
         4          5
         5          6
         6          9           7
         9         10
        10         13          11
        13
-- first missing number = MIN missing number
select min(missing_num)
from (
select
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
);
MIN(MISSING_NUM)
----------------
7
ADDENDUM
A good practice in writing SQL is to consider edge cases - here a table that contains a complete interval without holes. The first missing value will be the successor of the last number.
select nvl(min(missing_num),max(num)+1) first_missing_value
from (
select
num,
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
);
A complete table returns no MISSING_NUM, so the original query returns NULL. With the NVL, the expected result is provided.
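The behaviour of the final query, including the edge case, can be sketched in a few lines of Python. This is an illustration only; first_missing is a made-up name, and unlike the SQL the sketch collapses duplicate values first and returns None for an empty input.

```python
def first_missing(nums):
    """Return the first gap in nums, or max + 1 when there is no gap."""
    ordered = sorted(set(nums))
    if not ordered:
        return None  # nothing to compare, similar to NULL on an empty table
    # Pair each number with its successor, like LEAD(num) OVER (ORDER BY num).
    for current, following in zip(ordered, ordered[1:]):
        if current + 1 != following:
            return current + 1
    # No gap found: fall back to the successor of the last number (the NVL branch).
    return ordered[-1] + 1


if __name__ == "__main__":
    print(first_missing([4, 5, 6, 9, 10, 13]))  # 7
    print(first_missing([1, 2, 3]))             # 4
```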
The best way to find the gaps is to use the analytic functions LEAD or LAG. An example with LAG:
with test_data as (
select 1 num from dual union all
select 4 from dual union all
select 6 from dual union all
select 8 from dual union all
select 3 from dual union all
select 9 from dual union all
select 0 from dual
)
select min(gap) min_gap
from (
select num, lag(num) over (order by num)+1 gap
from test_data
)
where num != gap
;
MIN_GAP
------------------
2
In Oracle 12.1 and above, MATCH_RECOGNIZE can make quick work of this kind of problem:
Edited. Initially I was picking the "next number" where a gap exists (in the example, the value 9). But that is not what the OP wants, he wants the first missing number (7 in this case). I edited to change the measures clause, to find the first missing number as requested. End Edit
with test_data (num) as (
select 4 from dual union all
select 5 from dual union all
select 6 from dual union all
select 9 from dual union all
select 10 from dual union all
select 13 from dual
)
-- end of test data; when you use the SQL query below,
-- replace test_data and num with your actual table and column names.
select result as num
from test_data
match_recognize (
order by num
measures last(b.num) + 1 as result
pattern ( ^ a b* c )
define b as num = prev(num) + 1,
c as num > prev(num) + 1
)
;
NUM
---
7

Finding out the highest number in a comma separated string using Oracle SQL

I have a table with two columns:
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,25,26,24 |1,26,24,25
1,56,55,54 |1,55,54
1 |1
1,2 |1
1,96,95,94 |1,96,94,95
1 |1
1 |1
1 |1
1 |1
1,2 |1,2
1 |1
1 |1
1 |1
1 |1
For each row there will be a list of revisions for a document (comma separated)
The comma separated list might be the same in both columns but the order/sort might be different - e.g.
2,1 |1,2
I would like to find all the instances where the highest revision in the OLD_REVISIONS column is lower than the highest revision in NEW_REVISIONS
The following would fit that criteria
OLD_REVISIONS |NEW_REVISIONS
-----------------------------------
1,2 |1
1,56,55,54 |1,55,54
I tried a solution using the MINUS option (joining the table to itself) but it returns differences even for when the list is the same but in the wrong order
I tried the function GREATEST (i.e where greatest(new_Revisions) < greatest(old_revisions)) but i am not sure why greatest(OLD_REVISIONS) always just returns the comma separated value. It does not return the max value. I suspect it is comparing strings because the columns are VARCHAR.
Also, MAX function expects a single number.
Is there another way i can achieve the above? I am looking for a pure SQL option so i can print out the results (or a PL/SQL option that can print out the results)
Edit
Apologies for not mentioning this but for the NEW_REVISIONS i do actually have the data in a table where each revision is in a separate row:
"DOCNUMBER" "REVISIONNUMBER"
67 1
67 24
67 25
67 26
75 1
75 54
75 55
75 56
78 1
79 1
79 2
83 1
83 96
83 94
Just to give some content, a few weeks ago i suspected that there are revisions disappearing.
To investigate this, i decided to take a count of all revisions for all documents and take a snapshot to compare later to see if revisions are indeed missing.
The snapshot that i took contained the following columns:
docnumber, count, revisions
The revisions were stored in a comma separated list using the listagg function.
The trouble I have now is that new revisions have been added to the live table, so when I compare the main table and the snapshot using a MINUS I get a difference because of the new revisions in the main table.
Even though the revisions are individual rows in the actual table, I don't have the individual rows in the snapshot table.
I think the only way is to recreate the snapshot in the same format and compare the two, to find out whether the maximum revision in the main table is lower than the maximum revision in the snapshot table (which is why I'm trying to find out how to get the maximum from a comma-separated string).
Enjoy.
select xmlcast(xmlquery(('max((' || OLD_REVISIONS || '))') RETURNING CONTENT) as int) as OLD_REVISIONS_max
,xmlcast(xmlquery(('max((' || NEW_REVISIONS || '))') RETURNING CONTENT) as int) as NEW_REVISIONS_max
from t
;
Assuming your base table has an id column (versions of what?) - here is a solution based on splitting the rows.
Edit: If you like this solution, check out vkp's solution, which is better than mine. I explain why his solution is better in a Comment to his Answer.
with
t ( id, old_revisions, new_revisions ) as (
select 101, '1,25,26,24', '1,26,24,25' from dual union all
select 102, '1,56,55,54', '1,55,54' from dual union all
select 103, '1' , '1' from dual union all
select 104, '1,2' , '1' from dual union all
select 105, '1,96,95,94', '1,96,94,95' from dual union all
select 106, '1' , '1' from dual union all
select 107, '1' , '1' from dual union all
select 108, '1' , '1' from dual union all
select 109, '1' , '1' from dual union all
select 110, '1,2' , '1,2' from dual union all
select 111, '1' , '1' from dual union all
select 112, '1' , '1' from dual union all
select 113, '1' , '1' from dual union all
select 114, '1' , '1' from dual
)
-- END of TEST DATA; the actual solution (SQL query) begins below.
select id, old_revisions, new_revisions
from (
select id, old_revisions, new_revisions, 'old' as flag,
to_number(regexp_substr(old_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(old_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
union all
select id, old_revisions, new_revisions, 'new' as flag,
to_number(regexp_substr(new_revisions, '\d+', 1, level)) as rev_no
from t
connect by level <= regexp_count(new_revisions, ',') + 1
and prior id = id
and prior sys_guid() is not null
)
group by id, old_revisions, new_revisions
having max(case when flag = 'old' then rev_no end) !=
max(case when flag = 'new' then rev_no end)
order by id -- ORDER BY is optional
;
 ID OLD_REVISION NEW_REVISION
--- ------------ ------------
102 1,56,55,54   1,55,54
104 1,2          1
You can compare every value by putting together the revisions in the same order using listagg function.
SELECT listagg(o,',') WITHIN GROUP (ORDER BY o) old_revisions,
listagg(n,',') WITHIN GROUP (ORDER BY n) new_revisions
FROM (
SELECT DISTINCT rowid r,
regexp_substr(old_revisions, '[^,]+', 1, LEVEL) o,
regexp_substr(new_revisions, '[^,]+', 1, LEVEL) n
FROM table
WHERE regexp_substr(old_revisions, '[^,]+', 1, LEVEL) IS NOT NULL
CONNECT BY LEVEL<=(SELECT greatest(MAX(regexp_count(old_revisions,',')),MAX(regexp_count(new_revisions,',')))+1 c FROM table)
)
GROUP BY r
HAVING listagg(o,',') WITHIN GROUP (ORDER BY o)<>listagg(n,',') WITHIN GROUP (ORDER BY n);
This could be a way:
select
OLD_REVISIONS,
NEW_REVISIONS
from
REVISIONS t,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.OLD_REVISIONS, '[^,]+')) + 1
) as sys.OdciNumberList
)
) levels_old,
table(cast(multiset(
select level
from dual
connect by level <= length (regexp_replace(t.NEW_REVISIONS, '[^,]+')) + 1
)as sys.OdciNumberList
)
) levels_new
group by t.ROWID,
OLD_REVISIONS,
NEW_REVISIONS
having max(to_number(trim(regexp_substr(t.OLD_REVISIONS, '[^,]+', 1, levels_old.column_value)))) >
max(to_number(trim(regexp_substr(t.new_REVISIONS, '[^,]+', 1, levels_new.column_value))))
This uses a double string split to pick the values from every field, and then simply finds the rows where the max values among the two collections match your requirement.
You should edit this by adding some unique key to the GROUP BY clause, or a rowid if you don't have any unique key on your table.
One way to do it is to split the columns on the commas using regexp_substr and check whether the maximum values of the two columns differ.
with rownums as (select t.*,row_number() over(order by old_revisions) rn from t)
select old_revisions,new_revisions
from rownums
where rn in (select rn
from rownums
group by rn
connect by regexp_substr(old_revisions, '[^,]+', 1, level) is not null
or regexp_substr(new_revisions, '[^,]+', 1, level) is not null
having max(cast(regexp_substr(old_revisions,'[^,]+', 1, level) as int))
<> max(cast(regexp_substr(new_revisions,'[^,]+', 1, level) as int))
)
The comments say to normalise the data. I agree, but I also understand that it may not be possible. I would try something like the query below:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
-- to_number so that MAX compares numbers, not strings ('9' > '10' as text)
select to_number(regexp_substr(v1,'[^,]+', 1, level)) val, rowid r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by r) t1
inner join (
select max(val) val2, r from (
select to_number(regexp_substr(v2,'[^,]+', 1, level)) val, rowid r from tab1
connect by regexp_substr(v2, '[^,]+', 1, level) is not null
) group by r) t2
on (t1.r = t2.r);
Tested on:
create table tab1 (v1 varchar2(100), v2 varchar2(100));
insert into tab1 values ('1,3,5','1,4,7');
insert into tab1 values ('1,3,5','1,2,9');
insert into tab1 values ('1,3,5','1,3,5');
insert into tab1 values ('1,3,5','1,4');
and seems to work fine. I left rowid for reference. I guess you have some id in table.
After your edit I would change query to:
select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select to_number(regexp_substr(v1,'[^,]+', 1, level)) val, DOCNUMBER r from tab1
connect by regexp_substr(v1, '[^,]+', 1, level) is not null
) group by r) t1
inner join (
select max(REVISIONNUMBER) val2, DOCNUMBER r from NEW_REVISIONS group by DOCNUMBER) t2
on (t1.r = t2.r);
You may write a PL/SQL function parsing the string and returning the maximal number
select max_num( '1,26,24,25') max_num from dual;
MAX_NUM
----------
26
The query is then very simple:
select OLD_REVISIONS, NEW_REVISIONS
from revs
where max_num(OLD_REVISIONS) < max_num(NEW_REVISIONS);
A prototype function without validation and error handling:
create or replace function max_num(str_in VARCHAR2) return NUMBER as
x varchar2(1);
n number := 0;
max_n number := 0;
pow number := 0;
begin
-- scan the string from right to left, building each number digit by digit
for i in 0.. length(str_in)-1 loop
x := substr(str_in,length(str_in)-i,1);
if x = ',' then
-- check max number
if n > max_n then
max_n := n;
end if;
-- reset
n := 0;
pow := 0;
else
n := n + to_number(x)*power(10,pow);
pow := pow +1;
end if;
end loop;
-- the leftmost number has no comma before it, so compare it here
if n > max_n then
max_n := n;
end if;
return(max_n);
end;
/
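For anyone who wants to sanity-check the output of the queries in this thread, the comparison itself is easy to sketch in Python. The names max_num and rows_with_lower_new_max are illustrative only, the sample rows are taken from the question, and the filter follows the question's example output (rows where the old maximum is higher than the new one).

```python
def max_num(csv: str) -> int:
    """Return the largest number in a comma-separated string such as '1,26,24,25'."""
    return max(int(part) for part in csv.split(","))


def rows_with_lower_new_max(rows):
    """Keep the (old, new) pairs whose highest old revision exceeds the highest new one."""
    return [(old, new) for old, new in rows if max_num(old) > max_num(new)]


if __name__ == "__main__":
    sample = [
        ("1,25,26,24", "1,26,24,25"),
        ("1,56,55,54", "1,55,54"),
        ("1", "1"),
        ("1,2", "1"),
        ("1,96,95,94", "1,96,94,95"),
        ("1,2", "1,2"),
    ]
    for old, new in rows_with_lower_new_max(sample):
        print(old, "|", new)
```

Converting with int() before taking the maximum matters: compared as text, '9' sorts after '10'.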

Turn to multiple records

I have records coming as below:
Item | Color Code
Bag | 1,2,3
How can I turn these records into:
Item | Color Code
Bag | 1
Bag | 2
Bag | 3
at the SQL level, without writing a separate program?
I am having trouble building a package and cube unless the data is in this format.
This will work
with t1(Item,ColorCode) as
(select 'Bag', '1,2,3' from dual)
select Item,regexp_substr(ColorCode, '[^,]+', 1, level) result
from t1
connect by level <= length(regexp_replace(ColorCode, '[^,]+')) + 1;
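The row expansion that the query performs can be pictured with a short Python sketch: each comma-separated value becomes one output row that repeats the item. The function name split_rows is made up for illustration, and whitespace around each code is trimmed as an extra assumption.

```python
def split_rows(rows):
    """Expand (item, 'a,b,c') pairs into one (item, code) pair per code."""
    return [
        (item, code.strip())
        for item, codes in rows
        for code in codes.split(",")
    ]


if __name__ == "__main__":
    for item, code in split_rows([("Bag", "1,2,3")]):
        print(item, "|", code)
```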
Try this:
SELECT t.Item,
trim(regexp_substr(t.ColorCode, '[^,]+', 1, lines.column_value)) ColorCode
FROM t,
TABLE (CAST (MULTISET
(SELECT LEVEL FROM dual CONNECT BY LEVEL <= regexp_count(t.ColorCode, ',')+1)
AS sys.odciNumberList
)
) lines

get count of words in column sql

after the following queries
SELECT * FROM table;
SELECT REGEXP_REPLACE(description || '!', '[^[:punct:]]')
FROM table;
SELECT REGEXP_REPLACE ( description, '[' || REGEXP_REPLACE ( description || '!', '[^[:punct:]]') || ']') test
FROM table;
SELECT REGEXP_REPLACE(UPPER(TEST), ' ', '#') test
FROM (SELECT REGEXP_REPLACE (description, '[' || REGEXP_REPLACE (description || '!', '[^[:punct:]]') || ']') test
FROM table);
I have a column in an oracle sql looking like:
TEST
---------------------------------------------
SPOKE#WITH#MR#SMITHS#ASSISTANT
EMAILED#FOR#VISIT
SCHEDULING#OFFICE#LM#FOR#VISIT
LM#FOR#VISIT
LM#FOR#VISIT
PHONE#CALL
---------------------------------------------
all of the words are separated by #'s. I would like to get counts of the occurrences of words, for example:
word | count
------------
LM | 3
FOR | 4
VISIT| 4
PHONE| 1
And so on. I'm new to Oracle SQL and am only familiar with rudimentary MySQL commands. Any help or pointers to tutorials would also be helpful. Thank you.
edit: there are approximately 1500 rows with about 250 unique responses that i'm trying to account for
WITH mydata AS
( SELECT 'SPOKE#WITH#MR#SMITHS#ASSISTANT' AS str FROM dual
UNION ALL
SELECT 'EMAILED#FOR#VISIT' FROM dual
UNION ALL
SELECT 'SCHEDULING#OFFICE#LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'PHONE#CALL' FROM dual
),
splitted_words AS
(
SELECT REGEXP_SUBSTR(str,'[^#]+', 1, level) AS word
FROM mydata
CONNECT BY level <= LENGTH(regexp_replace(str,'[^#]')) + 1
AND PRIOR str = str
AND PRIOR sys_guid() IS NOT NULL
)
SELECT word,
COUNT(1)
FROM splitted_words
GROUP BY word;
If your table is YOUR_TABLE and column is YOUR_COLUMN
WITH splitted_words AS
(
SELECT REGEXP_SUBSTR(YOUR_COLUMN,'[^#]+', 1, level) AS word
FROM YOUR_TABLE
CONNECT BY level <= LENGTH(regexp_replace(YOUR_COLUMN,'[^#]')) + 1
AND PRIOR YOUR_COLUMN = YOUR_COLUMN
AND PRIOR sys_guid() IS NOT NULL
)
SELECT word,
COUNT(1)
FROM splitted_words
GROUP BY word;