split column value of a table and skip some words - sql

Hil All,
I have a table , count is about 200M. It has a column which contains data separated by '~'. I want to parse it.
e.g:
Column1
A~B~C~D~E~F
Result :
Column_new1
A~C~E
I just want to skip 2,4,6,n th. words. I don't want plsql. I need sql query. And table is very big,I also need performance.
I use substr,instr functions and I can parse. But it runs really slowly..
Thanks for help.

If you are after performance then use the INSTR and SUBSTR simple string functions:
SELECT SUBSTR(column1, 1, p1 - 1 ) || '~' ||
SUBSTR(column1, p2 + 1, p3 - p2 - 1) || '~' ||
SUBSTR(column1, p4 + 1, p5 - p4 - 1) AS column1_new
FROM (
SELECT column1,
INSTR(column1, '~', 1, 1) AS p1,
INSTR(column1, '~', 1, 2) AS p2,
INSTR(column1, '~', 1, 3) AS p3,
INSTR(column1, '~', 1, 4) AS p4,
INSTR(column1, '~', 1, 5) AS p5
FROM table_name
);
Which, for the sample data:
CREATE TABLE table_name (column1) AS
SELECT 'A~B~C~D~E~F' FROM DUAL;
Outputs:
COLUMN1_NEW
A~C~E
If you want a shorter query then you can use regular expressions:
SELECT REGEXP_REPLACE(column1, '([^~]+)~[^~]+~([^~]+)~[^~]+~([^~]+).*', '\1~\2~\3' )
AS column1_new
FROM table_name;
However, you will find that performance is likely to be an order of magnitude worse than simple string functions.
Another alternative would be to generate a materialized view.
db<>fiddle here

This is a regular expressions option. Looks nice, isn't PL/SQL, works OK (for 2 rows). I'm afraid that anything will run slow for 200 million rows.
SQL> with test (id, col) as
2 (select 1, 'A~B~C~D~E~F' from dual union all
3 select 2, 'M~N~O~P~Q-R' from dual
4 )
5 select id,
6 regexp_substr(col, '\w+', 1, 1) || '~' ||
7 regexp_substr(col, '\w+', 1, 3) || '~' ||
8 regexp_substr(col, '\w+', 1, 5) result
9 from test;
ID RESULT
---------- -----------------------------------
1 A~C~E
2 M~O~Q
SQL>

Related

convert string into rows with SQL

I have a string, like:
'one,two,three'
and I want to convert it in rows and use it into IN clause in an SQL:
one
two
three
I tried something like :
SELECT column_value
FROM XMLTable('"one","two","three"');
and it worked fine but in a join condition it fails.
SELECT 1
FROM dual
WHERE 'one' IN (SELECT column_value
FROM XMLTable('"one","two","three"'));
it gaves me the error:
ORA-00932: inconsistent datatypes: expected - got CHAR
00932. 00000 - "inconsistent datatypes: expected %s got %s"
Can anyone help me on this, please?
NOTE: I would like not use PLSQL
What you need is nothing but just casting a CLOB value to a [VAR]CHAR[2] data type such as
SELECT 1
FROM dual
WHERE 'one' IN (SELECT CAST(column_value AS VARCHAR2(20))
FROM XMLTable('"one","two","three"'))
1
---
1
in order to make it comparable with a literal(such as 'one').
Moreover, CAST might be replaceable with XMLCast as well.
You do not need XML to split the string and can use simple (fast) string functions:
WITH data (value) AS (
SELECT 'one,two,three' FROM DUAL
),
bounds (value, spos, epos) AS (
SELECT value, 1, INSTR(value, ',', 1)
FROM data
UNION ALL
SELECT value, epos + 1, INSTR(value, ',', epos + 1)
FROM bounds
WHERE epos > 0
)
SEARCH DEPTH FIRST BY value SET order_id
SELECT CASE epos
WHEN 0
THEN SUBSTR(value, spos)
ELSE SUBSTR(value, spos, epos-spos)
END as item
FROM bounds;
Which outputs:
ITEM
one
two
three
However
In your case you have an XY-problem and you DO NOT NEED to split the string as you can use LIKE to match the search string against a sub-string of your list:
SELECT 1
FROM dual
WHERE ',' || :your_list || ',' LIKE '%,' || :search_value || ',%';
or with hardcoded strings:
SELECT 1
FROM dual
WHERE ',' || 'one,two,three' || ',' LIKE '%,' || 'one' || ',%';
db<>fiddle here
Query that raised an error - if rewritten to this - works:
SQL> select 1
2 from dual
3 where 'one' in (select regexp_substr('one,two,three', '[^,]+', 1, level)
4 from dual
5 connect by level <= regexp_count('one,two,three', ',') + 1
6 );
1
----------
1
SQL>
Subquery (that uses regexp_substr) splits a comma-separated list of values ('one,two,three') into rows.
Alternatively, if you use Oracle Apex (or have it installed in your database), you can simplify it by utilizing apex_string.split:
SQL> select 1
2 from dual
3 where 'one' in (select * from apex_string.split('one,two,three', ','));
1
----------
1
SQL>
You could use REGEXP_COUNT to find your word in the list.
This would be faster than converting
But read this canonical thread Is storing a delimited list in a database column really that bad?
SELECT
CASE WHEN REGEXP_COUNT('one,two,three', '^([^,]+,)*one(,[^,]+)*$', 1, 'i') > 0
THEN 1 ELSE 0
END as cnt
FROM DUAL;
| CNT |
| --: |
| 1 |
db<>fiddle here

Filling in a table in the range between "L00000000" and "L00000010" in an oracle

I need help. I was once helped with a code to populate a table in the range between the first and last values. This code works fine, but there is one problem I can't figure out.
When I use as finite numbers like P1_FIRST = L4819222 and P1_LAST = L4819225, everything works fine. But when I use extreme chills as L00000000 and L00000010, I get: L1, L2, L3 ... but I wanted to see: `L00000000, L00000001, L00000002, etc.
Can you help me in this?
INSERT INTO test2 (val)
SELECT SUBSTR (:P1_FIRST, 1, 1)
|| TO_CHAR (
( TO_NUMBER (REGEXP_SUBSTR (:P1_FIRST, '\d+$'))
+ LEVEL
- 1))
AS val
FROM dual
CONNECT BY LEVEL <=
TO_NUMBER (
REGEXP_SUBSTR (:P1_LAST, '\d+$'))
- TO_NUMBER (
REGEXP_SUBSTR
SQL> select 'L' || lpad(level - 1, 8, '0') val
2 from dual
3 connect by level <= 11;
VAL
---------
L00000000
L00000001
L00000002
L00000003
L00000004
L00000005
L00000006
L00000007
L00000008
L00000009
L00000010
11 rows selected.
SQL>
As of INSERT and Apex items:
SQL> create table test (val varchar2(20));
Table created.
SQL> insert into test (val)
2 with subs as
3 (select to_number(regexp_substr('&P1_LAST' , '\d+$')) clast,
4 to_number(regexp_substr('&P1_FIRST', '\d+$')) cfirst
5 from dual
6 )
7 select 'L' || lpad(cfirst + level - 1, 8, '0') val
8 from subs
9 connect by level <= clast - cfirst + 1;
Enter value for p1_last: L4819225
Enter value for p1_first: L4819222
4 rows created.
SQL> /
Enter value for p1_last: L0000010
Enter value for p1_first: L0000000
11 rows created.
Result:
SQL> select * From test order by val;
VAL
--------------------
L00000000
L00000001
L00000002
L00000003
L00000004
L00000005
L00000006
L00000007
L00000008
L00000009
L00000010
L04819222
L04819223
L04819224
L04819225
15 rows selected.
SQL>
Query you'll use in Apex is
insert into test (val)
with subs as
(select to_number(regexp_substr(:P1_LAST , '\d+$')) clast,
to_number(regexp_substr(:P1_FIRST, '\d+$')) cfirst
from dual
)
select 'L' || lpad(cfirst + level - 1, 8, '0') val
from subs
connect by level <= clast - cfirst + 1;
You need using LPAD. Try:
SELECT SUBSTR (:P1_FIRST, 1, 1)
||
LPAD(TO_CHAR ( TO_NUMBER (REGEXP_SUBSTR (:P1_FIRST, '\d+$')) + level), 8,'0') AS val
FROM dual
connect by level < 10:
in your case:
SELECT SUBSTR ('L00000000', 1, 1) || lpad(TO_CHAR ( TO_NUMBER (REGEXP_SUBSTR ('L00000000', '\d+$' )) + LEVEL - 1), 8, '0') AS val
FROM dual
CONNECT BY LEVEL <= TO_NUMBER ( REGEXP_SUBSTR ('L00000010', '\d+$')) - TO_NUMBER ( REGEXP_SUBSTR ('L00000000', '\d+$')) + 1

How to sort version numbers (like 5.3.60.8)

I have a Strings like:
5.3.60.8
6.0.5.94
3.3.4.1
How to sort these values in sorting order in Oracle SQL?
I want the order to be like this:
6.0.5.94
5.3.60.8
3.3.4.1
with
inputs ( str ) as (
select '6.0.5.94' from dual union all
select '5.3.60.8' from dual union all
select '3.3.4.1' from dual
)
select str from inputs
order by to_number(regexp_substr(str, '\d+', 1, 1)),
to_number(regexp_substr(str, '\d+', 1, 2)),
to_number(regexp_substr(str, '\d+', 1, 3)),
to_number(regexp_substr(str, '\d+', 1, 4))
;
STR
--------
3.3.4.1
5.3.60.8
6.0.5.94
You could pad numbers with zeroes on the left in the order by clause:
select version
from versions
order by regexp_replace(
regexp_replace(version, '(\d+)', lpad('\1', 11, '0')),
'\d+(\d{10})',
'\1'
) desc
This works for more number parts as well, up to about 200 of them.
If you expect to have numbers with more than 10 digits, increase the number passed as second argument to the lpad function, and also the braced number in the second regular expression. The first should be one more (because \1 is two characters but could represent only one digit).
Highest version
To get the highest version only, you can add the row number to the query above with the special Oracle rownum keyword. Then wrap all that in an another select with a condition on that row number:
select version
from (
select version, rownum as row_num
from versions
order by regexp_replace(
regexp_replace(version, '(\d+)', lpad('\1', 11, '0')),
'\d+(\d{10})',
'\1'
) desc)
where row_num <= 1;
See this Q&A for several alternatives, also depending on your Oracle version.
I will show here the answer from AskTom, which can be used with different version size :
WITH inputs
AS (SELECT 1 as id, '6.0.5.94' as col FROM DUAL
UNION ALL
SELECT 2,'5.3.30.8' FROM DUAL
UNION ALL
SELECT 3,'5.3.4.8' FROM DUAL
UNION ALL
SELECT 4,'3' FROM DUAL
UNION ALL
SELECT 5,'3.3.40' FROM DUAL
UNION ALL
SELECT 6,'3.3.4.1.5' FROM DUAL
UNION ALL
SELECT 7,'3.3.4.1' FROM DUAL)
SELECT col, MAX (SYS_CONNECT_BY_PATH (v, '.')) p
FROM (SELECT t.col, TO_NUMBER (SUBSTR (x.COLUMN_VALUE, 1, 5)) r, SUBSTR (x.COLUMN_VALUE, 6) v, id rid
FROM inputs t,
TABLE (
CAST (
MULTISET (
SELECT TO_CHAR (LEVEL, 'fm00000')
|| TO_CHAR (TO_NUMBER (SUBSTR ('.' || col || '.', INSTR ('.' || col || '.', '.', 1, ROWNUM) + 1, INSTR ('.' || col || '.', '.', 1, ROWNUM + 1) - INSTR ('.' || col || '.', '.', 1, ROWNUM) - 1)), 'fm0000000000')
FROM DUAL
CONNECT BY LEVEL <= LENGTH (col) - LENGTH (REPLACE (col, '.', '')) + 1) AS SYS.odciVarchar2List)) x)
START WITH r = 1
CONNECT BY PRIOR rid = rid AND PRIOR r + 1 = r
GROUP BY col
ORDER BY p

Number formatting in Oracle SQL

I have some problem here. I got to SELECT a column which the result i.e. '01201698765'. How to split this number to becoming like this : '01.2016.98765'.
I've used TO_CHAR, but the '0' (zero) number at the front was gone.
You could use:
SUBSTR
concatenation operator ||
For example,
SQL> WITH sample_data AS(
2 SELECT '01201698765' num FROM dual
3 )
4 --end of sample_data mimicking real table
5 SELECT num,
6 substr(num, 1, 2)||'.'||substr(num, 3, 4)||'.'||substr(num, 7) num_formatted
7 FROM sample_data;
NUM NUM_FORMATTED
----------- -------------
01201698765 01.2016.98765
SQL>
Assuming the column is a string, just use string operations:
select substr(col, 1, 2) || '.' + substr(col, 3, 4) + '.' + substr(col, 5, 5)

SQL Concatenate strings across multiple columns with corresponding values

I'm looking for a way to achieve this in a SELECT statement.
FROM
Column1 Column2 Column3
A,B,C 1,2,3 x,y,z
TO
Result
A|1|x,B|2|y,C|3|z
The delimiters don't matter. I'm just trying to to get all the data in one single column. Ideally I am looking to do this in DB2. But I'd like to know if there's an easier way to get this done in Oracle.
Thanks
You can do it like this using INSTR and SUBSTR:
select
substr(column1,1,instr(column1,',',1)-1) || '|' ||
substr(column2,1,instr(column2,',',1)-1) || '|' ||
substr(column3,1,instr(column3,',',1)-1) || '|' ||
',' ||
substr(column1 ,instr(column1 ,',',1,1)+1,instr(column1 ,',',1,2) - instr(column1 ,',',1)-1) || '|' ||
substr(column2 ,instr(column2 ,',',1,1)+1,instr(column2 ,',',1,2) - instr(column2 ,',',1)-1) || '|' ||
substr(column3 ,instr(column3 ,',',1,1)+1,instr(column3 ,',',1,2) - instr(column3 ,',',1)-1) || '|' ||
',' ||
substr(column1 ,instr(column1 ,',',1,2)+1) || '|' ||
substr(column2 ,instr(column2 ,',',1,2)+1) || '|' ||
substr(column3 ,instr(column3 ,',',1,2)+1)
from yourtable
i tried some thing. just look into link
first i created a table called t_ask_test and inserted the data based on the above question. Achieved the result by using the string functions
sample table
create table t_ask_test(column1 varchar(10), column2 varchar(10),column3 varchar(10));
inserted a row
insert into T_ASK_TEST values ('A,B,C','1,2,3','x,y,z');
the following query will be in dynamic way
select substr(column1,1,instr(column1,',',1,1)-1)||'|'||substr(column2,1,instr(column1,',',1,1)-1)||'|'||substr(column3,1,instr(column1,',',1,1)-1) ||','||
substr(column1,instr(column1,',',1,1)+1,instr(column1,',',1,2)-instr(column1,',',1,1)-1)||'|'||substr(column2,instr(column2,',',1,1)+1,instr(column2,',',1,2)-instr(column2,',',1,1)-1)||'|'||substr(column3,instr(column3,',',1,1)+1,instr(column3,',',1,2)-instr(column3,',',1,1)-1) ||','||
substr(column1,instr(column1,',',1,2)+1,length(column1)-instr(column1,',',1,2))||'|'||substr(column2,instr(column2,',',1,2)+1,length(column2)-instr(column2,',',1,2))||'|'||substr(column3,instr(column3,',',1,2)+1,length(column3)-instr(column3,',',1,2)) as test from t_ask_test;
output will be as follows
TEST
---------------
A|1|x,B|2|y,C|3|z
If you have a dynamic number of entries for each row then:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( Column1, Column2, Column3 ) AS
SELECT 'A,B,C', '1,2,3', 'x,y,z' FROM DUAL
UNION ALL SELECT 'D,E', '4,5', 'v,w' FROM DUAL;
Query 1:
WITH ids AS (
SELECT t.*, ROWNUM AS id
FROM TEST t
)
SELECT LISTAGG(
REGEXP_SUBSTR( i.Column1, '[^,]+', 1, n.COLUMN_VALUE )
|| '|' || REGEXP_SUBSTR( i.Column2, '[^,]+', 1, n.COLUMN_VALUE )
|| '|' || REGEXP_SUBSTR( i.Column3, '[^,]+', 1, n.COLUMN_VALUE )
, ','
) WITHIN GROUP ( ORDER BY n.COLUMN_VALUE ) AS value
FROM ids i,
TABLE(
CAST(
MULTISET(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= GREATEST(
REGEXP_COUNT( i.COLUMN1, '[^,]+' ),
REGEXP_COUNT( i.COLUMN2, '[^,]+' ),
REGEXP_COUNT( i.COLUMN3, '[^,]+' )
)
)
AS SYS.ODCINUMBERLIST
)
) n
GROUP BY i.ID
Results:
| VALUE |
|-------------------|
| A|1|x,B|2|y,C|3|z |
| D|4|v,E|5|w |
You need to use:
SUBSTR
INSTR
|| concatenation operator
It would be easy if you break your output, and then understand how it works.
SQL> WITH t AS
2 ( SELECT 'A,B,C' Column1, '1,2,3' Column2, 'x,y,z' Column3 FROM dual
3 )
4 SELECT SUBSTR(column1, 1, instr(column1, ',', 1) -1)
5 ||'|'
6 || SUBSTR(column2, 1, instr(column2, ',', 1) -1)
7 ||'|'
8 || SUBSTR(column3, 1, instr(column1, ',', 1) -1)
9 ||','
10 || SUBSTR(column1, instr(column1, ',', 1, 2) +1 - instr(column1, ',', 1),
11 instr(column1, ',', 1) -1)
12 ||'|'
13 || SUBSTR(column2, instr(column2, ',', 1, 2) +1 - instr(column2, ',', 1),
14 instr(column2, ',', 1) -1)
15 ||'|'
16 || SUBSTR(column3, instr(column3, ',', 1, 2) +1 - instr(column3, ',', 1),
17 instr(column3, ',', 1) -1)
18 ||','
19 || SUBSTR(column1, instr(column1, ',', 1, 3) +1 - instr(column1, ',', 1),
20 instr(column1, ',', 2) -1)
21 as "new_column"
22 FROM t;
new_column
-------------
A|1|x,B|2|y,C
On a side note, you should avoid storing delimited values in a single column. Consider normalizing the data.
From Oracle 11g and above, you could create a VIRTUAL COLUMN using the above expression and use it instead of executing the SQL frequently.
Its very simple in oracle. just use the concatenation operatort ||.
In the below solution, I have used underscore as the delimiter
select Column1 ||'_'||Column2||'_'||Column3 from table_name;