Find space after nth characters and split into new row - sql

I have a large string stored in a table as a single line. I need a SELECT query that splits the string into rows after every 100 characters without splitting in the middle of a word. Basically, the query should find the first space after 100 characters and start a new line there.
I have used this query. It splits after every 100 characters, but it breaks in the middle of words.
SELECT REGEXP_REPLACE ( col_large_string , '(.{100})' , '\1' || CHR (10) ) AS split_to_rows
FROM tab_large_string where string_id = 1;

You do not need (slow) regular expressions and can do it with simple (quicker) string functions.
If you want to replace spaces with newlines then:
WITH bounds ( str, end_pos ) AS (
SELECT col_large_string,
INSTR(col_large_string, ' ', 101)
FROM tab_large_string
UNION ALL
SELECT SUBSTR(str, 1, end_pos - 1)
|| CHR(10)
|| SUBSTR(str, end_pos + 1),
INSTR(str, ' ', end_pos + 101)
FROM bounds
WHERE end_pos > 0
)
SELECT str AS split_to_lines
FROM bounds
WHERE end_pos = 0;
and if you want to have each line in a new row then:
WITH bounds ( str, start_pos, end_pos ) AS (
SELECT col_large_string,
1,
INSTR(col_large_string, ' ', 101)
FROM tab_large_string
UNION ALL
SELECT str,
end_pos + 1,
INSTR(str, ' ', end_pos + 101)
FROM bounds
WHERE end_pos > 0
)
SELECT CASE end_pos
WHEN 0
THEN SUBSTR(str, start_pos)
ELSE SUBSTR(str, start_pos, end_pos - start_pos)
END AS split_to_rows
FROM bounds;
If you do want to use regular expressions then:
SELECT REGEXP_REPLACE(
col_large_string,
'(.{100,}?) ',
'\1' || CHR (10)
) AS split_to_lines
FROM tab_large_string
WHERE string_id = 1;
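To see how the lazy quantifier behaves, here is a minimal sketch on a short literal, with the width reduced from 100 to 10 so the effect is easy to see. The sample sentence and the width of 10 are assumptions chosen for illustration only.

```sql
-- Illustration only: width 10 instead of 100, and a made-up sample string.
-- '(.{10,}?) ' lazily matches at least 10 characters and then stops at the
-- first space it can reach, so the break never lands inside a word.
SELECT REGEXP_REPLACE(
         'the quick brown fox jumps over the lazy dog',
         '(.{10,}?) ',
         '\1' || CHR(10)
       ) AS split_to_lines
FROM   DUAL;
```

The tail of the string is left untouched when no further space follows the minimum width, so the last line can be shorter than the others.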

You can use this regular expression, which breaks after every 100 words (not after 100 characters):
SELECT REGEXP_REPLACE ( col_large_string , '((\w+\s+){100})' , '\1' || CHR (10) ) AS split_to_rows
FROM tab_large_string where string_id = 1;
\w+ matches one or more word characters.
\s+ matches one or more whitespace characters.
(\w+\s+) matches a word followed by whitespace.
(\w+\s+){100} then matches (a word followed by whitespace) 100 times.

Related

How to Select a substring in Oracle SQL from and up to some specific characters?

I am using Oracle SQL. I would like to extract a substring that starts with the characters XY0 and includes the 2 or 3 characters that follow, up to the '-' sign in the string.
These characters may be anywhere in the string.
Original
column_value
1st Row - Error due to XY0066- Does not fit -Not suitable
2nd Row -Error due to specific XY0089- Will not match
3rd Row -Not in good cond XY0215- Special type error
Extraction should be
result
XY0066
XY0089
XY0215
How can I do this?
You can use:
SELECT id,
SUBSTR(value, start_pos, end_pos - start_pos) AS code
FROM (
SELECT id,
value,
INSTR(value, 'XY') AS start_pos,
INSTR(value, '-', INSTR(value, 'XY') + 2) AS end_pos
FROM table_name
);
or
SELECT id,
SUBSTR(
value,
INSTR(value, 'XY'),
INSTR(value, '-', INSTR(value, 'XY') + 2) - INSTR(value, 'XY')
) AS code
FROM table_name;
or using regular expressions, which is shorter to type but will run much slower:
SELECT id,
REGEXP_SUBSTR(value, 'XY[^-]*') AS code
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name (id, value) AS
SELECT 1, 'Error due to XY0066- Does not fit -Not suitable' FROM DUAL UNION ALL
SELECT 2, 'Error due to specific XY0089- Will not match' FROM DUAL UNION ALL
SELECT 3, 'Not in good cond XY0215- Special type error' FROM DUAL;
All output:
ID CODE
-- ------
 1 XY0066
 2 XY0089
 3 XY0215

Extract the second word from a string in ODI Expression

These two syntaxes get the second word from a string in Oracle:
SELECT REGEXP_SUBSTR('Hello this is an example', '\s+(\w+)\s') AS syntax1,
SUBSTR('Hello this is an example',
INSTR('Hello this is an example', ' ', 1, 1) + 1,
INSTR('Hello this is an example', ' ', 1, 2)
- INSTR('Hello this is an example', ' ', 1)
) AS syntax2
FROM dual;
Result:
syntax1 syntax2
------- -------
this this
I'm working in ODI (Oracle Data Integrator), and these two syntaxes don't work there:
For ODI, the regexp is not valid and the INSTR function accepts only 2 parameters.
Can you suggest a solution that works in ODI?
Thank you.
I think it should support ' [[:alpha:]]+ '
Then, you can apply SUBSTR() function twice :
WITH t2(str) AS
(
SELECT SUBSTR( TRIM( str ), INSTR( TRIM( str ), ' ') + 1, LENGTH( TRIM(str) ) )
FROM t --> original table
)
SELECT SUBSTR( str, 1, INSTR(str, ' ') - 1 ) AS extracted_string
FROM t2
extracted_string
----------------
this
If the installed ODI version is 12 or later, you can also use REGEXP_REPLACE() as below:
SELECT REGEXP_REPLACE(str, '(\w+)\s(\w+)( .*)', '\2' ) AS extracted_string
FROM t
I finally used this expression:
SELECT
SUBSTR (
SUBSTR ('one two three four',
INSTR ('one two three four', ' ') + 1,
999999),
0,
INSTR (
SUBSTR ('one two three four',
INSTR ('one two three four', ' ') + 1,
999999),
' ')
- 1)
FROM DUAL

change the order of a string in pl/sql

I want to swap the two parts of a string, for example:
name/surname becomes surname/name, or name/ surname becomes surname/ name (taking into account the spaces after the /).
My query is below, but the second part is wrong:
select SUBSTR ('name/surname' , INSTR ('name/surname','/')+1) ||SUBSTR ('name/surname' , 1,INSTR('name/surname','/')-1)
from dual
Using a regular expression:
SELECT REGEXP_REPLACE( 'name/surname', '^(.*?)/(.*)$', '\2/\1' ) FROM DUAL;
or trimming all whitespace:
SELECT REGEXP_REPLACE( ' name / surname ', '^\s*(.*?)\s*/\s*(.*?)\s*$', '\2/\1' ) FROM DUAL;
or preserving one whitespace character before and after the slash (and trimming leading/trailing whitespace):
SELECT REGEXP_REPLACE( ' name / surname ', '^\s*(.*?)(\s?/\s?)(.*?)\s*$', '\3\2\1' ) FROM DUAL;
Using string functions:
WITH names ( text ) AS (
SELECT 'name/surname' FROM DUAL
)
SELECT SUBSTR( text, INSTR( text, '/' ) + 1 ) || '/' || SUBSTR( text, 1, INSTR( text, '/' ) - 1 )
FROM names;
or trimming whitespace:
WITH names ( text ) AS (
SELECT ' name / surname ' FROM DUAL
)
SELECT TRIM( SUBSTR( text, INSTR( text, '/' ) + 1 ) ) || '/' || TRIM( SUBSTR( text, 1, INSTR( text, '/' ) - 1 ) )
FROM names;
select substr('firstname/lastname',instr('firstname/lastname','/')+1)||'/'
|| substr('firstname/lastname',1,instr('firstname/lastname','/')-1) from dual;

Escaping special characters for JSON output

I have a column whose data I want to escape so that it can be used as JSON output. To be more precise, I am trying to escape the same characters listed here, but using Oracle 11g: Special Characters and JSON Escaping Rules
I think it can be solved using REGEXP_REPLACE:
SELECT REGEXP_REPLACE(my_column, '("|\\|/)|(' || CHR(9) || ')', '\\\1') FROM my_table;
But I am lost about replacing the other characters (tab, new line, backspace, etc.). In the previous example I know that \1 will match and replace the first group, but I am not sure how to capture the tab in the second group and then replace it with \t. Could somebody give me a hint about how to do the replacement?
I know I can do this:
SELECT REGEXP_REPLACE( REGEXP_REPLACE(my_column, '("|\\|/)', '\\\1'), '(' || CHR(9) || ')', '\t')
FROM my_table;
But I would have to nest about 5 calls to REGEXP_REPLACE, and I suspect I should be able to do it in just one or two calls.
I am aware of other packages and libraries for JSON, but I think this case is simple enough that it can be solved with the functions that Oracle offers out of the box.
Thank you.
Here's a start. Replacing all the regular characters is easy enough; it's the control characters that will be tricky. This method uses a group consisting of a character class that contains the characters you want to put a backslash in front of. Note that characters inside the class do not need to be escaped. In the call to REGEXP_REPLACE, the fourth argument (1) means start at the first position and the fifth argument (0) means replace all occurrences found in the source string.
SELECT REGEXP_REPLACE('t/h"is"'||chr(9)||'is a|te\st', '([/\|"])', '\\\1', 1, 0) FROM dual;
Replacing the TAB and the line feed is easy enough by wrapping the above in REPLACE calls, but it stinks to have to do this for each control character. Thus, I'm afraid my answer isn't really a full answer for you; it only helps you with the regular characters a bit:
SQL> SELECT REPLACE(REPLACE(REGEXP_REPLACE('t/h"is"'||chr(9)||'is
a|te\st', '([/\|"])', '\\\1', 1, 0), chr(9), '\t'), chr(10), '\n') fixed
FROM dual;
FIXED
-------------------------
t\/h\"is\"\tis\na\|te\\st
SQL>
EDIT: Here's a solution! I don't claim to understand it fully, but basically it creates a translation table that joins to your string (in the inp_str table). The connect by, level traverses the length of the string and replaces characters where there is a match in the translation table. I modified a solution found here: http://database.developer-works.com/article/14901746/Replace+%28translate%29+one+char+to+many that really doesn't have a great explanation. Hopefully someone here will chime in and explain this fully.
SQL> with trans_tbl(ch_frm, str_to) as (
select '"', '\"' from dual union
select '/', '\/' from dual union
select '\', '\\' from dual union
select chr(8), '\b' from dual union -- BS
select chr(12), '\f' from dual union -- FF
select chr(10), '\n' from dual union -- NL
select chr(13), '\r' from dual union -- CR
select chr(9), '\t' from dual -- HT
),
inp_str as (
select 'No' || chr(12) || 'w is ' || chr(9) || 'the "time" for /all go\od men to '||
chr(8)||'com' || chr(10) || 'e to the aid of their ' || chr(13) || 'country' txt from dual
)
select max(replace(sys_connect_by_path(ch,'`'),'`')) as txt
from (
select lvl
,decode(str_to,null,substr(txt, lvl, 1),str_to) as ch
from inp_str cross join (select level lvl from inp_str connect by level <= length(txt))
left outer join trans_tbl on (ch_frm = substr(txt, lvl, 1))
)
connect by lvl = prior lvl+1
start with lvl = 1;
TXT
------------------------------------------------------------------------------------------
No\fw is \tthe \"time\" for \/all go\\od men to \bcom\ne to the aid of their \rcountry
SQL>
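As a partial explanation of the query above, the inner step can be run on its own. This is a minimal sketch with a shortened translation table and a made-up five-character input (both are assumptions for illustration): it generates one row per character position and swaps in the escaped form wherever the translation table has a match.

```sql
-- Illustration only: a two-entry translation table and a tiny sample string.
WITH trans_tbl (ch_frm, str_to) AS (
  SELECT '"',    '\"' FROM dual UNION ALL
  SELECT CHR(9), '\t' FROM dual
),
inp_str (txt) AS (
  SELECT 'a"b' || CHR(9) || 'c' FROM dual
)
SELECT lvl,
       -- keep the original character unless the translation table has a match
       DECODE(str_to, NULL, SUBSTR(txt, lvl, 1), str_to) AS ch
FROM   inp_str
       -- one row per character position: 1 .. LENGTH(txt)
       CROSS JOIN (SELECT LEVEL AS lvl FROM inp_str CONNECT BY LEVEL <= LENGTH(txt))
       LEFT OUTER JOIN trans_tbl ON (ch_frm = SUBSTR(txt, lvl, 1))
ORDER  BY lvl;
```

The outer part of the original query then chains these rows together in lvl order with SYS_CONNECT_BY_PATH, strips the backtick used as the path separator, and takes the MAX, which is the longest path and therefore the whole converted string.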
EDIT 8/10/2016 - Make it a function for encapsulation and reusability so you could use it for multiple columns at once:
create or replace function esc_json(string_in varchar2)
return varchar2
is
s_converted varchar2(4000);
BEGIN
with trans_tbl(ch_frm, str_to) as (
select '"', '\"' from dual union
select '/', '\/' from dual union
select '\', '\\' from dual union
select chr(8), '\b' from dual union -- BS
select chr(12), '\f' from dual union -- FF
select chr(10), '\n' from dual union -- NL
select chr(13), '\r' from dual union -- CR
select chr(9), '\t' from dual -- HT
),
inp_str(txt) as (
select string_in from dual
)
select max(replace(sys_connect_by_path(ch,'`'),'`')) as c_text
into s_converted
from (
select lvl
,decode(str_to,null,substr(txt, lvl, 1),str_to) as ch
from inp_str cross join (select level lvl from inp_str connect by level <= length(txt))
left outer join trans_tbl on (ch_frm = substr(txt, lvl, 1))
)
connect by lvl = prior lvl+1
start with lvl = 1;
return s_converted;
end esc_json;
Example to call for multiple columns at once:
select esc_json(column_1), esc_json(column_2)
from your_table;
Inspired by the answer above, I created this simpler "one-liner" function:
create or replace function json_esc (
str IN varchar2
) return varchar2
is
begin
-- Escapes only the five control characters below; double quotes and backslashes are left as-is.
return REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(str, chr(8), '\b'), chr(9), '\t'), chr(10), '\n'), chr(12), '\f'), chr(13), '\r');
end;
Please note that neither this nor @Gary_W's answer above escapes all the control characters that json.org seems to indicate should be escaped.
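If the backslash and the double quote also need escaping, the same nested-REPLACE idea can be extended. The following is only a sketch: the function name json_esc_full is hypothetical, and it still leaves the remaining control characters untouched. The backslash is handled first so that the backslashes added by the later replacements are not doubled.

```sql
-- Sketch only: json_esc_full is a hypothetical name, not part of any library.
CREATE OR REPLACE FUNCTION json_esc_full (
  str IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
  -- Innermost call escapes the backslash itself; every later replacement
  -- introduces new backslashes that must not be escaped again.
  RETURN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
           str, '\', '\\'), '"', '\"'),
           CHR(8), '\b'), CHR(9), '\t'), CHR(10), '\n'),
           CHR(12), '\f'), CHR(13), '\r');
END json_esc_full;
/
```

It could be called the same way as the functions above, for example select json_esc_full(column_1) from your_table.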
In SQL Server you can use the STRING_ESCAPE() function as below:
SELECT
STRING_ESCAPE('['' This is a special / "message" /'']', 'json') AS
escapedJson;

SQL Concatenate strings across multiple columns with corresponding values

I'm looking for a way to achieve this in a SELECT statement.
FROM
Column1 Column2 Column3
A,B,C 1,2,3 x,y,z
TO
Result
A|1|x,B|2|y,C|3|z
The delimiters don't matter. I'm just trying to get all the data into one single column. Ideally I am looking to do this in DB2, but I'd like to know if there's an easier way to get this done in Oracle.
Thanks
You can do it like this using INSTR and SUBSTR:
select
substr(column1,1,instr(column1,',',1)-1) || '|' ||
substr(column2,1,instr(column2,',',1)-1) || '|' ||
substr(column3,1,instr(column3,',',1)-1) ||
',' ||
substr(column1 ,instr(column1 ,',',1,1)+1,instr(column1 ,',',1,2) - instr(column1 ,',',1)-1) || '|' ||
substr(column2 ,instr(column2 ,',',1,1)+1,instr(column2 ,',',1,2) - instr(column2 ,',',1)-1) || '|' ||
substr(column3 ,instr(column3 ,',',1,1)+1,instr(column3 ,',',1,2) - instr(column3 ,',',1)-1) ||
',' ||
substr(column1 ,instr(column1 ,',',1,2)+1) || '|' ||
substr(column2 ,instr(column2 ,',',1,2)+1) || '|' ||
substr(column3 ,instr(column3 ,',',1,2)+1)
from yourtable
I tried the following.
First I created a table called t_ask_test and inserted the data from the question above, then built the result using string functions.
Sample table:
create table t_ask_test(column1 varchar(10), column2 varchar(10),column3 varchar(10));
inserted a row
insert into T_ASK_TEST values ('A,B,C','1,2,3','x,y,z');
The following query produces the result:
select substr(column1,1,instr(column1,',',1,1)-1)||'|'||substr(column2,1,instr(column2,',',1,1)-1)||'|'||substr(column3,1,instr(column3,',',1,1)-1) ||','||
substr(column1,instr(column1,',',1,1)+1,instr(column1,',',1,2)-instr(column1,',',1,1)-1)||'|'||substr(column2,instr(column2,',',1,1)+1,instr(column2,',',1,2)-instr(column2,',',1,1)-1)||'|'||substr(column3,instr(column3,',',1,1)+1,instr(column3,',',1,2)-instr(column3,',',1,1)-1) ||','||
substr(column1,instr(column1,',',1,2)+1,length(column1)-instr(column1,',',1,2))||'|'||substr(column2,instr(column2,',',1,2)+1,length(column2)-instr(column2,',',1,2))||'|'||substr(column3,instr(column3,',',1,2)+1,length(column3)-instr(column3,',',1,2)) as test from t_ask_test;
output will be as follows
TEST
---------------
A|1|x,B|2|y,C|3|z
If you have a dynamic number of entries for each row then:
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( Column1, Column2, Column3 ) AS
SELECT 'A,B,C', '1,2,3', 'x,y,z' FROM DUAL
UNION ALL SELECT 'D,E', '4,5', 'v,w' FROM DUAL;
Query 1:
WITH ids AS (
SELECT t.*, ROWNUM AS id
FROM TEST t
)
SELECT LISTAGG(
REGEXP_SUBSTR( i.Column1, '[^,]+', 1, n.COLUMN_VALUE )
|| '|' || REGEXP_SUBSTR( i.Column2, '[^,]+', 1, n.COLUMN_VALUE )
|| '|' || REGEXP_SUBSTR( i.Column3, '[^,]+', 1, n.COLUMN_VALUE )
, ','
) WITHIN GROUP ( ORDER BY n.COLUMN_VALUE ) AS value
FROM ids i,
TABLE(
CAST(
MULTISET(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= GREATEST(
REGEXP_COUNT( i.COLUMN1, '[^,]+' ),
REGEXP_COUNT( i.COLUMN2, '[^,]+' ),
REGEXP_COUNT( i.COLUMN3, '[^,]+' )
)
)
AS SYS.ODCINUMBERLIST
)
) n
GROUP BY i.ID
Results:
| VALUE |
|-------------------|
| A|1|x,B|2|y,C|3|z |
| D|4|v,E|5|w |
You need to use:
SUBSTR
INSTR
|| concatenation operator
It becomes easy once you break the output into its pieces and understand how each one is built.
SQL> WITH t AS
( SELECT 'A,B,C' Column1, '1,2,3' Column2, 'x,y,z' Column3 FROM dual
)
SELECT SUBSTR(column1, 1, INSTR(column1, ',', 1, 1) - 1)
||'|'
|| SUBSTR(column2, 1, INSTR(column2, ',', 1, 1) - 1)
||'|'
|| SUBSTR(column3, 1, INSTR(column3, ',', 1, 1) - 1)
||','
|| SUBSTR(column1, INSTR(column1, ',', 1, 1) + 1,
INSTR(column1, ',', 1, 2) - INSTR(column1, ',', 1, 1) - 1)
||'|'
|| SUBSTR(column2, INSTR(column2, ',', 1, 1) + 1,
INSTR(column2, ',', 1, 2) - INSTR(column2, ',', 1, 1) - 1)
||'|'
|| SUBSTR(column3, INSTR(column3, ',', 1, 1) + 1,
INSTR(column3, ',', 1, 2) - INSTR(column3, ',', 1, 1) - 1)
||','
|| SUBSTR(column1, INSTR(column1, ',', 1, 2) + 1)
||'|'
|| SUBSTR(column2, INSTR(column2, ',', 1, 2) + 1)
||'|'
|| SUBSTR(column3, INSTR(column3, ',', 1, 2) + 1)
AS "new_column"
FROM t;
new_column
-----------------
A|1|x,B|2|y,C|3|z
On a side note, you should avoid storing delimited values in a single column. Consider normalizing the data.
From Oracle 11g and above, you could create a VIRTUAL COLUMN using the above expression and use it instead of executing the SQL frequently.
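A virtual column along those lines might look like the sketch below. The table name your_table and the column name combined are hypothetical, and the expression assumes exactly three comma-separated entries in each of column1, column2 and column3.

```sql
-- Sketch only: your_table and combined are hypothetical names.
-- Assumes each source column holds exactly three comma-separated entries.
ALTER TABLE your_table ADD (
  combined GENERATED ALWAYS AS (
    -- first entry of each column
    SUBSTR(column1, 1, INSTR(column1, ',', 1, 1) - 1) || '|' ||
    SUBSTR(column2, 1, INSTR(column2, ',', 1, 1) - 1) || '|' ||
    SUBSTR(column3, 1, INSTR(column3, ',', 1, 1) - 1) || ',' ||
    -- second entry: between the first and second comma
    SUBSTR(column1, INSTR(column1, ',', 1, 1) + 1,
           INSTR(column1, ',', 1, 2) - INSTR(column1, ',', 1, 1) - 1) || '|' ||
    SUBSTR(column2, INSTR(column2, ',', 1, 1) + 1,
           INSTR(column2, ',', 1, 2) - INSTR(column2, ',', 1, 1) - 1) || '|' ||
    SUBSTR(column3, INSTR(column3, ',', 1, 1) + 1,
           INSTR(column3, ',', 1, 2) - INSTR(column3, ',', 1, 1) - 1) || ',' ||
    -- third entry: everything after the second comma
    SUBSTR(column1, INSTR(column1, ',', 1, 2) + 1) || '|' ||
    SUBSTR(column2, INSTR(column2, ',', 1, 2) + 1) || '|' ||
    SUBSTR(column3, INSTR(column3, ',', 1, 2) + 1)
  ) VIRTUAL
);

-- The combined value can then be selected like any other column.
SELECT combined FROM your_table;
```

The value is computed when the column is queried rather than stored, so it stays in step with the three source columns.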
It's very simple in Oracle: just use the concatenation operator ||.
In the solution below, I have used an underscore as the delimiter:
select Column1 ||'_'||Column2||'_'||Column3 from table_name;