Oracle regex to extract string between first pair of < and > brackets - sql

I have been assigned a task to parse a string (which is essentially in XML format), and I need to extract the name of the first tag in the string
eg: string '<column><data-type>string</data-type>.............'
or '<filter><condition>....</condition>...............'
or
'......................'
The string keeps changing, but I am only interested in the first tag. I would like to get output like:
column,
filter,
query
I have tried regexp_substr(string,'^<(.+)>',1,1,null,1) and some similar variations, but they don't seem to work consistently.
Please help.

If you have XML data then use a proper XML parser:
SELECT XMLQUERY( '/*/name()' PASSING XMLTYPE(value) RETURNING CONTENT ) AS tag_name
FROM table_name
Which for the sample data:
CREATE TABLE table_name ( value CLOB );
INSERT INTO table_name ( value )
SELECT '<column><data-type>string</data-type></column>' FROM DUAL UNION ALL
SELECT '<filter><condition>....</condition></filter>' FROM DUAL UNION ALL
SELECT '<query />' FROM DUAL UNION ALL
SELECT '<has_attributes attr1="do not return this" attr2="<or> this" />' FROM DUAL
Outputs:
| TAG_NAME |
| :------------- |
| column |
| filter |
| query |
| has_attributes |

You are looking for any character between the bounds -- and that includes '>'. So, just exclude the terminating character:
select regexp_substr(string,'^<([^>]+)>',1,1,null,1)
from (select '<column><data-type>string</data-type>.............' as string from dual union all
select '<filter><condition>....</condition>...............' from dual
) x;
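To see why the original pattern misbehaves, here is a minimal sketch in Python rather than Oracle. Python's re module is only a stand-in for regexp_substr, and the engines differ in details, but both make .+ run to the last > on the line. The NAME_ONLY pattern is my own variation, not part of the answer above; it also stops at whitespace or / so that attributes and self-closing tags do not leak into the result.

```python
import re

GREEDY = re.compile(r'^<(.+)>')          # pattern from the question
NEGATED = re.compile(r'^<([^>]+)>')      # pattern from the answer above
NAME_ONLY = re.compile(r'^<([^\s/>]+)')  # hypothetical stricter variation


def first_tag(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


plain = '<column><data-type>string</data-type>.............'
with_attrs = '<has_attributes attr1="do not return this" attr2="<or> this" />'

greedy_result = first_tag(GREEDY, plain)       # runs up to the last '>'
negated_result = first_tag(NEGATED, plain)     # stops at the first '>'
attrs_negated = first_tag(NEGATED, with_attrs)
attrs_name_only = first_tag(NAME_ONLY, with_attrs)

print(greedy_result)
print(negated_result)
print(attrs_negated)
print(attrs_name_only)
```

For plain tags the negated class is enough. Once attributes appear, either the stricter class or the XML parser approach is the safer route.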

Related

Equivalent function in HANA DB for json_object

I would like to return query results in JSON format in HANA DB.
There is a json_object function in Oracle that meets this requirement, but I am not seeing an equivalent function in HANA.
Does anyone know if this kind of function exists in HANA?
For example:
Table Author contains non-json data as follows:
---------------------------------------------
| firstName | lastName |
---------------------------------------------
| Paulo | Coelho |
| George | Orwell |
---------------------------------------------
Write a SELECT statement to return the result as JSON.
In Oracle it can be returned using query:
SELECT json_object(
KEY 'firstName' VALUE author.first_name,
KEY 'lastName' VALUE author.last_name
)
FROM author
Output looks like this:
---------------------------------------------
| json_array |
---------------------------------------------
| {"firstName":"Paulo","lastName":"Coelho"} |
| {"firstName":"George","lastName":"Orwell"} |
----------------------------------------------
Does anyone know of a query or function in HANA to achieve the same result?
You can use the already mentioned function in SAP HANA too:
JSON_QUERY (
<JSON_API_common_syntax>
[ <JSON_output_clause> ]
[ <JSON_query_wrapper_behavior> ]
[ <JSON_query_empty_behavior> ON EMPTY ]
[ <JSON_query_error_behavior> ON ERROR ]
)
For 2.0 SP04 and above there's a FOR JSON addition to the SELECT statement. As the documentation says, it is only permitted in subqueries, so you need to either select individual columns in a subselect (if you need a result set of JSON objects) or generate a JSON array as a single scalar result. Column names are inherited from the subquery aliases.
Case 1:
with a as (
select 'AAA' as field1, 'Value 1' as val from dummy union all
select 'BBB' as field1, 'Value 2' as val from dummy
)
select
/*Use correlated subquery with single row*/
json_value((select a.field1, a.val from dummy for json), '$[0]') as res
from a
Or, with more typing but less dependence on the structure:
with a as (
select 'AAA' as field1, 'Value 1' as val from dummy union all
select 'BBB' as field1, 'Value 2' as val from dummy
)
, json_source as (
/*Intermediate query to use as correlation source in JSON_TABLE*/
select (select * from a for json) as tmp_json
from dummy
)
select json_parsed.*
from json_source,
json_table(
json_source.tmp_json
/*Access individual items*/
, '$[*]'
columns (
res nvarchar(1000) format json path '$'
)
) as json_parsed
Both return:
RES
{"FIELD1":"AAA","VAL":"Value 1"}
{"FIELD1":"BBB","VAL":"Value 2"}
Or as a scalar query returning JSON array (Case 2):
with a as (
select 'AAA' as field1, 'Value 1' as val from dummy union all
select 'BBB' as field1, 'Value 2' as val from dummy
)
select *
from (select * from a for json)
JSONRESULT
[{"FIELD1":"AAA","VAL":"Value 1"},{"FIELD1":"BBB","VAL":"Value 2"}]

oracle sql contain

I have a table with values in a column like:
colname
TMC_MCH,OTA_MCH,CONSOL_MCH,RETAIL_MCH,TOUROP_MCH,SPEC_MCH,QRACTO_MCH
RETAIL_MCH
RETAIL_MCH,CONSOL_MCH
CONSOL_MCH
OTA_MCH
I need to run a query to fetch all rows that contain RETAIL_MCH or CONSOL_MCH.
If I run the query below, I get the result below:
select * from table111 where
(CONTAINS(table111.colname, 'RETAIL,CONSOL' , 1) > 0)
TMC_MCH,OTA_MCH,CONSOL_MCH,RETAIL_MCH,TOUROP_MCH,SPEC_MCH,QRACTO_MCH
RETAIL_MCH
RETAIL_MCH,CONSOL_MCH
CONSOL_MCH
But I need an exact search that includes the underscore "_":
select * from table111 where
(CONTAINS(table111.colname, 'RETAIL_MCH,CONSOL_MCH' , 1) > 0)
CONTAINS is an Oracle Text function; you can escape the underscore:
SELECT *
FROM table111
WHERE CONTAINS( colname, 'RETAIL\_MCH,CONSOL\_MCH', 1 ) > 0
Or, if you want to pass the string in unescaped then you could use REPLACE to add the escape characters:
SELECT *
FROM table111
WHERE CONTAINS( colname, REPLACE ( 'RETAIL_MCH,CONSOL_MCH', '_', '\_' ), 1 ) > 0
Which, for the sample data:
CREATE TABLE table111 ( colname ) AS
SELECT 'TMC_MCH,OTA_MCH,CONSOL_MCH,RETAIL_MCH,TOUROP_MCH,SPEC_MCH,QRACTO_MCH' FROM DUAL UNION ALL
SELECT 'RETAIL_MCH' FROM DUAL UNION ALL
SELECT 'RETAIL_MCH,CONSOL_MCH' FROM DUAL UNION ALL
SELECT 'CONSOL_MCH' FROM DUAL UNION ALL
SELECT 'OTA_MCH' FROM DUAL;
CREATE INDEX table111__colname__textidx ON table111(colname) INDEXTYPE IS CTXSYS.CONTEXT;
Outputs:
| COLNAME |
| :------------------------------------------------------------------- |
| TMC_MCH,OTA_MCH,CONSOL_MCH,RETAIL_MCH,TOUROP_MCH,SPEC_MCH,QRACTO_MCH |
| RETAIL_MCH |
| RETAIL_MCH,CONSOL_MCH |
| CONSOL_MCH |
It would probably be simpler if you said which result you want. The way I understood it, maybe this helps:
select *
from table111
where instr(colname, 'RETAIL_MCH') > 0
or instr(colname, 'CONSOL_MCH') > 0;
OR might need to be substituted by AND (depending on what you want).
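One thing to keep in mind with the instr approach is that it matches substrings, not whole comma-separated items. The sketch below runs the same predicate in SQLite through Python's sqlite3 module, purely as a stand-in for Oracle. The extra NONRETAIL_MCH row is hypothetical, and wrapping both sides in commas is my own variation to force whole-item matches, not something taken from the answers here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table111 (colname TEXT)")
conn.executemany(
    "INSERT INTO table111 VALUES (?)",
    [
        ("TMC_MCH,OTA_MCH,CONSOL_MCH,RETAIL_MCH,TOUROP_MCH,SPEC_MCH,QRACTO_MCH",),
        ("RETAIL_MCH",),
        ("RETAIL_MCH,CONSOL_MCH",),
        ("CONSOL_MCH",),
        ("OTA_MCH",),
        ("NONRETAIL_MCH",),  # hypothetical row, not in the question's data
    ],
)

# Same predicate as the answer above: plain substring search.
instr_sql = """
    SELECT colname FROM table111
    WHERE instr(colname, 'RETAIL_MCH') > 0
       OR instr(colname, 'CONSOL_MCH') > 0
"""

# Assumed variation: delimit both sides with commas to match whole items only.
token_sql = """
    SELECT colname FROM table111
    WHERE instr(',' || colname || ',', ',RETAIL_MCH,') > 0
       OR instr(',' || colname || ',', ',CONSOL_MCH,') > 0
"""

instr_hits = {row[0] for row in conn.execute(instr_sql)}
token_hits = {row[0] for row in conn.execute(token_sql)}

print(sorted(instr_hits - token_hits))  # rows matched only as substrings
```

Both instr and the || concatenation operator exist in Oracle as well, so the two predicates should carry over, but that has not been verified here.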
If you want either value, you would use:
where CONTAINS(table111.colname, 'RETAIL|CONSOL' , 1) > 0
If you want both:
where CONTAINS(table111.colname, 'RETAIL&CONSOL' , 1) > 0
You should pass the value in with the operator you want, instead of ,.

Set plsql parameters partially

I need to create a query that, when I enter only the part of a value that comes before the "/", returns all rows whose value has that parameter as the part before the "/".
I enter as a parameter: pwd
The result of query should be:
pwd/1
pwd/2
pwd/3....
If you have a bind variable :parameter then:
SELECT *
FROM table_name
WHERE value LIKE :parameter || '/%'
So for some test data:
CREATE TABLE table_name ( id, value ) AS
SELECT 1, 'pwd/1' FROM DUAL UNION ALL
SELECT 2, 'pwd/2' FROM DUAL UNION ALL
SELECT 3, 'pwdtest/1' FROM DUAL;
If :parameter is pwd then the output is:
ID | VALUE
-: | :----
1 | pwd/1
2 | pwd/2
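As a quick check of the LIKE :parameter || '/%' predicate, here is a sketch that runs it in SQLite through Python's sqlite3 module, as a stand-in for Oracle. SQLite's LIKE is case-insensitive for ASCII by default, unlike Oracle's, which does not matter for this data. The rows_for helper is a hypothetical name used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(1, "pwd/1"), (2, "pwd/2"), (3, "pwdtest/1")],
)


def rows_for(parameter):
    """Return rows whose value starts with the parameter followed by '/'."""
    cursor = conn.execute(
        "SELECT id, value FROM table_name WHERE value LIKE ? || '/%' ORDER BY id",
        (parameter,),
    )
    return cursor.fetchall()


exact_prefix = rows_for("pwd")
# A '%' or '_' inside the parameter still acts as a LIKE wildcard.
wildcard_in_param = rows_for("pwd%")

print(exact_prefix)
print(wildcard_in_param)
```

If the parameter can contain % or _ and those should be taken literally, they would need escaping together with an ESCAPE clause on the LIKE.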

regexp_substr with LIKE as search condition

Thank you mathguy for your suggestion and assistance. The example you provided is a near perfect description of the issue. That being said I've used and edited your text to help describe this issue:
I receive a string that contains comma delimited digits in the form of 18656, 16380, 16424 (call this param1). The string only contains commas and digits.
In mytable I have a column named t with values such as 18656.01.02, 10.02.02, 16380.02.03, 16424.05.66, 16424.55.23.14.
I want to select all rows where the first numeric component in column t is like one of the comma-separated numbers in param1 (18656, 16380, 16424). Is there a way to use regexp_substr in this case?
Where param1 = 18656, 16380, 16424
the following works:
select * from mytable where t.mycolumn IN
(
(SELECT regexp_substr(:param1,'[^,]+', 1, level) as NUMLIST
FROM DUAL
CONNECT BY regexp_substr(:param1, '[^,]+', 1, level) IS NOT NULL)
);
How can I use a wildcard if the data I seek in t.mycolumn is 18656.00.01, 16380.09.34, 16424.023.8?
Can LIKE be used as search criteria? If possible please provide example.
Obviously, the following will not work but I am hoping to find a solution.
select * from mytable where t.mycolumn LIKE
(
(SELECT regexp_substr(:param1||'%','[^,]+', 1, level) as NUMLIST
FROM DUAL
CONNECT BY regexp_substr(:param1||'%', '[^,]+', 1, level) IS NOT NULL)
);
Assumptions:
There is a table named mytable with a column named t which
contains values as follows:
SELECT * FROM mytable;
T |
---------------|
18656.01.02 |
10.02.02 |
16380.02.03 |
16424.05.66 |
16424.55.23.14 |
There is a string received as a parameter that contains comma-delimited digits in the form of 18656, 16380, 16424. The string only contains commas and digits. This string is parsed into individual rows with the help of a query that looks similar to the following one:
SELECT regexp_substr(param1,'[^,]+', 1, level) as NUMLIST
FROM (
select '18656,16380,16424' as param1 FROM DUAL
)
CONNECT BY regexp_substr(param1, '[^,]+', 1, level) IS NOT NULL
;
NUMLIST |
--------|
18656 |
16380 |
16424 |
Requirement
Can LIKE be used as search criteria? If possible please provide
example.
The LIKE keyword is used below as the condition in the JOIN ... ON clause:
SELECT * FROM mytable
WHERE t IN (
SELECT t
FROM mytable m
JOIN (
SELECT regexp_substr(param1,'[^,]+', 1, level) as NUMLIST
FROM (
select '18656,16380,16424' as param1 FROM DUAL
)
CONNECT BY regexp_substr(param1, '[^,]+', 1, level) IS NOT NULL
) x
ON m.t LIKE '%' || x.NUMLIST || '%'
)
T |
---------------|
18656.01.02 |
16380.02.03 |
16424.05.66 |
16424.55.23.14 |
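Note that '%' || NUMLIST || '%' matches the number anywhere in t, not only in the first component. The sketch below compares that pattern with a prefix-only pattern, using SQLite through Python's sqlite3 module as a stand-in for Oracle. The parameter is split in Python because SQLite has no CONNECT BY. The extra row 10.18656.02, the numlist helper table, and the NUMLIST || '.%' pattern are all assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (t TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [
        ("18656.01.02",),
        ("10.02.02",),
        ("16380.02.03",),
        ("16424.05.66",),
        ("16424.55.23.14",),
        ("10.18656.02",),  # hypothetical row: 18656 is not the first component
    ],
)

# Stand-in for the regexp_substr / CONNECT BY split of param1.
param1 = "18656, 16380, 16424"
conn.execute("CREATE TABLE numlist (n TEXT)")
conn.executemany(
    "INSERT INTO numlist VALUES (?)",
    [(part.strip(),) for part in param1.split(",")],
)


def matching(like_expr):
    """Join mytable to numlist using the given LIKE pattern expression."""
    sql = (
        "SELECT DISTINCT m.t FROM mytable m "
        f"JOIN numlist x ON m.t LIKE {like_expr} ORDER BY m.t"
    )
    return [row[0] for row in conn.execute(sql)]


anywhere = matching("'%' || x.n || '%'")  # pattern used in the answer above
prefix_only = matching("x.n || '.%'")     # assumed variation: first component only

print(anywhere)
print(prefix_only)
```

On the sample data from the question both patterns return the same four rows. They only diverge when a listed number appears in a later component.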

Split by uppercase Oracle

I am looking for a regex or something similar that gets me from this:
------------------------
| id | prop_name |
------------------------
| 1 | isThisAnExample |
------------------------
To this :
-----------------------------
| id | prop_name |
-----------------------------
| 1 | Is This An Example |
-----------------------------
Of course it would be cool if the first character were uppercase and the other words started with lowercase. But only splitting them would also be okay.
Maybe this is the regexp you are looking for
"Insert a blank between each lower case character followed by an upper case character":
select regexp_replace('IsThisAnExample', '([[:lower:]])([[:upper:]])', '\1 \2') from dual
The first character can simply be replaced by an upper case letter with
select upper(substr('isThisAn Example', 1,1))||substr('isThisAn Example', 2) from dual;
So, first replace the first character and regexp_replace for the result:
select regexp_replace(upper(substr('isThisAn Example', 1,1))||substr('isThisAn Example', 2), '([[:lower:]])([[:upper:]])', '\1 \2') from dual;
If only the first character of your sentence should be an upper case letter, then try:
select upper(substr(regexp_replace('IsThisAnExample', '([[:lower:]])([[:upper:]])', '\1 \2'),1,1))||
lower(substr(regexp_replace('IsThisAnExample', '([[:lower:]])([[:upper:]])', '\1 \2'),2))
from dual
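The same three steps can be tried outside the database. Here is a small sketch in Python, where [a-z] and [A-Z] stand in for Oracle's [[:lower:]] and [[:upper:]]. That is an ASCII-only assumption, and the function names are hypothetical.

```python
import re


def split_camel(text):
    """Insert a space between a lowercase letter and the uppercase letter after it."""
    return re.sub(r'([a-z])([A-Z])', r'\1 \2', text)


def capitalize_first(text):
    """Uppercase only the first character and leave the rest untouched."""
    return text[:1].upper() + text[1:]


def sentence_case(text):
    """Split first, then uppercase the first character and lowercase the rest."""
    spaced = split_camel(text)
    return spaced[:1].upper() + spaced[1:].lower()


split_only = split_camel('isThisAnExample')
title_like = split_camel(capitalize_first('isThisAnExample'))
sentence_like = sentence_case('isThisAnExample')

print(split_only)
print(title_like)
print(sentence_like)
```

Because the pattern only fires on a lowercase letter followed by an uppercase one, runs of capitals such as 'parseXMLFile' are split only before the first capital of the run.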
Better use regex, but anyway:
SELECT listagg(splitted, '') within GROUP (ORDER BY lvl) FROM(
SELECT LEVEL lvl, CASE WHEN SUBSTR(your_string, LEVEL, 1) =
UPPER(SUBSTR(your_string, LEVEL, 1))
THEN ' ' || SUBSTR(your_string, LEVEL, 1) ELSE
SUBSTR(your_string, LEVEL, 1) END splitted
FROM (SELECT 'isThisAnExample' your_string FROM dual)
CONNECT BY LEVEL <= LENGTH(your_string) );
Similar to Frank's solution, but simpler (reducing the use of regular expressions as much as possible):
with
input ( str ) as (
select 'isThisAnExample' from dual
)
select upper(substr(str, 1, 1)) ||
lower(regexp_replace(substr(str, 2), '(^|[[:lower:]])([[:upper:]])', '\1 \2'))
as modified_str
from input;
MODIFIED_STR
------------------
Is this an example
1 row selected.