Convert Column XML data to Rows - sql

I am using PostgreSQL 8.3.12.
Table name: xmltest
Columns: id, data
The data column contains XML that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<ItemData>
<ReviewComments><?xml version="1.0" encoding="utf-8"?>
</ReviewComments>
<SmartTextData RootProtect="True">
<STI_Summary IID="d10c5cbf-f5cf-4478-9f33-4580c1930413" IR="True">
<ObjectValue />
<CP />
<SIProps ST="4" PL="False" PLS="" />
<STI_SummaryActiveProblemsField LIID="cdbd7044-ccde-11db-8cba-df0a56d89593" IID="37742a5f-7998-4715-8d43-0d7a19284d44" IR="True" RW="1">
<HD Title="Active Problems" />
<ObjectValue />
<CP>
<PosReplace />
<NegReplace />
</CP>
<SIProps ST="4" PL="False" PLS="" />
<STI_US>
<ObjectValue>
<TextValue>
<![CDATA[
]]>
</TextValue>
</ObjectValue>
<CP />
<SIProps ST="1" SS=" " PL="False" PLS="" />
</STI_US>
<STI_DxItem LIID="71194038-8ffb-488b-8af5-5f1f1a679115" IID="aaf2de4e-2f1f-409b-87b7-b7265bec37db" RW="1">
<HD Title="Coronary artery disease " />
<ObjectValue>
<Code>
<CodingSystem>ICD-9 CM</CodingSystem>
<Value>414.01</Value>
</Code>
<Code>
<CodingSystem>SWICPC</CodingSystem>
<Value>08.0.K76.CIR</Value>
</Code>
</ObjectValue>
</STI_DxItem >
</STI_Summary>
</SmartTextData>
</ItemData>
I want to split the IID attributes and the Code data into rows against the respective ID column.
Expected output:
ID LIID Code_Value CodingSystem
1 d10c5cbf-f5cf-4478-9f33-4580c1930413 NULL
1 37742a5f-7998-4715-8d43-0d7a19284d44 NULL
1 aaf2de4e-2f1f-409b-87b7-b7265bec37db 414.01 IC CM
1 aaf2de4e-2f1f-409b-87b7-b7265bec37db 08.0.K76.CIR SWICPC
Note: I am using PostgreSQL 8.3.12, where some of the newer XMLPATH syntax does not work.
In short, I want to convert the XML data into a row/column structure.
Thanks for reading.

Looking at SO questions, I've found this:
ERROR: function unnest(integer[]) does not exist in postgresql
According to the answer to that question, you can implement unnest yourself for one-dimensional arrays like this:
CREATE OR REPLACE FUNCTION unnest2(anyarray)
RETURNS SETOF anyelement AS
$BODY$
SELECT $1[i] FROM generate_series(array_lower($1,1), array_upper($1,1)) i;
$BODY$ LANGUAGE sql IMMUTABLE;
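As a quick sanity check of the helper above, here is a minimal sketch. It assumes unnest2 has already been created as shown; the array literal and the alias elem are made up for illustration. It should expand a one-dimensional array into one row per element:

```sql
-- Assumes the unnest2 function defined above already exists.
-- generate_series walks the array bounds, so each element becomes its own row.
SELECT unnest2(ARRAY[10, 20, 30]) AS elem;
```

Because the function is declared with anyarray/anyelement, the same call should also accept the xml[] arrays returned by xpath().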
This function should let you query your data with the following statement:
SELECT xpath('/ItemData/SmartTextData/STI_Summary/STI_DxItem/@IID', data, array[array['aaa','example.com']]) as IID,
unnest2(xpath('/ItemData/SmartTextData/STI_Summary/ObjectValue/Code/CodingSystem/text()', data, array[array['aaa','example.com']])) as Code,
unnest2(xpath('/ItemData/SmartTextData/STI_Summary/ObjectValue/Code/Value/text()', data, array[array['aaa','example.com']])) as Value
from xmltest2;
Note that I've used array[array['aaa','example.com']] as you pointed out in your comments, because your XML data has no schema.
I've tested it in rextester and it works.
+--------------------------------------+----------+--------------+
| iid | code | value |
+--------------------------------------+----------+--------------+
| aaf2de4e-2f1f-409b-87b7-b7265bec37db | ICD-9 CM | 414.01 |
+--------------------------------------+----------+--------------+
| aaf2de4e-2f1f-409b-87b7-b7265bec37db | SWICPC | 08.0.K76.CIR |
+--------------------------------------+----------+--------------+
Check it here: Rextester

Related

Select 2nd row in XML Column in database using SQL

Having trouble selecting a specific info from an XML Format in a column of a table in the database. I need to pull the Success message for ModuleID 959
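| SubmissionID | ModuleID | CreatedOn | XMLCOL          | UpdatedOn |
|--------------|----------|-----------|-----------------|-----------|
| 25           | 959      | 1-1-22    | "see XML below" | 1-1-22    |
| 26           | 339      | 2-1-22    | Null            | 2-1-22    |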
Below is the data inside the XML column within the database - what I want to achieve is to show the 2nd ResultType "success" in the query with SQL.
<ArrayOfActionResult xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ActionResult>
<ResultType>Redirected to Payment</ResultType>
<ActionName>Payment</ActionName>
<ExecutionTime></ExecutionTime>
<ConditionSet>
<Conditions />
<ExecuteCondition>Always</ExecuteCondition>
<MatchCondition>All</MatchCondition>
<ExecuteStatus>0</ExecuteStatus>
<Groups />
</ConditionSet>
<ConditionsMet>true</ConditionsMet>
<Condition />
</ActionResult>
<ActionResult>
<ResultType>Success</ResultType>
<ActionName>Payment</ActionName>
<ExecutionTime></ExecutionTime>
<ConditionSet>
<Conditions />
<ExecuteCondition>Always</ExecuteCondition>
<MatchCondition>All</MatchCondition>
<ExecuteStatus>0</ExecuteStatus>
<Groups />
</ConditionSet>
<ConditionsMet>true</ConditionsMet>
</ActionResult>
</ArrayOfActionResult>
Currently I'm trying to use the SQL below to no avail
SELECT [XMLCOL].value('/ArrayOfActionResult/ActionResult/ResultType[2]') as PaymentMessage
FROM Databasetable
where [ModuleID] = 959
Hopefully this makes sense, I found it quite difficult to explain, I am very new to SQL
Please try the following, which assumes your database is MS SQL Server.
The XQuery .value() method has two mandatory parameters: the XQuery expression and the SQL data type to return.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ModuleID INT PRIMARY KEY, XMLCOL XML);
INSERT INTO @tbl (ModuleID, XMLCOL) VALUES
(959, N'<ArrayOfActionResult xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ActionResult>
<ResultType>Redirected to Payment</ResultType>
<ActionName>Payment</ActionName>
<ExecutionTime></ExecutionTime>
<ConditionSet>
<Conditions/>
<ExecuteCondition>Always</ExecuteCondition>
<MatchCondition>All</MatchCondition>
<ExecuteStatus>0</ExecuteStatus>
<Groups/>
</ConditionSet>
<ConditionsMet>true</ConditionsMet>
<Condition/>
</ActionResult>
<ActionResult>
<ResultType>Success</ResultType>
<ActionName>Payment</ActionName>
<ExecutionTime></ExecutionTime>
<ConditionSet>
<Conditions/>
<ExecuteCondition>Always</ExecuteCondition>
<MatchCondition>All</MatchCondition>
<ExecuteStatus>0</ExecuteStatus>
<Groups/>
</ConditionSet>
<ConditionsMet>true</ConditionsMet>
</ActionResult>
</ArrayOfActionResult>');
-- DDL and sample data population, end
SELECT ModuleID
, XMLCOL.value('(/ArrayOfActionResult/ActionResult[2]/ResultType/text())[1]','VARCHAR(30)') as PaymentMessage
FROM @tbl
WHERE ModuleID = 959;
Output
+----------+----------------+
| ModuleID | PaymentMessage |
+----------+----------------+
| 959 | Success |
+----------+----------------+
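To make the two .value() parameters concrete, here is a minimal T-SQL sketch on a made-up two-item document. The variable @x, the element names Results/Result/Type, and the column aliases are assumptions for illustration, not the asker's data. It also shows nodes(), which may be handy if every ResultType is wanted rather than only the second one:

```sql
-- Hypothetical document; inline DECLARE initialisation needs SQL Server 2008 or later.
DECLARE @x XML = N'<Results>
  <Result><Type>Redirected</Type></Result>
  <Result><Type>Success</Type></Result>
</Results>';

-- Parameter 1: an XQuery expression that yields at most one node
--              (the trailing [1] guarantees a singleton).
-- Parameter 2: the SQL type to convert to, passed as a string literal.
SELECT @x.value('(/Results/Result[2]/Type/text())[1]', 'VARCHAR(30)') AS second_type;

-- nodes() shreds the document into one row per Result element,
-- and value() is then evaluated relative to each of those rows.
SELECT r.value('(Type/text())[1]', 'VARCHAR(30)') AS result_type
FROM @x.nodes('/Results/Result') AS t(r);
```

The original attempt failed for two reasons: it passed only one argument to .value(), and ResultType[2] asks for a second ResultType inside the same ActionResult, which does not exist.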

How to select values between a XML tag in SQL Query

I have a table with CLOB column storing a XML. The structure of XML is unreadable. I want to get values between few tags like <DOMAINID>; sample is shown below.
XML:
<ID>
<DOMAIN>IND<DOMAIN>
<DOMAINID>112AC<DOMAINID>
<ID>
<GROUP>
<GP>ASIA<GP>
<RSN>GOOD<RSN>
<GROUP>
I am using this:
SELECT REGEXP_REPLACE(COL,'^.*<DOMAINID>(.*)</DOMAINID>.*$','\1',1,0,'mn') col1 FROM tab;
Expected result:
112AC
Actual XML:
<?xml version="1.0" encoding="US-ASCII"?>
<GML:GMMessage
xmlns:GML="GML"
xmlns:GMLType="GML.Type"
xsi:schemaLocation="GML ../schema/gml..xsd" SchemaVersion="9.8">
<BusinessHdr>
<busHdr:BusObjectType>ABC</busHdr:BusObjectType>
<busHdr:BusObjectOwner>HDHDH</busHdr:BusObjectOwner>
<busHdr:BusObjectId>DJHDAHDAJHDA</busHdr:BusObjectId>
<busHdr:BusObjectVersion>1</busHdr:BusObjectVersion>
</BusinessHdr>
<Transaction>
<GenericEvent>NEW</GenericEvent>
<Group>
<GroupId>3424234</GroupId>
<Reason>MANUAL</Reason>
</Group>
< xsi:type="mm:MMIam">
<Id>
<Domain>ssdsgdsg</Domain>
<DomainId>123456ACC</DomainId>
<Version>1</Version>
</Id>
<Date>2021-02-01</Date>
</Transaction>
</GML:GMMessage>
Do not use regular expressions to parse XML; use a proper XML parser.
However, what you have is not properly formed XML as it is missing a root element and you are missing the / in all of the closing tags; so you first need to fix your XML and give it a root element and then you can parse it using an XML parser.
SELECT x.*
FROM table_name t
CROSS APPLY XMLTABLE(
'//root'
PASSING XMLTYPE( '<root>' || t.data || '</root>' )
COLUMNS
domain VARCHAR2(10) PATH './ID/DOMAIN',
domainid VARCHAR2(10) PATH './ID/DOMAINID',
gp VARCHAR2(50) PATH './GROUP/GP',
rsn VARCHAR2(50) PATH './GROUP/RSN'
) x
Which, for the sample data:
CREATE TABLE table_name ( data ) AS
SELECT '<ID>
<DOMAIN>IND</DOMAIN>
<DOMAINID>112AC</DOMAINID>
</ID>
<GROUP>
<GP>ASIA</GP>
<RSN>GOOD</RSN>
</GROUP>' FROM DUAL
Outputs:
DOMAIN | DOMAINID | GP | RSN
:----- | :------- | :--- | :---
IND | 112AC | ASIA | GOOD
If you just want a single value then you can use XMLQUERY:
SELECT XMLQUERY(
'/root/ID/DOMAINID/text()'
PASSING XMLTYPE( '<root>'||data||'</root>' )
RETURNING CONTENT
) AS domainid
FROM table_name
Which outputs:
| DOMAINID |
| :------- |
| 112AC |
db<>fiddle here
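XMLQUERY returns an XMLType value. If a plain string is needed instead, one option is to wrap the call in XMLCAST. A minimal sketch with an inline document follows; the literal XML and the VARCHAR2(10) length are assumptions chosen for illustration:

```sql
-- Hypothetical inline document; in practice the PASSING clause would
-- take XMLTYPE('<root>' || data || '</root>') from the table, as above.
SELECT XMLCAST(
         XMLQUERY(
           '/root/ID/DOMAINID/text()'
           PASSING XMLTYPE('<root><ID><DOMAINID>112AC</DOMAINID></ID></root>')
           RETURNING CONTENT
         )
         AS VARCHAR2(10)
       ) AS domainid
FROM DUAL;
```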
Update
I am going to assume that your XML also defines the xsi and busHdr namespaces (if it doesn't then Oracle will fail to parse the XML as it does not know what those namespaces are); that would give you this sample data:
CREATE TABLE table_name ( data ) AS
SELECT '<?xml version="1.0" encoding="US-ASCII"?>
<GML:GMMessage
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:GML="GML"
xmlns:busHdr="busHdr"
xmlns:GMLType="GML.Type"
xsi:schemaLocation="GML ../schema/gml.xsd busHdr ../schema/bushdr.xsd"
SchemaVersion="9.8">
<BusinessHdr>
<busHdr:BusObjectType>ABC</busHdr:BusObjectType>
<busHdr:BusObjectOwner>HDHDH</busHdr:BusObjectOwner>
<busHdr:BusObjectId>DJHDAHDAJHDA</busHdr:BusObjectId>
<busHdr:BusObjectVersion>1</busHdr:BusObjectVersion>
</BusinessHdr>
<Transaction>
<GenericEvent>NEW</GenericEvent>
<Group>
<GroupId>3424234</GroupId>
<Reason>MANUAL</Reason>
</Group>
<Id>
<Domain>ssdsgdsg</Domain>
<DomainId>123456ACC</DomainId>
<Version>1</Version>
</Id>
<Date>2021-02-01</Date>
</Transaction>
</GML:GMMessage>' FROM DUAL
Then, you just need to add the namespace that you are using and update the paths to the new (case-sensitive) locations:
SELECT x.*
FROM table_name t
CROSS APPLY XMLTABLE(
XMLNAMESPACES( 'GML' AS "GML" ),
'//GML:GMMessage/Transaction'
PASSING XMLTYPE( t.data )
COLUMNS
domain VARCHAR2(10) PATH './Id/Domain',
domainid VARCHAR2(10) PATH './Id/DomainId',
version NUMBER(3,0) PATH './Id/Version',
groupid VARCHAR2(50) PATH './Group/GroupId',
reason VARCHAR2(50) PATH './Group/Reason',
dt DATE PATH './Date'
) x
Outputs:
DOMAIN | DOMAINID | VERSION | GROUPID | REASON | DT
:------- | :-------- | ------: | :------ | :----- | :--------
ssdsgdsg | 123456ACC | 1 | 3424234 | MANUAL | 01-FEB-21
db<>fiddle here
Good to see your thinking with your approach...
I would suggest checking out this tool (if you don't already have a similar one) to help with regular expressions: https://regexr.com/. It helps me a lot.
Your SQL looks right (using the "m" and "n" flags for multiline), but I'm not sure whether your XML was typed in wrong, since your regex doesn't work on the XML you pasted. I did get it to work when the XML has proper closing tags.
What is your current output from your SQL? You might need to use $1 in place of your \1.
I would also suggest:
perhaps also escaping your forward slash, as that might be your culprit.
adding more specificity to your capture to stop your search from being greedy.
SELECT REGEXP_REPLACE(COL,'^.*<DOMAINID>([0-9A-z]+)<\/DOMAINID>.*$','$1',1,0,'mn') col1 FROM tab;

Extracting multiple values from BLOB as XML

I have an XML like this in a BLOB column:
<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="urn:xyzns">
<history>
<Event>
<year>1983</year>
<Country>
<Location>Lisbon</Location>
<Type>Political</Type>
</Country>
</Event>
<Event>
<Country>
<Location>USA</Location>
<Type>Entertainment</Type>
<year>2016</year>
</Country>
</Event>
</history>
</document>
As you can see the year can be either in the event level or at country level. There can be multiple events and multiple countries per event. This whole XML is stored in a BLOB column in Oracle. I need to extract the value of the year or better check if the year is 2000 and if so return the primary key of the row.
I used EXISTSNODE to check if the year tag is present.
select pk from table where XMLType(blobdata, nls_charset_id('UTF8')).EXISTSNODE('/Document/history/Event[*]/year',
'xmlns="urn:xyzns"') = 1 and EXTRACTVALUE(XMLTYPE(blobdata, nls_charset_id('UTF8')), '/Document/history/Event[*]/year/text()',
'xmlns="urn:xyzns"') = '2000';
However, this fails because the extractvalue query returns multiple nodes, so I changed the parameter to '/Document/history/Event[1]/year/text()' to check, and that works. However, this wouldn't be enough, as it only checks the first Event tag.
I looked at other questions here and one of the options was to use XMLTABLE since extractvalue is deprecated. I am having trouble understanding the parameters given inside the XMLTABLE. Could someone explain how to use XMLTABLE in this scenario? I should point out that the original datatype is BLOB and not CLOB. Thank you.
Use XMLTABLE to get values for both locations and then use COALESCE to show whichever is not NULL:
SELECT COALESCE( year, country_year ) AS year,
location,
type
FROM table_name t
CROSS APPLY XMLTABLE(
XMLNAMESPACES( DEFAULT 'urn:xyzns' ),
'/document/history/Event'
PASSING XMLTYPE(t.blobdata, nls_charset_id('UTF8'))
COLUMNS
year NUMBER(4,0) PATH './year',
country_year NUMBER(4,0) PATH './Country/year',
location VARCHAR2(200) PATH './Country/Location',
type VARCHAR2(200) PATH './Country/Type'
) x
Which, for the sample data:
CREATE TABLE table_name ( blobdata BLOB );
INSERT INTO table_name
VALUES (
UTL_RAW.CAST_TO_RAW(
'<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="urn:xyzns">
<history>
<Event>
<year>1983</year>
<Country>
<Location>Lisbon</Location>
<Type>Political</Type>
</Country>
</Event>
<Event>
<Country>
<Location>USA</Location>
<Type>Entertainment</Type>
<year>2016</year>
</Country>
</Event>
</history>
</document>'
)
);
Outputs:
YEAR | LOCATION | TYPE
---: | :------- | :------------
1983 | Lisbon | Political
2016 | USA | Entertainment
db<>fiddle here
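The question also asks for the primary key of rows that contain a given year. One way to sketch that is with XMLEXISTS, which only tests whether a matching node exists and so is not troubled by several Country elements per Event. This assumes the table has a primary key column named pk, as in the question; the sample table created above does not define one:

```sql
-- Assumption: table_name has a primary key column pk (taken from the question).
SELECT t.pk
FROM   table_name t
WHERE  XMLEXISTS(
         'declare default element namespace "urn:xyzns";
          /document/history/Event[year = 2000 or Country/year = 2000]'
         PASSING XMLTYPE(t.blobdata, NLS_CHARSET_ID('UTF8'))
       );
```

The namespace is declared in the XQuery prolog here because XMLEXISTS takes a single XQuery string rather than an XMLNAMESPACES clause. Note also that element names are case-sensitive, so the path starts at /document, not /Document.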

SQL Server Where Clause Path for XML Value

The main problem is how to use a WHERE clause against XML in SQL.
I need to return the rows whose XML contains "team".
You can view my example fiddle here: http://sqlfiddle.com/#!18/de221e/2
Table:
CREATE TABLE test
(
id int,
value xml
)
INSERT INTO test (id, value)
VALUES
('1', N'<?xml version="1.0" encoding="utf-16" standalone="yes"?> <Atts>
<Att>
<Name>test</Name>
</Att>
<Att>
<Name>team</Name>
</Att>
<Att>
<Name>test</Name>
</Att>
</Atts>'),
('2', N'<?xml version="1.0" encoding="utf-16" standalone="yes"?> <Atts>
<Att>
<Name>test</Name>
</Att>
<Att>
<Name>test</Name>
</Att>
<Att>
<Name>test</Name>
</Att>
</Atts>');
query:
select * from test
where value.value('(/Atts/Att/Name)[1]','varchar(max)') = 'team'
This doesn't return anything.
However if you do where clause on first name that appears in the XML it works e.g.
select * from test
where value.value('(/Atts/Att/Name)[1]','varchar(max)') = 'test'
returns:
| id | value |
|----|---------------------------------------------------------------------------------------------------|
| 1 | <Atts><Att><Name>test</Name></Att><Att><Name>team</Name></Att><Att><Name>test</Name></Att></Atts> |
| 2 | <Atts><Att><Name>test</Name></Att><Att><Name>test</Name></Att><Att><Name>test</Name></Att></Atts> |
The expected result is that this query should return:
select * from test
where value.value('(/Atts/Att/Name)[1]','varchar(max)') = 'team'
| id | value |
|----|---------------------------------------------------------------------------------------------------|
| 1 | <Atts><Att><Name>test</Name></Att><Att><Name>team</Name></Att><Att><Name>test</Name></Att></Atts> |
Any ideas how I can return "team" if it appears in the XML but isn't in the first position?
It is better to use the exist() method, which checks for the 'team' value regardless of its position. See: exist() Method (xml Data Type).
SQL
select * from test
where value.exist('/Atts/Att/Name[./text()="team"]') = 1;
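If the search value comes from a variable rather than a literal, the exist() predicate can read it through sql:variable(). A minimal sketch follows, using a hypothetical table variable @t that mirrors the question's test table and a made-up variable @name:

```sql
-- Hypothetical table variable standing in for the question's test table.
DECLARE @t TABLE (id INT, value XML);
INSERT INTO @t (id, value) VALUES
(1, N'<Atts><Att><Name>test</Name></Att><Att><Name>team</Name></Att></Atts>'),
(2, N'<Atts><Att><Name>test</Name></Att><Att><Name>test</Name></Att></Atts>');

DECLARE @name VARCHAR(50) = 'team';

-- sql:variable() exposes the T-SQL variable inside the XQuery predicate,
-- so the XQuery string itself stays a constant literal.
SELECT id
FROM @t
WHERE value.exist('/Atts/Att/Name[text() = sql:variable("@name")]') = 1;
```

With this sample data only the row with id 1 should match, wherever "team" sits among the Att elements.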
It is the second one. You are looking at the first one.
This gives you the results you want
select * from test
where value.value('(/Atts/Att/Name)[2]','varchar(max)') = 'team'

How to pull XML key "value" from SQL CLOB

I am attempting to extract information from XML stored in a CLOB column. I've searched the forums and thus far have been unable to get the data to pull as needed. I have a basic understanding of SQL but this is beyond me.
The XML is similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Header>
<OrderNum value="12354321"/>
<ExtractDate value="11-30-2012"/>
<RType value="Status"/>
<Company value="Company"/>
</Header>
<Body>
<Status>
<Order>
<ActivityType value="ValidateRequest"/>
<EndUser>
<Name value="Schmo, Joe"/>
<Address>
<SANO value="12345"/>
<SASN value="Mickey Mouse"/>
<SATH value="Lane"/>
<SASS value="N"/>
<City value="Orlando"/>
<State value="FL"/>
<Zip value="34786"/>
<Number value="5550000"/>
</Address>
</EndUser>
<COS value="1"/>
<TOS value="3"/>
<MainNumber value="5550000"/>
</Order>
<ErrorCode value="400"/>
<ErrorMessage value="RECEIVED"/>
</Status>
</Body>
</Response>
I want to get the values under "Address".
I've tried the following but it returns "NULL".
SELECT EXTRACTVALUE(XMLTYPE(RESPONSE_CLOB),'/Response/Body/Status/Order/EndUser/Address/SANO') AS SANO
FROM RESPONSE_TABLE
WHERE ROWNUM < 2
I am trying to get it so I can pull the "12345" assigned as "value" in "SANO" (ultimately getting the value for other fields, but want to at least get the one working first).
You're currently retrieving the text value of the node, but 12345 is the value attribute of the element rather than its text content. So you would need to use the @attribute syntax, i.e.:
SELECT EXTRACTVALUE(XMLTYPE(RESPONSE_CLOB),'/Response/Body/Status/Order/EndUser/Address/SANO/@value') AS SANO
FROM RESPONSE_TABLE
WHERE ROWNUM < 2;
SANO
--------------------
12345
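To see the difference between an element's text content and one of its attributes side by side, here is a small self-contained sketch. The inline document and the column names elem_text and attr_value are made up for illustration:

```sql
-- Hypothetical one-element document: SANO has no text content,
-- only a value attribute.
SELECT x.elem_text, x.attr_value
FROM   XMLTABLE(
         '/Address/SANO'
         PASSING XMLTYPE('<Address><SANO value="12345"/></Address>')
         COLUMNS
           elem_text  VARCHAR2(10) PATH '.',       -- text content of the element
           attr_value VARCHAR2(10) PATH '@value'   -- the value attribute
       ) x;
```

Here elem_text should come back NULL and attr_value should be 12345, which is why the original query returned NULL.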
But extractvalue is deprecated; assuming you're on a recent version of Oracle it would be better to use an XMLQuery:
SELECT XMLQUERY(
'/Response/Body/Status/Order/EndUser/Address/SANO/@value'
PASSING XMLTYPE(RESPONSE_CLOB)
RETURNING CONTENT
) AS SANO
FROM RESPONSE_TABLE
WHERE ROWNUM < 2;
You may find it even easier to use an XMLTable - necessary if an XML document has multiple Address nodes, but even with just one pulling the values out as columns is less repetitive, and it makes it easier to retrieve suitable data types:
select x.*
from response_table rt
cross join xmltable(
'/Response/Body/Status/Order/EndUser/Address'
passing xmltype(rt.response_clob)
columns sano number path 'SANO/@value',
sasn varchar2(30) path 'SASN/@value',
sath varchar2(10) path 'SATH/@value'
-- etc.
) x
where rownum < 2;
SANO SASN SATH
-------------------- ------------------------------ ----------
12345 Mickey Mouse Lane
Read more about using these functions to query XML data.