XQuery and Node Ids - sql

I have this variable:
declare @xmlDoc XML
It has the following XML stored in it:
<?xml version="1.0" encoding="utf-8"?>
<NewDataSet>
<Table1>
<Sharedparam>shared</Sharedparam>
<Antoher>sahre</Antoher>
<RandomParam2>Good stuff</RandomParam2>
<MoreParam>and more</MoreParam>
<ResultsParam>2</ResultsParam>
</Table1>
<Table1>
<RandomParam2>do you</RandomParam2>
<MoreParam>think</MoreParam>
<ResultsParam>2</ResultsParam>
</Table1>
<Table1>
<Sharedparam>Last</Sharedparam>
<Antoher> Set </Antoher>
<RandomParam2> of </RandomParam2>
<MoreParam>values</MoreParam>
<ResultsParam>are here</ResultsParam>
</Table1>
<Table1 />
</NewDataSet>
I have this query that I am using to get the data:
declare @xmlDoc XML
set @xmlDoc = '' -- Stack Overflow could not handle the xml all on one line.
SELECT -- Param 1
TBL.SParam.value('local-name((*)[1])', 'varchar(50)') as Param1Name,
TBL.SParam.value('(*)[1]', 'varchar(100)') as Param1Value,
-- Param2
TBL.SParam.value('local-name((*)[2])', 'varchar(50)') as Param2Name,
TBL.SParam.value('(*)[2]', 'varchar(100)') as Param2Value,
-- Param3
TBL.SParam.value('local-name((*)[3])', 'varchar(50)') as Param3Name,
TBL.SParam.value('(*)[3]', 'varchar(100)') as Param3Value,
-- Param 4
TBL.SParam.value('local-name((*)[4])', 'varchar(50)') as Param4Name,
TBL.SParam.value('(*)[4]', 'varchar(100)') as Param4Value,
-- Param 5
TBL.SParam.value('local-name((*)[5])', 'varchar(50)') as Param5Name,
TBL.SParam.value('(*)[5]', 'varchar(100)') as Param5Value
FROM @xmlDoc.nodes('/NewDataSet/Table1') AS TBL(SParam)
I need a way to add to my results the order in which the rows appeared in the XML (which was the first instance of Table1, which the second, and so on).
For reasons of SQL table variable limitations, I can't use an identity column to keep this straight. (For other reasons, I don't want to use a temporary table.)
I am hoping that there is a cool SQL XML function that will return some kind of internally assigned Node ID. (Or some other similar manner of ordering.)
Note, I do not control this XML Structure (I am a reader only) so I cannot make changes to add in an ID attribute.
Any advice would be great!
EDIT/Update:
I would really like to have this data like this:
1 | Sharedparam | shared
1 | Antoher | sahre
1 | RandomParam2 | Good stuff
1 | MoreParam | and more
1 | ResultsParam | 2
2 | RandomParam2 | do you
2 | MoreParam | think
2 | ResultsParam | 2
3 | Sharedparam | Last
3 | Antoher | Set
.
.
.
But I am coming up short. I can get it into columns (more or less), but I don't know how to do the numbering. If you have any ideas I would love to hear them.
EDIT:
I figured out the query to do this (with some help from the internet). It looks like this:
SELECT TBL.SParam.value('local-name(.)[1]', 'varchar(50)') as ParamName,
TBL.SParam.value('(.)[1]', 'varchar(50)') ParamValue,
TBL.SParam.value('for $s in . return count(../*[. << $s]) + 1', 'int') ParamPosition,
TBL.SParam.value('for $s in . return count(../../*[. << $s]) - 1', 'int') ParamIteration
FROM @xmlDoc.nodes('/NewDataSet/Table1/*') AS TBL(SParam)
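A minimal, self-contained sketch of the node-order comparison used above, run against a cut-down document. The element names A, B and C and the aliases p, n and TableNumber are made up for illustration. It assumes that a parent element counts as preceding its own children in document order, so counting the Table1 elements that precede a node, without subtracting one, should give a 1-based Table1 number:

```sql
-- Hypothetical cut-down document: two Table1 elements with made-up children.
DECLARE @doc XML = N'<NewDataSet>
  <Table1><A>1</A><B>2</B></Table1>
  <Table1><C>3</C></Table1>
</NewDataSet>';

SELECT p.n.value('local-name(.)', 'varchar(50)') AS ParamName,
       p.n.value('.', 'varchar(50)') AS ParamValue,
       -- siblings that come before this node, plus one: position inside its Table1
       p.n.value('for $s in . return count(../*[. << $s]) + 1', 'int') AS ParamPosition,
       -- Table1 elements that start before this node (its own parent included)
       p.n.value('for $s in . return count(../../*[. << $s])', 'int') AS TableNumber
FROM @doc.nodes('/NewDataSet/Table1/*') AS p(n);
```

With this sample, A and B should come back with TableNumber 1 and C with TableNumber 2. Subtracting 1, as in the query above, would make the numbering start at 0 instead.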

You can use a number table and position()
SELECT N.Number as ID,
-- Param 1
TBL.SParam.value('local-name((*)[1])', 'varchar(50)') as Param1Name,
TBL.SParam.value('(*)[1]', 'varchar(100)') as Param1Value,
-- Param2
TBL.SParam.value('local-name((*)[2])', 'varchar(50)') as Param2Name,
TBL.SParam.value('(*)[2]', 'varchar(100)') as Param2Value,
-- Param3
TBL.SParam.value('local-name((*)[3])', 'varchar(50)') as Param3Name,
TBL.SParam.value('(*)[3]', 'varchar(100)') as Param3Value,
-- Param 4
TBL.SParam.value('local-name((*)[4])', 'varchar(50)') as Param4Name,
TBL.SParam.value('(*)[4]', 'varchar(100)') as Param4Value,
-- Param 5
TBL.SParam.value('local-name((*)[5])', 'varchar(50)') as Param5Name,
TBL.SParam.value('(*)[5]', 'varchar(100)') as Param5Value
FROM master..spt_values as N
cross apply @xmlDoc.nodes('/NewDataSet/Table1[position()=sql:column("N.Number")]') AS TBL(SParam)
where N.type = 'P' and
N.number between 1 and @xmlDoc.value('count(/NewDataSet/Table1)', 'int')

Take a look at this Connect request, "Fully support position() in xquery". The requestor offers a couple of workarounds that might be useful to you.
Based on the second workaround, I wrote the following:
SELECT
p.number,
TBL.SParam.value('local-name((*)[1])', 'varchar(50)') as Param1Name,
TBL.SParam.value('(*)[1]', 'varchar(100)') as Param1Value,
-- Param2
TBL.SParam.value('local-name((*)[2])', 'varchar(50)') as Param2Name,
TBL.SParam.value('(*)[2]', 'varchar(100)') as Param2Value,
-- Param3
TBL.SParam.value('local-name((*)[3])', 'varchar(50)') as Param3Name,
TBL.SParam.value('(*)[3]', 'varchar(100)') as Param3Value,
-- Param 4
TBL.SParam.value('local-name((*)[4])', 'varchar(50)') as Param4Name,
TBL.SParam.value('(*)[4]', 'varchar(100)') as Param4Value,
-- Param 5
TBL.SParam.value('local-name((*)[5])', 'varchar(50)') as Param5Name,
TBL.SParam.value('(*)[5]', 'varchar(100)') as Param5Value
FROM
master..spt_values p CROSS APPLY
@xmlDoc.nodes('NewDataSet/Table1[position()=sql:column("p.number")]') AS TBL(SParam)
WHERE P.type = 'P'
Which returns the following
number | Param1Name   | Param1Value | Param2Name | Param2Value | Param3Name   | Param3Value | Param4Name | Param4Value | Param5Name   | Param5Value
-------+--------------+-------------+------------+-------------+--------------+-------------+------------+-------------+--------------+------------
1      | Sharedparam  | shared      | Antoher    | sahre       | RandomParam2 | Good stuff  | MoreParam  | and more    | ResultsParam | 2
2      | RandomParam2 | do you      | MoreParam  | think       | ResultsParam | 2           |            | NULL        |              | NULL
3      | Sharedparam  | Last        | Antoher    | Set         | RandomParam2 | of          | MoreParam  | values      | ResultsParam | are here
4      |              | NULL        |            | NULL        |              | NULL        |            | NULL        |              | NULL
(4 row(s) affected)

Related

Parse XML file in SQL with duplicated Tags

I'm trying to flatten my XML file however there are complications when there are duplicate tags within a tag. I'm not sure how to move back up the tag chain. Examples below.
DECLARE @xml AS XML, @hDoc AS INT
SET @xml = '<message>
<Body>
<SfkpgAcctAndHldgs>
<SfkpgAcct>UHS5465</SfkpgAcct>
<AcctSubLvl>
<Dsclsr>
<SfkpgAcct>450812361</SfkpgAcct>
<AcctHldr>
<LglPrsn>
<NmAndAdr>
<Nm>Foo</Nm>
<Adr>
<AdrLine>1 CORPORATE WAY, New York 15951</AdrLine>
<Ctry>US</Ctry>
</Adr>
</NmAndAdr>
<LEI>5493231HPIWHJS5QL7N39</LEI>
<Ownrsh>OWNR</Ownrsh>
</LglPrsn>
</AcctHldr>
<ShrhldgBal>
<ShrhldgTp>BENE</ShrhldgTp>
<Unit>1000</Unit>
</ShrhldgBal>
</Dsclsr>
<Dsclsr>
<SfkpgAcct>450812362</SfkpgAcct>
<AcctHldr>
<LglPrsn>
<NmAndAdr>
<Nm>Bar</Nm>
<Adr>
<AdrLine>2 CORPORATE WAY</AdrLine>
<AdrLine>New York</AdrLine>
<AdrLine>15951</AdrLine>
<Ctry>US</Ctry>
</Adr>
</NmAndAdr>
<LEI>5493231HP2342345QL7N39</LEI>
<Ownrsh>OWNR</Ownrsh>
</LglPrsn>
</AcctHldr>
<ShrhldgBal>
<ShrhldgTp>BENE</ShrhldgTp>
<Unit>300</Unit>
</ShrhldgBal>
</Dsclsr>
</AcctSubLvl>
</SfkpgAcctAndHldgs>
</Body>
</message>'
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml
This query takes each SfkpgAcctAndHldgs and brings in the first Dsclsr. There are two, but only the first is returned.
SELECT *
FROM OPENXML(@hDoc,'message/Body/SfkpgAcctAndHldgs') WITH (
SfkpgAcct NVARCHAR(MAX) 'SfkpgAcct'
, Dsclsr_SfkpgAcct NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/SfkpgAcct'
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Nm NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/AcctHldr/LglPrsn/NmAndAdr/Nm'
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine'
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_Ctry NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/AcctHldr/LglPrsn/NmAndAdr/Adr/Ctry'
, Dsclsr_AcctHldr_LglPrsn_LEI NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/AcctHldr/LglPrsn/LEI'
, Dsclsr_AcctHldr_LglPrsn_Ownrsh NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/AcctHldr/LglPrsn/Ownrsh'
, Dsclsr_ShrhldgBal_ShrhldgTp NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/ShrhldgBal/ShrhldgTp'
, Dsclsr_ShrhldgBal_Unit NVARCHAR(MAX) 'AcctSubLvl/Dsclsr/ShrhldgBal/Unit'
)
--SfkpgAcct AcctSubLvl_Dsclsr_SfkpgAcct Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Nm Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_Ctry Dsclsr_AcctHldr_LglPrsn_LEI Dsclsr_AcctHldr_LglPrsn_Ownrsh Dsclsr_AcctHldr_ShrhldgBal_ShrhldgTp Dsclsr_AcctHldr_ShrhldgBal_Unit
--UHS5465 450812361 Foo 1 CORPORATE WAY, New York 15951 US 5493231HPIWHJS5QL7N39 OWNR BENE 1000
This query looks at the Dsclsr level and so brings back both rows. However, I can't go back up the chain to get the first SfkpgAcct value, which is a unique value I could join the tables on.
SELECT *
FROM OPENXML(@hDoc,'message/Body/SfkpgAcctAndHldgs/AcctSubLvl/Dsclsr') WITH (
SfkpgAcct NVARCHAR(MAX) 'SfkpgAcct'
, AcctHldr_LglPrsn_NmAndAdr_Nm NVARCHAR(MAX) 'AcctHldr/LglPrsn/NmAndAdr/Nm'
, AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine NVARCHAR(MAX) 'AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine'
, AcctHldr_LglPrsn_NmAndAdr_Adr_Ctry NVARCHAR(MAX) 'AcctHldr/LglPrsn/NmAndAdr/Adr/Ctry'
, AcctHldr_LglPrsn_LEI NVARCHAR(MAX) 'AcctHldr/LglPrsn/LEI'
, AcctHldr_LglPrsn_Ownrsh NVARCHAR(MAX) 'AcctHldr/LglPrsn/Ownrsh'
, ShrhldgBal_ShrhldgTp NVARCHAR(MAX) 'ShrhldgBal/ShrhldgTp'
, ShrhldgBal_Unit NVARCHAR(MAX) 'ShrhldgBal/Unit'
)
--SfkpgAcct AcctHldr_LglPrsn_NmAndAdr_Nm AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine AcctHldr_LglPrsn_NmAndAdr_Adr_Ctry AcctHldr_LglPrsn_LEI AcctHldr_LglPrsn_Ownrsh ShrhldgBal_ShrhldgTp ShrhldgBal_Unit
--450812361 Foo 1 CORPORATE WAY, New York 15951 US 5493231HPIWHJS5QL7N39 OWNR BENE 1000
--450812362 Bar 2 CORPORATE WAY US 5493231HP2342345QL7N39 OWNR BENE 300
Also, there are occasionally multiple AdrLine fields. As in the example above, I do not know how to go back up the chain to reference any parent values that I could then join on.
SELECT *
FROM OPENXML(@hDoc,'message/Body/SfkpgAcctAndHldgs/AcctSubLvl/Dsclsr/AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine') WITH (
AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine NVARCHAR(MAX) '.'
)
--AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine
--1 CORPORATE WAY, New York 15951
--2 CORPORATE WAY
--New York
--15951
EXEC sp_xml_removedocument @hDoc
Ideally, I want all values in one flat result. I can use a view to join the individual tables together, provided I can attach an existing ID or generate a new one.
--SfkpgAcct DsclsrSfkpgAcct Nm AdrLine1 AdrLine2 AdrLine3 Ctry LEI Ownrsh ShrhldgTp Unit
--UHS5465 450812361 Foo 1 CORPORATE WAY, New York 15951 US 5493231HPIWHJS5QL7N39 OWNR BENE 1000
--UHS5465 450812362 Bar 2 CORPORATE WAY New York 15951 US 5493231HP2342345QL7N39 OWNR BENE 300
Microsoft's proprietary OPENXML() and its companions sp_xml_preparedocument and sp_xml_removedocument are kept only for backward compatibility with the obsolete SQL Server 2000. They remain useful in just a few fringe cases.
From SQL Server 2005 onwards, it is strongly recommended to rewrite your SQL to use XQuery instead.
SQL
DECLARE @xml XML =
N'<Message>
<Body>
<SfkpgAcctAndHldgs>
<SfkpgAcct>UHS5465</SfkpgAcct>
<AcctSubLvl>
<Dsclsr>
<SfkpgAcct>450812361</SfkpgAcct>
<AcctHldr>
<LglPrsn>
<NmAndAdr>
<Nm>Foo</Nm>
<Adr>
<AdrLine>1 CORPORATE WAY, New York 15951</AdrLine>
<Ctry>US</Ctry>
</Adr>
</NmAndAdr>
<LEI>5493231HPIWHJS5QL7N39</LEI>
<Ownrsh>OWNR</Ownrsh>
</LglPrsn>
</AcctHldr>
<ShrhldgBal>
<ShrhldgTp>BENE</ShrhldgTp>
<Unit>1000</Unit>
</ShrhldgBal>
</Dsclsr>
<Dsclsr>
<SfkpgAcct>450812362</SfkpgAcct>
<AcctHldr>
<LglPrsn>
<NmAndAdr>
<Nm>Bar</Nm>
<Adr>
<AdrLine>2 CORPORATE WAY</AdrLine>
<AdrLine>New York</AdrLine>
<AdrLine>15951</AdrLine>
<Ctry>US</Ctry>
</Adr>
</NmAndAdr>
<LEI>5493231HP2342345QL7N39</LEI>
<Ownrsh>OWNR</Ownrsh>
</LglPrsn>
</AcctHldr>
<ShrhldgBal>
<ShrhldgTp>BENE</ShrhldgTp>
<Unit>300</Unit>
</ShrhldgBal>
</Dsclsr>
</AcctSubLvl>
</SfkpgAcctAndHldgs>
</Body>
</Message>';
SELECT c.value('(SfkpgAcct/text())[1]', 'VARCHAR(20)') AS SfkpgAcct
, x.value('(SfkpgAcct/text())[1]', 'VARCHAR(20)') AS DsclsrSfkpgAcct
, x.value('(AcctHldr/LglPrsn/NmAndAdr/Nm/text())[1]', 'VARCHAR(20)') AS Nm
, x.value('(AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine[1]/text())[1]', 'VARCHAR(50)') AS AdrLine1
, x.value('(AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine[2]/text())[1]', 'VARCHAR(50)') AS AdrLine2
, x.value('(AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine[3]/text())[1]', 'VARCHAR(50)') AS AdrLine3
, x.value('(AcctHldr/LglPrsn/NmAndAdr/Adr/Ctry/text())[1]', 'VARCHAR(50)') AS Ctry
, x.value('(AcctHldr/LglPrsn/LEI/text())[1]', 'VARCHAR(50)') AS LEI
, x.value('(AcctHldr/LglPrsn/Ownrsh/text())[1]', 'VARCHAR(50)') AS Ownrsh
, x.value('(ShrhldgBal/ShrhldgTp/text())[1]', 'VARCHAR(50)') AS ShrhldgTp
, x.value('(ShrhldgBal/Unit/text())[1]', 'VARCHAR(50)') AS Unit
FROM @xml.nodes('/Message/Body/SfkpgAcctAndHldgs') AS t1(c)
CROSS APPLY t1.c.nodes('AcctSubLvl/Dsclsr') AS t2(x);
Output
+-----------+-----------------+-----+---------------------------------+----------+----------+------+------------------------+--------+-----------+------+
| SfkpgAcct | DsclsrSfkpgAcct | Nm | AdrLine1 | AdrLine2 | AdrLine3 | Ctry | LEI | Ownrsh | ShrhldgTp | Unit |
+-----------+-----------------+-----+---------------------------------+----------+----------+------+------------------------+--------+-----------+------+
| UHS5465 | 450812361 | Foo | 1 CORPORATE WAY, New York 15951 | NULL | NULL | US | 5493231HPIWHJS5QL7N39 | OWNR | BENE | 1000 |
| UHS5465 | 450812362 | Bar | 2 CORPORATE WAY | New York | 15951 | US | 5493231HP2342345QL7N39 | OWNR | BENE | 300 |
+-----------+-----------------+-----+---------------------------------+----------+----------+------+------------------------+--------+-----------+------+
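If one row per AdrLine is wanted instead of fixed AdrLine1 to AdrLine3 columns, the same pattern can probably be taken one level deeper. This is only a sketch against a trimmed-down copy of the document; the aliases t3 and a are made up. Each .nodes() call is applied to the row from the previous level, so the parent values stay available on every row:

```sql
-- Trimmed-down sample: one Dsclsr with two AdrLine elements.
DECLARE @xml XML = N'<Message><Body><SfkpgAcctAndHldgs>
  <SfkpgAcct>UHS5465</SfkpgAcct>
  <AcctSubLvl>
    <Dsclsr>
      <SfkpgAcct>450812362</SfkpgAcct>
      <AcctHldr><LglPrsn><NmAndAdr><Adr>
        <AdrLine>2 CORPORATE WAY</AdrLine>
        <AdrLine>New York</AdrLine>
      </Adr></NmAndAdr></LglPrsn></AcctHldr>
    </Dsclsr>
  </AcctSubLvl>
</SfkpgAcctAndHldgs></Body></Message>';

SELECT c.value('(SfkpgAcct/text())[1]', 'VARCHAR(20)') AS SfkpgAcct        -- top-level account
     , x.value('(SfkpgAcct/text())[1]', 'VARCHAR(20)') AS DsclsrSfkpgAcct  -- account of this Dsclsr
     , a.value('(text())[1]', 'VARCHAR(50)') AS AdrLine                    -- one address line per row
FROM @xml.nodes('/Message/Body/SfkpgAcctAndHldgs') AS t1(c)
CROSS APPLY t1.c.nodes('AcctSubLvl/Dsclsr') AS t2(x)
CROSS APPLY t2.x.nodes('AcctHldr/LglPrsn/NmAndAdr/Adr/AdrLine') AS t3(a);
```

For this sample it should return two rows, both carrying UHS5465 and 450812362. CROSS APPLY drops any Dsclsr that has no AdrLine at all; OUTER APPLY would keep it with a NULL AdrLine.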
Dealing with XML in MS SQL server is a pain in the ... IMHO. Anyway:
SELECT SfkpgAcct,
Dsclsr_SfkpgAcct,
Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Nm,
Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine1,
Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine2,
Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine3,
Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_Ctry,
Dsclsr_AcctHldr_LglPrsn_LEI,
Dsclsr_AcctHldr_LglPrsn_Ownrsh,
Dsclsr_ShrhldgBal_ShrhldgTp,
Dsclsr_ShrhldgBal_Unit
FROM
OPENXML(@hDoc, 'message/Body/SfkpgAcctAndHldgs')
WITH
(
SfkpgAcct NVARCHAR(MAX) 'SfkpgAcct',
AcctSubLvl XML 'AcctSubLvl'
) t1
CROSS APPLY
(
SELECT item.row.value('SfkpgAcct[1]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/NmAndAdr[1]/Nm[1]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/NmAndAdr[1]/Adr[1]/AdrLine[1]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/NmAndAdr[1]/Adr[1]/AdrLine[2]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/NmAndAdr[1]/Adr[1]/AdrLine[3]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/NmAndAdr[1]/Adr[1]/Ctry[1]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/LEI[1]', 'nvarchar(max)'),
item.row.value('AcctHldr[1]/LglPrsn[1]/Ownrsh[1]', 'nvarchar(max)'),
item.row.value('ShrhldgBal[1]/ShrhldgTp[1]', 'nvarchar(max)'),
item.row.value('ShrhldgBal[1]/Unit[1]', 'nvarchar(max)')
FROM t1.AcctSubLvl.nodes('AcctSubLvl/Dsclsr') item(row)
) AS nodes(Dsclsr_SfkpgAcct
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Nm
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine1
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine2
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_AdrLine3
, Dsclsr_AcctHldr_LglPrsn_NmAndAdr_Adr_Ctry
, Dsclsr_AcctHldr_LglPrsn_LEI
, Dsclsr_AcctHldr_LglPrsn_Ownrsh
, Dsclsr_ShrhldgBal_ShrhldgTp
, Dsclsr_ShrhldgBal_Unit);
DBFiddle demo

attempting a replace function on sql but with wildcards

This is my first time posting, so hopefully I get all the needed information across correctly.
I'm attempting to do an update/replace statement on a row that contains a set of characters, but a wildcard pattern won't work inside REPLACE. Below is a SELECT statement example:
SELECT replace(column, '[aA][bB][cC]-', '')
FROM table
where column like '%[aA][bB][cC]-%'
An example row would be "abc-123 aBc-123 abC-abc Def-123", but I need it to return "123 123 abc".
This is a basic example, but I'm trying to get rid of a set of 3 characters and a "-" character anywhere in a string. The abc could change to def, but the "-" character will always come after.
I've done some googling and can't find an appropriate solution, as most examples only remove one instance of abc-, whereas I need to get rid of all of them. I'm running version 12.0 of SQL Server (SQL Server 2014), so I think there are some functions I won't be able to use.
I think the closest example I could find was Using Wildcard For Range of Characters In T-SQL but I can't use the translate function.
Edit: below is an example of a created table
CREATE TABLE String_Removal_Example(
someValue VARCHAR(100)
)
INSERT INTO String_Removal_Example (someValue) VALUES ('abc-123')
INSERT INTO String_Removal_Example (someValue) VALUES ('abc-123 ABC-123')
INSERT INTO String_Removal_Example (someValue) VALUES ('abc-123 ABc-123 123-ABC DEF-123')
select statement brings back
someValue
abc-123
abc-123 ABC-123
abc-123 ABc-123 123-ABC DEF-123
Edit2: If this isn't possible to do in this manner, a possible alternative would be to remove 3 characters and the - character. I've tried the below
select Substring (someValue, Charindex( '-', someValue ) + 1, Len(someValue)) From String_Removal_Example
but this returns the below, which is only affecting the first instance of nnn-.
123
123 ABC-123
123 ABc-123 123-ABC DEF-123
Edit3: The string I need to replace is nnn- for clarification. I was trying for the [aA][bB][cC]- format in case I needed to change it. It will always be 3 characters followed by a "-" Character
You can create a function like this:
(I named it ThreeLetters, but what's in a name...)
CREATE FUNCTION [dbo].[ThreeLetters]
(
@p0 VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @p1 VARCHAR(20) = '%[A-Z][A-Z][A-Z]-%';
DECLARE @Result VARCHAR(MAX) = @p0;
DECLARE @pos INT = PATINDEX(@p1,@Result COLLATE SQL_Latin1_General_CP1_CS_AS);
DECLARE @pos2 INT ;
DECLARE @i INT =0;
WHILE @pos>0 and @i<10
BEGIN
SET @pos2 = CHARINDEX('-',@Result,@pos)-@pos;
SELECT @Result = SUBSTRING(@Result,1,@pos-1)+SUBSTRING(@Result, @pos+@pos2+1, len(@Result));
SET @pos = PATINDEX(@p1,@Result COLLATE SQL_Latin1_General_CP1_CS_AS);
SET @i = @i + 1;
END
RETURN @Result
END
output with your sample data:
someValue
(No column name)
abc-123
123
abc-123 ABC-123
123 123
abc-123 ABc-123 123-ABC DEF-123
123 123 123-ABC 123
see DBFIDDLE
P.S. The check for @i was introduced while debugging. It might not be needed, but I was too lazy to remove it.
P.P.S I am removing ThreeLetters followed by a -, that is why 123-ABC remains unchanged.
Here's an XML trick that'll also work in Sql Server 2014
(In Sql Server 2016 and beyond, the string_split function would be better.)
declare @String_Removal_Example table (
someValue varchar(100) not null
);
insert into @String_Removal_Example (someValue) VALUES
('abc-123')
, ('abc-123 ABC-123')
, ('abc-123 ABC-123 123-Abc DEF-123');
select ca.value as someValue
from @String_Removal_Example t
cross apply
(
select rtrim((
select node.n.value('.','varchar(30)')+' '
from
(
select cast('<x><a>' + replace(replace(t.someValue,
' ', '</n><a>'),
'-', '</a><n>') +
'</n></x>' as xml) as x
) q
cross apply x.nodes ('/x/n') AS node(n)
for xml path('')
)) as value
) ca
where lower(t.someValue) like '%[a-z0-9]-[a-z0-9]%';
GO
| someValue |
| :-------------- |
| 123 |
| 123 123 |
| 123 123 Abc 123 |
db<>fiddle here
Please try the following solution. It is a variation of @LukStorms' solution.
It is based on XML and XQuery. Their data model is based on ordered sequences.
XML allows us to tokenize the input string.
It is looking for 'nnn-' strings and filters them out.
For example, 2nd row is converted to XML like this:
<root>
<r>abc-</r>
<r>123</r>
<r>ABC-</r>
<r>123</r>
</root>
SQL
-- DDL and sample data population, start
declare @String_Removal_Example table (
someValue varchar(100) not null
);
insert into @String_Removal_Example (someValue) VALUES
('abc-123')
, ('abc-123 ABC-123')
, ('abc-123 ABC-123 123-Abc DEF-123');
-- DDL and sample data population, end
SELECT someValue AS [before], [after]
FROM @String_Removal_Example
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' +
REPLACE(REPLACE(someValue, '-', '- '), SPACE(1), ']]></r><r><![CDATA[') +
']]></r></root>' AS XML)
.query('data(/root/r[not(contains(text()[1],"-"))])')
.value('.','VARCHAR(255)')) AS t1(after);
Output
+---------------------------------+-----------------+
| before | after |
+---------------------------------+-----------------+
| abc-123 | 123 |
| abc-123 ABC-123 | 123 123 |
| abc-123 ABC-123 123-Abc DEF-123 | 123 123 Abc 123 |
+---------------------------------+-----------------+
This basically splits the string on spaces, looks for "words" that have a '-' in the 4th position, removes the prefix, and then stitches the result back together (note that STRING_SPLIT requires SQL Server 2016 or later):
SELECT s.someValue,
STUFF((SELECT ' ' + SUBSTRING(value, 5, LEN(value) - 4)
FROM STRING_SPLIT(someValue, ' ')
WHERE CHARINDEX('-', value) = 4
FOR XML PATH('')),1,1,'') a
FROM String_Removal_Example s
As there is no regex replace in SQL Server, I think the only way you can reliably use a pattern replace is through an iterative process. One such way is with a recursive CTE:
WITH Data AS
( SELECT SomeValue
FROM (VALUES
('abc-123'),
('abc-123 ABC-123'),
('abc-123 aBc-123 abC-abc Def-123'),
('abc-123 ABCD-123'),
('abc-123 A$C-123')
) x (SomeValue)
), RegexReplace AS
( SELECT SomeValue, NewValue = CONVERT(VARCHAR(100), Data.SomeValue), Level = 1
FROM Data
UNION ALL
SELECT SomeValue, CONVERT(VARCHAR(100), STUFF(NewValue, PATINDEX('% [A-z][A-z][A-z]-%', CONCAT(' ', NewValue)), 4, '')), Level + 1
FROM RegexReplace AS d
WHERE PATINDEX('% [A-z][A-z][A-z]-%', CONCAT(' ', NewValue)) > 0
), RankedData AS
( SELECT *, RowNumber = ROW_NUMBER() OVER(PARTITION BY rr.SomeValue ORDER BY rr.Level DESC)
FROM RegexReplace AS rr
)
SELECT rd.SomeValue, rd.NewValue
FROM RankedData AS rd
WHERE rd.RowNumber = 1;
Which gives:
SomeValue                       | NewValue
--------------------------------+-----------------
abc-123                         | 123
abc-123 ABC-123                 | 123 123
abc-123 aBc-123 abC-abc Def-123 | 123 123 abc 123
abc-123 A$C-123                 | 123 A$C-123
abc-123 ABCD-123                | 123 ABCD-123
This basically uses PATINDEX() to find the first instance of the pattern, then STUFF() to remove the 4 characters after the index found. This process then repeats until the PATINDEX() fails to find a match.
Probably also of note is that when doing the PATINDEX() I am adding a space to the start of the string, and including a space at the start of the pattern. This ensures that it is limited to only 3 characters followed by a hyphen, if it is more than 3 (e.g. ABCD-123) the entire string is left intact rather than leaving A123 as shown in the last row above. Adding the space to the start of the string being searched ensures that the pattern is still found even if it is at the start of the string.
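To make a single iteration of that loop easier to follow, here is a small sketch of just one PATINDEX()/STUFF() step on one value. The variable names @s and @pos and the column aliases are made up, and the comments assume a typical case-insensitive collation in which [A-z] matches letters only:

```sql
DECLARE @s VARCHAR(100) = 'abc-123 ABCD-123';

-- Pad the searched string with a leading space so a match at the very start is
-- still found; the position returned then lines up with the unpadded string.
DECLARE @pos INT = PATINDEX('% [A-z][A-z][A-z]-%', CONCAT(' ', @s));

SELECT @pos AS MatchPos,                          -- expected: 1
       STUFF(@s, @pos, 4, '') AS AfterOneStep,    -- expected: 123 ABCD-123
       -- Searching again finds nothing: ABCD- is four letters, so the
       -- "space, three letters, hyphen" pattern no longer matches.
       PATINDEX('% [A-z][A-z][A-z]-%',
                CONCAT(' ', STUFF(@s, @pos, 4, ''))) AS NextMatchPos;  -- expected: 0
```

The recursive CTE simply repeats this step, feeding AfterOneStep back in until the match position is 0.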
SELECT replaceAll(column, '[aA][bB][cC]-', '')
FROM table
where column like '%[aA][bB][cC]-%'

XQuery SQL XML values

We have to find a query parsing XML to have a result like this one:
COL_ACTION N COL_NAME_I COL_VALUE_AFTER
---------- ----------- ---------------- -------------------
INS 1 N 1
INS 1 TST_ID 28
INS 1 TST_DATA data2
INS 2 N 2
INS 2 TST_ID 27
INS 2 TST_DATA data1
The value of column N depends on the value of the first "row" (COL_NAME_I = N) of the XML dataset.
The XML contains:
DECLARE @XML XML =
N'
<DATASET>
<XML_INS>
<IROW xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<N>1</N>
<TST_ID>28</TST_ID>
<TST_DATA>data2</TST_DATA>
</IROW>
<IROW xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<N>2</N>
<TST_ID>27</TST_ID>
<TST_DATA>data1</TST_DATA>
</IROW>
</XML_INS>
</DATASET>
';
The query I was able to do is:
SELECT RIGHT(ca.value('local-name(.)','CHAR(7)'), 3) AS COL_ACTION,
x.value('(//N)[1]', 'int') AS N,
-- cn.value('./text()[1]','int') AS NI,
ci.value('local-name(.)','sysname') AS COL_NAME_I,
ci.value('./text()[1]','nvarchar(max)') AS COL_VALUE_AFTER
FROM @XML.nodes('.') AS T(x)
OUTER APPLY @XML.nodes('DATASET/*') AS TA(ca)
OUTER APPLY @XML.nodes('DATASET/XML_INS/IROW/N/*') AS TN(cn)
OUTER APPLY @XML.nodes('DATASET/XML_INS/IROW/*') AS TI(ci);
The problem remains with the "N" column, which does not have the expected value:
COL_ACTION N COL_NAME_I COL_VALUE_AFTER
---------- ----------- ---------------- -------------------
INS 1 N 1
INS 1 TST_ID 28
INS 1 TST_DATA data2
INS 1 N 2
INS 1 TST_ID 27
INS 1 TST_DATA data1
I tried different approaches, as seen in the SQL comment, but they do not produce the right result...
You need each .nodes to refer to the previous, in order to only bring back its children. Otherwise you get every descendant node for the whole document.
SELECT RIGHT(action.value('local-name(.)','CHAR(7)'), 3) AS COL_ACTION,
irow.value('(N/text())[1]', 'int') AS N,
ci.value('local-name(.)','sysname') AS COL_NAME_I,
ci.value('text()[1]','nvarchar(max)') AS COL_VALUE_AFTER
FROM @XML.nodes('DATASET/*') AS x1(action)
OUTER APPLY x1.action.nodes('IROW') AS x2(irow)
OUTER APPLY x2.irow.nodes('*') AS TI(ci);
db<>fiddle
This should do it:
SELECT
irow.value('local-name(..)', 'NVARCHAR(100)') AS COL_ACTION,
irow.value('(./N)[1]', 'NVARCHAR(100)') AS N,
child.value('local-name(.)', 'NVARCHAR(100)') AS COL_NAME_I,
child.value('.', 'NVARCHAR(100)') AS COL_VALUE_AFTER
FROM @XML.nodes('/DATASET/*/IROW') AS n1(irow)
CROSS APPLY irow.nodes('./*') AS n2(child)
I am a new contributor, so sorry if I got something wrong.
I used this approach at work: Link
My sample working code:
CREATE TABLE #FieldPermissionsTable
(
TE_ID VARCHAR(MAX),
Value VARCHAR(MAX),
Value2 VARCHAR(MAX),
NAVI_USER VARCHAR(MAX)
)
--SET @FieldPermissionAccessXML = '<Row_ID><Elements ID="1" Value="1" Value2="NULLnROLLA" Navi_User="testuser" /><Elements ID="3" Value="tespit" Value2="NULLnROLLA" Navi_User="testuser" /><Elements ID="6" Value="aciklama" Value2="NULLnROLLA" Navi_User="testuser" /></Row_ID>'
-- SELECT @FieldPermissionAccessXML = [dbo].[to_xml_replace] (@FieldPermissionAccessXML)
EXEC sp_xml_preparedocument @XmlHandle output,@FieldPermissionAccessXML
--'<FieldPermissions>
--<FieldPermission FieldName="" RoleID="1" Access="1" />
--<FieldPermission FieldName="" RoleID="2" Access="1" />'
--select * from JMP_FieldPermissions
INSERT INTO #FieldPermissionsTable
SELECT *--ID,Value, Value2, Navi_User
FROM OPENXML (@XmlHandle, './Row_ID/Elements')
WITH (TE_ID VARCHAR(MAX) '@ID',
Value VARCHAR(MAX) '@Value',
Value2 VARCHAR(MAX) '@Value2',
NAVI_USER VARCHAR(MAX) '@Navi_User'
)

SQL Script to Break up XML into Columns

I have a table that stores audit history in an XML format. Each document has its own history. How can I parse out the XML data per document, so that each column in the XML becomes an actual column together with the action that took place on it?
Example:
<auditElement><field id="2881159" type="5" name="Responsiveness" formatstring=""><unSetChoice>2881167</unSetChoice><setChoice>2881166</setChoice></field></auditElement>
unSetChoice and setChoice are the columns.
The name attribute represents the action.
Xml can be parsed using features such as XQuery, or OpenXml.
Here's an example of xquery:
declare @xml as xml = '<auditElement><field id="2881159" type="5" name="Responsiveness"' +
' formatstring=""><unSetChoice>2881167</unSetChoice>' +
'<setChoice>2881166</setChoice></field></auditElement>';
SELECT
Nodes.node.value('(field/@id)[1]', 'INT') AS FieldId,
Nodes.node.value('(field/@name)[1]', 'varchar(50)') AS FieldName,
Nodes.node.value('(field/unSetChoice/text())[1]', 'INT') AS OldValue,
Nodes.node.value('(field/setChoice/text())[1]', 'INT') AS NewValue
FROM
@xml.nodes('//auditElement') AS Nodes(node);
Result:
FieldId FieldName OldValue NewValue
----------- -------------------------------------------------- ----------- -----------
2881159 Responsiveness 2881167 2881166
You can use OUTER APPLY and then break down the parts of the XML field.
The value() method takes XQuery expressions.
For example:
DECLARE @T TABLE (Id int identity(1,1) primary key, XmlCol1 xml);
insert into @T (XmlCol1) values
('<auditElement><field id="2881159" type="5" name="Responsiveness" formatstring=""><unSetChoice>2881167</unSetChoice><setChoice>2881166</setChoice></field></auditElement>'),
('<auditElement><field id="2881160" type="6" name="Responsiveness" ><unSetChoice>2881187</unSetChoice><setChoice>2881188</setChoice></field></auditElement>');
select *
from (
select
Id,
a.field.value('@id', 'int') as xmlid,
a.field.value('(unSetChoice)[1]', 'int') as unSetChoice,
a.field.value('(setChoice)[1]', 'int') as setChoice,
a.field.value('@type', 'int') as type,
a.field.value('@name', 'varchar(max)') as name,
a.field.value('@formatstring', 'varchar(max)') as formatstring
from @T t
outer apply t.XmlCol1.nodes('/auditElement/field') as a(field)
where a.field.value('@id', 'int') > 0 and Id > 0
) q
where name = 'Responsiveness';
Result:
Id xmlid unSetChoice setChoice type name formatstring
1 2881159 2881167 2881166 5 Responsiveness
2 2881160 2881187 2881188 6 Responsiveness NULL

How to extract xml string to make new columns in sql using XQuery/XPath?

Hi, here is my code so far. I'm not getting the correct result with it:
Declare @txt varchar(max) = (
select audit_val from AuditTable
where audit_val like '%update_dt%'
and audit_val like '%PEK150700019-001%')
Declare @xml XML = CONVERT(XML,@txt)
SELECT
@xml.value('(/i/@guest_id)[1]', 'varchar(100)') as guest_id
, @xml.value('(/i/@updated_by)[1]', 'varchar(100)') as updated_by
, @xml.value('(/i/@update_dt)[1]', 'datetime') as update_dt
What I am getting is null values, and it says the query returned more than one value.
Using SELECT TOP 1 works, but I need to extract the required strings from that column for all table rows. Can someone help me fix this? Thanks in advance :)
Note: here's a sample output:
=================================================================
|guest_id | updated_by | update_dt |
|==================|==================|=========================|
|PEK150700019-001 | sherwin | 2015-08-11T13:59:09.550|
|------------------|------------------|-------------------------|
This is one possible way :
declare @auditTable TABLE(audit_val varchar(1000))
insert into @auditTable values('<i guest_id="PEK150700019-001" updated_by="sherwin" update_dt="2015-08-11T13:59:09.550" place_birth="SHANGHAI"/>')
select
(CONVERT(XML,audit_val)).value('(/i/@guest_id)[1]', 'varchar(100)') as guest_id
, (CONVERT(XML,audit_val)).value('(/i/@updated_by)[1]', 'varchar(100)') as updated_by
, (CONVERT(XML,audit_val)).value('(/i/@update_dt)[1]', 'datetime') as update_dt
from @auditTable
where audit_val like '%update_dt%'
and audit_val like '%PEK150700019-001%'
Sqlfiddle Demo
Use the XML data type in the first place for storing XML data. That way you avoid casting varchar to XML multiple times, which may improve query performance.
output :
| guest_id | updated_by | update_dt |
|------------------|------------|--------------------------|
| PEK150700019-001 | sherwin | August, 11 2015 13:59:09 |
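As a rough sketch of the XML-column suggestion above: if audit_val were stored as XML, the LIKE filter on update_dt could be expressed as an XPath predicate, and every matching row would be returned, not only the first. The table variable, the second guest_id value and the aliases t, i and n are made up for illustration:

```sql
-- Hypothetical table variable with an XML-typed column.
DECLARE @AuditTable TABLE (audit_val XML);

INSERT INTO @AuditTable (audit_val) VALUES
  (N'<i guest_id="PEK150700019-001" updated_by="sherwin" update_dt="2015-08-11T13:59:09.550"/>')
, (N'<i guest_id="PEK150700019-002" updated_by="sherwin"/>');  -- made-up row without update_dt

SELECT i.n.value('@guest_id', 'varchar(100)') AS guest_id
     , i.n.value('@updated_by', 'varchar(100)') AS updated_by
     , i.n.value('@update_dt', 'datetime') AS update_dt
FROM @AuditTable AS t
-- The [@update_dt] predicate keeps only <i> elements that carry that attribute.
CROSS APPLY t.audit_val.nodes('/i[@update_dt]') AS i(n);
```

With these two sample rows, only the first should come back, because the second has no update_dt attribute.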