I have a bit of SQL that grabs an XML node whose content is encoded HTML. I convert that node to a varchar, "decode" it, and cast it back to the XML data type. The problem is that when I call the nodes function, SQL Server says that "nodes" is not a valid function, property, or field. The weird thing is that the other functions (query, value, exist, and modify) do not produce this complaint. Any ideas as to why?
Declare @XmlEl As XML
DECLARE @htmlString As varchar(max)
Select @XmlEl = CAST(replace(CAST(Body AS VARCHAR(MAX)), 'utf-16', 'utf-8') AS xml) FROM Templates where TemplateID = 3119
Set @XmlEl = @XmlEl.query('/PdfTemplate/PdfBody').query('string(/)')
Select CAST(@XmlEl As varchar(max))
Set @htmlString = Replace(CAST(@XmlEl As varchar(max)), '&lt;', '<')
Set @XmlEl = CAST(Replace(@htmlString, '&gt;', '>') AS XML)
Select @XmlEl.nodes('p/span')
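You cannot SELECT directly from @XmlEl.nodes like that. Unlike query, value, exist, and modify, the nodes method returns a rowset, so it has to appear in the FROM clause with a table alias and a column alias. I believe you need something more like
Select x.a.query('.') FROM @XmlEl.nodes('p/span') x(a)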
Referenced from MSDN nodes()
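To make the FROM-clause usage concrete, here is a minimal self-contained sketch. The XML literal and the alias names (s, n, SpanText, SpanXml) are made up for illustration and are not from the original question; the shape of the query is the same as the one-liner above.

```sql
-- Hypothetical sample document standing in for the decoded HTML.
DECLARE @XmlEl xml = N'<p><span>first</span><span>second</span></p>';

-- nodes() shreds the XML into one row per matching node, so it lives in
-- the FROM clause; the other xml methods are then called per row.
SELECT s.n.value('.', 'varchar(100)') AS SpanText,  -- text of each span
       s.n.query('.')                 AS SpanXml    -- the span element itself
FROM @XmlEl.nodes('p/span') AS s(n);
```

This should return one row per span element rather than a single XML value.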
Related
I've got a string like AAAA.BBB.CCCC.DDDD.01.A and I'm looking to manipulate this and end up with AAAA-BBB
I've achieved this by writing this debatable piece of code
declare @string varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A'
select replace(substring(@string,0,charindex('.',@string)) + substring(@string,charindex('.',@string,CHARINDEX('.',@string)),charindex('.',@string,CHARINDEX('.',@string)+1)-charindex('.',@string)),'.','-')
Is there any other way to achieve this which is more elegant and readable ?
I was looking at some string_split operations, but can't wrap my head around it.
If you are open to some JSON transformations, the following approach is an option. You need to transform the text into a valid JSON array (AAAA.BBB.CCCC.DDDD.01.A is transformed into ["AAAA","BBB","CCCC","DDDD","01","A"]) and get the required items from this array using JSON_VALUE():
Statement:
DECLARE @string varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A'
SET @string = CONCAT('["', REPLACE(@string, '.', '","'), '"]')
SELECT CONCAT(JSON_VALUE(@string, '$[0]'), '-', JSON_VALUE(@string, '$[1]'))
Result:
AAAA-BBB
Notes: With this approach you can easily access all parts from the input string by index (0-based).
I think this is a little cleaner:
declare @string varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A'
select
replace( -- replace '.' with '-' (A)
substring(@string, 1 -- in the substring of @string starting at 1
,charindex('.', @string -- and going through 1 before the index of '.'(B)
,charindex('.',@string)+1) -- that is after the first index of the first '.'
-1) -- (B)
,'.','-') -- (A)
Depending on what is in your string you might be able to abuse PARSENAME into doing it. Intended for breaking up names like adventureworks.dbo.mytable.mycolumn it works like this:
DECLARE @x as VARCHAR(100) = 'aaaa.bbb.cccc.ddddd'
SELECT CONCAT( PARSENAME(@x,4), '-', PARSENAME(@x,3) )
You could also look at a mix of STUFF to delete the first '.' and replace with '-' then LEFT the result by the index of the next '.' but it's unlikely to be neater than this or Kevin's proposal
Using string split would likely be as unwieldy:
SELECT CONCAT(MAX(CASE WHEN rn = 1 THEN v END), '-', MAX(CASE WHEN rn = 2 THEN v END))
FROM (
SELECT row_number () over (order by (select 0)) rn, value as v
FROM string_split(@x,'.')
) y WHERE rn IN (1,2)
Because the string is split to rows which then need to be numbered in order to filter and pull the parts you want. This also relies on the strings coming out of string split in the order they were in the original string, which MS do not guarantee will be the case
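If you happen to be on SQL Server 2022 or Azure SQL Database, STRING_SPLIT accepts an optional third argument, enable_ordinal, that adds an ordinal column holding each part's 1-based position. That removes the need for ROW_NUMBER and for the ordering assumption. A sketch, assuming one of those versions is available (older versions will reject the third argument):

```sql
DECLARE @x varchar(100) = 'AAAA.BBB.CCCC.DDDD.01.A';

-- The third argument (1) asks STRING_SPLIT to return an "ordinal" column
-- with the position of each part in the original string.
SELECT CONCAT(MAX(CASE WHEN ordinal = 1 THEN value END), '-',
              MAX(CASE WHEN ordinal = 2 THEN value END)) AS result
FROM STRING_SPLIT(@x, '.', 1)
WHERE ordinal IN (1, 2);
```

For the sample string this should give AAAA-BBB, the same as the other answers.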
I have a string column in my table that contains 'Character-separated' data such as this:
"Value|Data|4|Z|11/06/2012"
This data is fed into a 'parser' and deserialised into a particular object. (The details of this aren't relevant and can't be changed)
The structure of my object has changed and now I would like to get rid of some of the 'sections' of data
So I want the previous value to turn into this
"Value|Data|11/06/2012"
I was hoping I might be able to get some help on how I would go about doing this in T-SQL.
The data always has the same number of sections, 'n', and I will want to remove the same sections, 'n-x' and 'n-y', for all rows.
So far I know I need an update statement to update my column value.
I've found various ways of splitting a string but I'm struggling to apply it to my scenario.
In C# I would do
string RemoveSections(string value)
{
    string[] bits = value.Split('|');
    List<string> wantedBits = new List<string>();
    for (var i = 0; i < bits.Length; i++)
    {
        if (i == 2 || i == 3) // positions of the sections I no longer want
        {
            continue;
        }
        wantedBits.Add(bits[i]);
    }
    return string.Join("|", wantedBits);
}
But I'm not sure where to start in SQL. Any help here would be appreciated
Thanks
Ps. I need to run this SQL on SQL Server 2012
Edit: It looks like parsing to xml in some manner could be a popular answer here, however I can't guarantee my string won't have characters such as '<' or '&'
Using NGrams8K you can easily write a nasty fast customized splitter. The logic here is based on DelimitedSplit8K. This will likely outperform even the C# code you posted.
DECLARE @string VARCHAR(8000) = '"Value|Data|4|Z|11/06/2012"',
        @delim CHAR(1) = '|';
SELECT newString =
(
  SELECT SUBSTRING(
           @string, split.pos+1,
           ISNULL(NULLIF(CHARINDEX(@delim,@string,split.pos+1),0),8000)-split.pos)
  FROM
  (
    SELECT ROW_NUMBER() OVER (ORDER BY d.Pos), d.Pos
    FROM
    (
      SELECT 0 UNION ALL
      SELECT ng.position
      FROM samd.ngrams8k(@string,1) AS ng
      WHERE ng.token = @delim
    ) AS d(Pos)
  ) AS split(ItemNumber,Pos)
  WHERE split.ItemNumber IN (1,2,5)
  ORDER BY split.ItemNumber
  FOR XML PATH('')
);
Returns:
newString
----------------------------
"Value|Data|11/06/2012"
Not the most elegant way, but works:
DECLARE @str VARCHAR(100) = 'Value|Data|4|Z|11/06/2012'
SELECT SUBSTRING(@str,1, CHARINDEX('|',@str,CHARINDEX('|',@str,1)+1)-1)
+ SUBSTRING(@str, CHARINDEX('|',@str,CHARINDEX('|',@str,CHARINDEX('|',@str,CHARINDEX('|',@str,1)+1)+1)+1), LEN(@str))
----------------------
Value|Data|11/06/2012
You might try some XQuery:
DECLARE @s VARCHAR(100)='Value|Data|4|Z|11/06/2012';
SELECT CAST('<x>' + REPLACE(@s,'|','</x><x>') + '</x>' AS XML)
.value('concat(/x[1],"|",/x[2],"|",/x[5])','nvarchar(max)');
In short: the value is transformed to XML by some string replacements. Then we use XQuery's concat to join the first, the second, and the fifth element together again.
This version is a bit less efficient but safe with forbidden characters:
SELECT CAST('<x>' + REPLACE((SELECT @s AS [*] FOR XML PATH('')),'|','</x><x>') + '</x>' AS XML)
.value('concat(/x[1],"|",/x[2],"|",/x[5])','nvarchar(max)')
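Since the question asks for an UPDATE over a column, here is a rough sketch of how the safe variant might be applied to a table. The table variable @t and its columns Id and Payload are hypothetical stand-ins for the real table; the second sample row is only there to exercise the '&' and '<' case raised in the question's edit.

```sql
-- Hypothetical table standing in for the real one.
DECLARE @t TABLE (Id int IDENTITY(1,1), Payload varchar(200));
INSERT INTO @t (Payload)
VALUES ('Value|Data|4|Z|11/06/2012'),
       ('A & B|<tag>|1|Y|12/06/2012');

-- FOR XML PATH('') escapes forbidden characters before the pipes are
-- turned into <x> elements; concat() then keeps sections 1, 2 and 5.
UPDATE t
SET Payload = CAST('<x>' + REPLACE((SELECT t.Payload AS [*] FOR XML PATH('')), '|', '</x><x>') + '</x>' AS XML)
              .value('concat(/x[1],"|",/x[2],"|",/x[5])', 'nvarchar(max)')
FROM @t AS t;

SELECT Id, Payload FROM @t;
```

It would be worth running this against a copy of the data first, since the XQuery positions (1, 2, 5) are hard-coded for the five-section layout.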
Just to add a non-xml option for fun:
Edit and Caveat - In case anyone tries this for a different solution and doesn't read the comments...
HABO rightly noted that this is easily broken if any of the columns have a period (".") in them. PARSENAME is dependent on a 4 part naming structure and will return NULL if that is exceeded. This solution will also break if any values ever contain another pipe ("|") or another delimited column is added - the substring in my answer is specifically there as a workaround for the dependency on the 4 part naming. If you are trying to use this solution on, say, a variable with 7 delimited columns, it would need to be reworked or scrapped in favor of one of the other answers here.
DECLARE
@a VARCHAR(100)= 'Value|Data|4|Z|11/06/2012'
SELECT
PARSENAME(REPLACE(SUBSTRING(@a,0,LEN(@a)-CHARINDEX('|',REVERSE(@a))+1),'|','.'),4)+'|'+
PARSENAME(REPLACE(SUBSTRING(@a,0,LEN(@a)-CHARINDEX('|',REVERSE(@a))+1),'|','.'),3)+'|'+
SUBSTRING(@a,LEN(@a)-CHARINDEX('|',REVERSE(@a))+2,LEN(@a))
Here is a quick way to do it.
CREATE FUNCTION [dbo].StringSplitXML
(
    @String VARCHAR(MAX), @Separator CHAR(1)
)
RETURNS @RESULT TABLE(id int identity(1,1),Value VARCHAR(MAX))
AS
BEGIN
    DECLARE @XML XML
    SET @XML = CAST(
        ('<i>' + REPLACE(@String, @Separator, '</i><i>') + '</i>')
        AS XML)
    INSERT INTO @RESULT
    SELECT t.i.value('.', 'VARCHAR(MAX)')
    FROM @XML.nodes('i') AS t(i)
    WHERE t.i.value('.', 'VARCHAR(MAX)') <> ''
    RETURN
END
GO
SELECT * FROM dbo.StringSplitXML( 'Value|Data|4|Z|11/06/2012','|')
WHERE id not in (3,4)
Note that using a UDF will slow things down, so this solution should be considered only if you have a reasonably small data set to work with.
I'm trying to convert a string to rows using T-SQL. I've found some people using XML for this but I'm running into troubles.
The original record:
A newline-separated string of data
New In Progress Left Message On Hold Researching Researching (2nd Level) Researching (3rd Level) Resolved Positive False Positive Security Respond
Using the following statement converts this string into XML:
select
cast('<i>'+REPLACE(convert(varchar(max), list_items), CHAR(13) + CHAR(10),'</i><i>')+'</i>' as xml)
from
field
where
column_name = 'state' and table_name = 'sv_inquiry'
XML string:
<i>Unassigned</i><i>Assigned</i><i>Transferred</i><i>Accepted</i><i>Closed</i><i>Reactivated</i>
Now I would like to convert every 'i' node into a separate row. I've constructed the query below, but I can't get it working in the way that it returns all the rows...
select x.i.value('i[1]', 'varchar(30)')
from (
select cast('<i>'+REPLACE(convert(varchar(max), list_items), CHAR(13) + CHAR(10),'</i><i>')+'</i>' as xml)
from field
where column_name='state' and table_name='sv_inquiry'
) x(i)
This will return
Unassigned
To be clear, when I change 'i[1]' to 'i[2]' it returns 'Assigned'. I've tried '.', but that returns the whole string in a single record...
How about using the nodes method on an XML datatype.
declare @xml xml
set @xml = '<i>Unassigned</i><i>Assigned</i><i>Transferred</i><i>Accepted</i><i>Closed</i><i>Reactivated</i>'
select
t.c.value('.', 'nvarchar(100)') as [Word]
from
@xml.nodes('/i') as t(c)
You can split a string into rows without XML, see for example the fnSplitString function at SQL Server Central.
Here's an example using the nodes() function of the xml type. I'm using a space as the delimiter because SQL Fiddle doesn't play well with line feeds:
select node_column.value('.', 'varchar(max)')
from (
select cast('<i>' + replace(list_items, ' ', '</i><i>') +
'</i>' as xml) xml_value
from field
) f
cross apply
xml_value.nodes('/i') node_table(node_column);
Live example at SQL Fiddle.
Ok, the problem is that there's a merge or join that needs to be done on 2 tables. One has file content stored as an [image] type or varbinary(max), the other has the file content stored as a hex string. If I upload the same content into both tables,
the content as a string (byte array to string) would look like this...
'application/vnd.xfdl;content-encoding="base64-gzip"
H4sIAAAAAAAAC+y9e1fjONI4/H9/Cg173idwFgIJl+5m6MzPJAayE+KsnXQPs8+cHJMY8HZi57ET
aObMh3918UW2Jcdyrmbg7E7HtqpUpSqVSqWSdPHLj/EIPBuOa9rWl51K+WgHGNbAHprW45edpqYc
fPp0+vmgsvNL7cPFb1eNFoDlLffLztN0Ojk/PHx5eSl3Zo4hDx+N8sAeH6Iyh2fl0x1S8Hwwc6f2'
...
the content as image looks like (and this is ultimately what I want it to look like)
0x6170706C69636174696F6E
If I do select convert(varbinary(MAX), @contentAsString) I get 0x6100700070006C00690063006100740069006F006E
It appears the conversion is on target, but it puts two zeros (00) after each "byte", for lack of a better word.
I've tried all sorts of more complicated methods posted across forums but to no avail.
Any help would be appreciated.
From MSDN
In SQL Server 2008, these conversions are even more easier since we
added support directly in the CONVERT built-in function. The code
samples below show how to perform the conversion(s):
declare @hexstring varchar(max);
set @hexstring = '0xabcedf012439';
select CONVERT(varbinary(max), @hexstring, 1);
set @hexstring = 'abcedf012439';
select CONVERT(varbinary(max), @hexstring, 2);
go
declare @hexbin varbinary(max);
set @hexbin = 0xabcedf012439;
select
CONVERT(varchar(max), @hexbin, 1),
CONVERT(varchar(max), @hexbin, 2);
go
Ok, so the padded 00 has been answered.
DECLARE @hexStringNVar nvarchar(max)
DECLARE @hexStringVAR varchar(max)
SET @hexStringNVar = '{my hex string as described above}'
SET @hexStringVAR = '{my hex string as described above}'
select CONVERT(varbinary(MAX), @hexStringNVar) -- returns 0x6100700070006C00690063...
select CONVERT(varbinary(MAX), @hexStringVAR) -- returns 0x6170706C6963...
The 00 padding appears because NVARCHAR stores each character as two bytes (Unicode), whereas VARCHAR stores these characters as one byte each.
So, since the stored data is in nvarchar(max), the solution is this:
select CAST(cast(@hexStringNVar as varchar(max)) as varbinary(max)) -- returns 0x6170706C6963...
I'm sure that convert would work just as well but my target SQL Server is 2005.
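For anyone who wants to see the padding in isolation, here is a small sketch using the first word of the content from the question. The variable names @n and @v are made up for the example; the varbinary values in the comments follow from the two-byte versus one-byte storage described above.

```sql
DECLARE @n nvarchar(20) = N'application';  -- two bytes per character
DECLARE @v varchar(20)  = 'application';   -- one byte per character here

SELECT CONVERT(varbinary(max), @n) AS FromNvarchar,  -- 0x6100700070006C00690063006100740069006F006E00
       CONVERT(varbinary(max), @v) AS FromVarchar,   -- 0x6170706C69636174696F6E
       -- Converting to varchar first drops the extra 00 bytes.
       CONVERT(varbinary(max), CONVERT(varchar(20), @n)) AS NvarcharViaVarchar;
```

Note that going through varchar is only lossless when every character fits in the database's code page, which holds for the base64 and header text in this case.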
I know how to select a value from an XML field using xpath and defining namespaces, but I need to use several xpath queries and assign them to my selection. Is there an easier way than doing the following:
SELECT
id, name,
[XML].value('declare namespace test="http://www.test.org/xml/";
declare namespace test2="http://www.test2.org";
(//test:Address[1][test2:Global=1]/test:Street)[1] ', 'varchar(max)') AS streetLocation1,
[XML].value('declare namespace test="http://www.test.org/xml/";
declare namespace test2="http://www.test2.org";
(//test:Address[2][test2:Global=1]/test:Street)[1] ', 'varchar(max)') AS streetLocation2
FROM
TEST
I want to replace
'declare namespace test="http://www.test.org/xml/";
declare namespace test2="http://www.test2.org";'
by using a variable. I tried to append strings but I got the following:
The argument 1 of the XML data type method "value" must be a string literal.
There has to be an easier way.
Thanks,
-James
Thanks @MikaelEriksson,
In case anyone has a similar issue, here is the answer.
;WITH XMLNAMESPACES ('http://www.test.org/xml/' as test, 'http://www.test2.org' as test2)
SELECT id,
name,
[XML].value('(//test:Address[1][test2:Global=1]/test:Street)[1] ', 'varchar(max)') AS streetLocation1,
[XML].value('(//test:Address[2][test2:Global=1]/test:Street)[1] ', 'varchar(max)') AS streetLocation2
FROM TEST