Why does SQL display an & as &amp;? [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
help with FOR XML PATH('') escaping “special” characters
I need some assistance, my query is below:
STUFF(
(
SELECT ',' + CountDesc
FROM Count INNER JOIN ProjectCount ON Count.Id = ProjectCount.CountId
WHERE ProjectCount.ProjectId = Project.Id ORDER BY Count.CountDesc
FOR XML PATH('')
), 1, 1, '') as [Country]
When I run this query and the Count table has an & in one of its fields, the result shows the & as &amp;.
Is there any way to prevent this?
Thanks in advance.

This happens because the strings being combined by FOR XML contain characters that are special in XML, so they get entity-encoded. In addition to &, this also affects < and >, and probably other characters.
I usually fix this by doing a replace after the call:
select @str = replace(@str, '&amp;', '&')
and nesting further replaces for the additional characters (&lt; and &gt;).

Per Section 2.4 of the XML spec, & must be escaped except in a few special cases (e.g. within a comment or CDATA section). If the & weren't displayed as &amp;, the XML would be invalid.
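The nested replaces suggested above have one ordering pitfall: &amp; has to be decoded last, otherwise text that was itself stored as an entity gets decoded twice. A minimal Python sketch of the idea follows (not T-SQL; the function names are made up for illustration, and only the three entities discussed here are handled):

```python
from xml.sax.saxutils import unescape

def decode_entities(text):
    """Undo XML escaping with chained replaces, decoding &amp; last."""
    return (text.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&amp;", "&"))

def decode_entities_wrong_order(text):
    """Same replaces, but &amp; first: doubly escaped text is over-decoded."""
    return (text.replace("&amp;", "&")
                .replace("&lt;", "<")
                .replace("&gt;", ">"))

# The second item stands for stored text that literally reads "a &lt; b".
escaped = "Trinidad &amp; Tobago,a &amp;lt; b"
print(decode_entities(escaped))              # keeps the literal "&lt;"
print(decode_entities_wrong_order(escaped))  # turns it into "<"
print(unescape(escaped))                     # standard library, same order as decode_entities
```

The same ordering applies to nested REPLACE calls in SQL: the replace of &amp; should be the outermost one.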

Related

In SQL Server, how can I identify "double" strings and correct?

How can I find strings in a column that are doubled-up and correct them? I feel like there is an easy answer to this I just can't think of it.
Example:
I want to find instances of a repeating string, example "SolonSolon", and then update the column to "Solon".
Update:
They're always the same. No extra characters, but might have a space as part of the repeating value. Other examples would be...
"PlacePlace", "TreeTree", "OrangeOrange", "TravisMemorialHSTravisMemorialHS", "Texas HSTexas HS"
You can check if the string is equal to the first half replicated.
SELECT LEFT(YourCol,LEN(REPLACE(YourCol, ' ', 'x'))/2)
FROM YourTable
WHERE YourCol = REPLICATE(LEFT(YourCol,LEN(REPLACE(YourCol, ' ', 'x'))/2),2)
The spaces are replaced with x before calculating the LEN because that function ignores trailing spaces. You can also use the technique in @lptr's answer for this, but it has an edge case: if the string is varchar(8000) and already 8000 characters long, concatenating an extra character does nothing (LEN(SPACE(8000) + 'x') is 0).
Replace the first half of the value with an empty string. If nothing is left, the value consists of two equal parts. (One caveat: an odd-length run of a single character, such as 'aaa', also passes this test.)
select *, substring(c, 1, (len(c+'.')-1)/2)
from
(
values
('solosolo'), ('yoyo'), ('andand'), ('1212'),(' . .'),
('ababc'), ('onetwoone')
) as t(c)
where replace(c, substring(c, 1, (len(c+'.')-1)/2), '') = '';
Another alternative. The query removes inner spaces using REPLACE(str_col, ' ', ''), removes leading/trailing spaces using TRIM, and checks that the first half of the resulting string equals the second half.
select left(no_spaces.str_col, v.str_len/2)
from foo f
cross apply (values (replace(trim(f.str_col), ' ', ''))) no_spaces(str_col)
cross apply (values (len(no_spaces.str_col))) v(str_len)
where no_spaces.str_col=replicate(left(no_spaces.str_col, v.str_len/2), 2);
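The first-half comparison that these answers rely on can be sketched outside SQL as well. The Python below is only an illustration of the logic (the function name is hypothetical); unlike T-SQL's LEN, Python's len counts trailing spaces, so no REPLACE workaround is needed here:

```python
def undouble(value):
    """Return the first half if value is that half repeated twice, else None."""
    half, remainder = divmod(len(value), 2)
    if value == "" or remainder:
        # Empty and odd-length strings cannot be two equal halves.
        return None
    first = value[:half]
    return first if value == first * 2 else None

for candidate in ["SolonSolon", "Texas HSTexas HS", "ababc", "aaa"]:
    print(repr(candidate), "->", repr(undouble(candidate)))
```

Checking the length parity first is what rules out values such as 'aaa'.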

Remove all spaces and combine multiple lines to single line in SQL

What is the best way to remove all spaces from a string in SQL Server 2014?
My string is:
Maximize your productivity for building engaging,
beautiful web mapping applications
I am trying to remove the line breaks and tabs inside the string and leave a single space between words. The result should look like:
Maximize your productivity for building engaging, beautiful web mapping applications
You can use replace(), but it is a bit tricky:
select replace(replace(replace(replace(replace(replace(col, ' ', '<>'),
char(13), '<>'), -- carriage return
char(10), '<>'), -- line feed
char(9), '<>'), -- tab
'><', ''),
'<>', ' ')
from t;
The idea is that each space is replaced with <>, then every >< is removed, which leaves a single <> per run of spaces. For instance:
a   b c -- original data with three spaces and one space
a<><><>b<>c -- after replacing spaces with <>
a<>b<>c -- after removing ><
a b c -- after replacing <> with a space
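The same marker trick can be traced step by step in a few lines of Python. This is only a sketch of the technique, with a made-up function name, and it assumes the text itself contains no < or > characters (those would be mangled by the marker removal):

```python
def collapse_whitespace(text):
    """Collapse runs of spaces, tabs and line breaks into one space each."""
    marked = text
    for ch in (" ", "\r", "\n", "\t"):
        # Every whitespace character becomes the two-character marker "<>".
        marked = marked.replace(ch, "<>")
    # Adjacent markers read "<><>"; deleting each "><" merges them into one "<>".
    merged = marked.replace("><", "")
    return merged.replace("<>", " ")

sample = "Maximize your productivity for building engaging,\n\tbeautiful web mapping applications"
print(collapse_whitespace(sample))
```

Leading and trailing whitespace is reduced to a single space rather than removed, which is why the UDF version below additionally wraps the result in LTrim(RTrim(...)).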
If open to a UDF, the following will remove all control characters and repeating spaces.
The removal of the repeating spaces was inspired (OK stolen) from Gordon's answer a few months ago.
Example
Declare @S varchar(max) = 'Maximize your productivity for building engaging,
beautiful web mapping applications'
Select [dbo].[svf-Str-Strip-Control](@S)
Returns
Maximize your productivity for building engaging, beautiful web mapping applications
The UDF if Interested
CREATE FUNCTION [dbo].[svf-Str-Strip-Control](@S varchar(max))
Returns varchar(max)
Begin
Select @S=Replace(@S,char(n),' ')
From (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31) ) N(n)
Return LTrim(RTrim(Replace(Replace(Replace(@S,' ','><'),'<>',''),'><',' ')))
End
--Select [dbo].[svf-Str-Strip-Control]('Michael '+char(13)+char(10)+'LastName') --Returns: Michael LastName

Find the total word count in each row of a table SQL [duplicate]

This question already has answers here:
Using SQL to determine word count stats of a text field
(5 answers)
Closed 8 years ago.
I have a table with 2 columns, ID and comment. Is it possible to find the total word count of the comment column for each row,
and then find the TOP 10 word counts? I have been trying and failing all afternoon. Any help would be hugely appreciated. If you would like any more info, please ask.
If you just need to do this quickly, you could try the query below. Note that it just crudely uses a space for word boundaries.
SELECT TOP(10)(LEN(comment) - LEN(REPLACE(comment, ' ', '')) + 1)
FROM tblComments
ORDER BY (LEN(comment) - LEN(REPLACE(comment, ' ', ''))) DESC
However, please note that this query isn't a particularly efficient solution and I would only use it if it was an ad-hoc problem I was trying to solve. If performance is an issue, such as the query needing to be built into a live transactional application of some sort, I suggest that a better approach would be to use some combination of a third column to store the word count and/or doing the word count in code. Doing so will also provide a better separation of logic and data storage, as well as giving you more flexibility in how words are recognized.
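To make the "crude" part concrete, here is a small Python sketch comparing the length-difference trick with a whitespace-aware split, which is one way to do the counting in application code. The function names and sample comments are invented for illustration:

```python
def word_count_by_spaces(comment):
    """Length-difference trick: number of spaces plus one."""
    return len(comment) - len(comment.replace(" ", "")) + 1

def word_count_by_split(comment):
    """Split on any run of whitespace, so repeated spaces are not overcounted."""
    return len(comment.split())

# Hypothetical rows: id -> comment
comments = {
    1: "short one",
    2: "a  double  space",
    3: "the longest comment of all",
}

# Ids of the (up to) ten comments with the most words, largest first.
top = sorted(comments, key=lambda k: word_count_by_split(comments[k]), reverse=True)[:10]
print(top)
```

With single spaces both functions agree; the comment with doubled spaces is where the space-counting version overcounts.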
Try using the DATALENGTH() function in SQL Server:
DECLARE @String VARCHAR(100)
,@CharToFind VARCHAR(1)
SET @String = 'AAAA BBBCB NNNNN NEEEEE ERERERERERE'
SET @CharToFind = ' '
SELECT DATALENGTH(@String)
-- number of spaces + 1; DATALENGTH counts bytes, so this assumes varchar data
SELECT CountOfWordsInTheString = DATALENGTH(@String) - DATALENGTH(REPLACE(@String,@CharToFind,'')) + 1
Your query would look like the below:
SELECT TOP 10 id, DATALENGTH(comment) - DATALENGTH(REPLACE(comment, ' ', '')) + 1
FROM tblname
ORDER BY DATALENGTH(comment) - DATALENGTH(REPLACE(comment, ' ', '')) + 1 DESC

SQL Query - Concatenating Results into One String [duplicate]

This question already has answers here:
How to concatenate text from multiple rows into a single text string in SQL Server
(47 answers)
Closed 7 years ago.
I have a sql function that includes this code:
DECLARE @CodeNameString varchar(100)
SELECT CodeName FROM AccountCodes ORDER BY Sort
I need to concatenate all results from the select query into CodeNameString.
Obviously a FOREACH loop in C# code would do this, but how do I do it in SQL?
If you're on SQL Server 2005 or up, you can use this FOR XML PATH & STUFF trick:
DECLARE @CodeNameString varchar(100)
SELECT
@CodeNameString = STUFF( (SELECT ',' + CodeName
FROM dbo.AccountCodes
ORDER BY Sort
FOR XML PATH('')),
1, 1, '')
The FOR XML PATH('') basically concatenates your strings together into one long XML result (something like ,code1,code2,code3 etc.), and the STUFF replaces the first character with an empty string, i.e. wipes out the "superfluous" first comma, to give you the result you're probably looking for.
UPDATE: OK - I understand the comments - if your text in the database table already contains characters like <, > or &, then my current solution will in fact encode those into &lt;, &gt;, and &amp;.
If you have a problem with that XML encoding - then yes, you must look at the solution proposed by @KM which works for those characters, too. One word of warning from me: this approach is a lot more resource and processing intensive - just so you know.
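As a rough model of what the STUFF / FOR XML PATH('') combination does, including the entity encoding mentioned in the update, consider this Python sketch. Both functions are simplified stand-ins written for illustration, not the SQL Server implementations (for example, the real STUFF returns NULL when the start position is out of range):

```python
from xml.sax.saxutils import escape

def for_xml_path_concat(values, separator=","):
    """Model of FOR XML PATH(''): prefix each value, entity-encode, concatenate."""
    return "".join(escape(separator + value) for value in values)

def stuff(text, start, length, replacement):
    """Simplified STUFF: replace `length` characters from 1-based `start`."""
    return text[:start - 1] + replacement + text[start - 1 + length:]

codes = ["R&D", "Sales", "A<B"]            # hypothetical CodeName values
joined = for_xml_path_concat(codes)        # leading comma, & and < encoded
result = stuff(joined, 1, 1, "")           # leading comma removed
print(joined)
print(result)
```

The encoded & and < survive into the final string, which is the behaviour the comments on this answer were pointing out.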
DECLARE @CodeNameString varchar(max)
SET @CodeNameString=''
SELECT @CodeNameString=@CodeNameString+CodeName FROM AccountCodes ORDER BY Sort
SELECT @CodeNameString
@AlexanderMP's answer is correct, but you can also consider handling nulls with coalesce:
declare @CodeNameString nvarchar(max)
set @CodeNameString = null
SELECT @CodeNameString = Coalesce(@CodeNameString + ', ', '') + cast(CodeName as varchar) from AccountCodes
select @CodeNameString
For SQL Server 2005 and above, use COALESCE to handle nulls; I also use CAST or CONVERT in case there are numeric values:
declare @CodeNameString nvarchar(max)
select @CodeNameString = COALESCE(@CodeNameString + ',', '') + Cast(CodeName as varchar) from AccountCodes ORDER BY Sort
select @CodeNameString
From MSDN: "Do not use a variable in a SELECT statement to concatenate values (that is, to compute aggregate values). Unexpected query results may occur. This is because all expressions in the SELECT list (including assignments) are not guaranteed to be executed exactly once for each output row."
The above seems to say that concatenation as done in the previous answers is not reliable, as the assignment is not guaranteed to run exactly once per row returned by the select.
Here is another real life example that works fine at least with 2008 release (and later).
This is the original query which uses simple max() to get at least one of the values:
SELECT option_name, Field_M3_name, max(Option_value) AS "Option value", max(Sorting) AS "Sorted"
FROM Value_list group by Option_name, Field_M3_name
ORDER BY option_name, Field_M3_name
Improved version, where the main improvement is that we show all values comma separated:
SELECT from1.keys, from1.option_name, from1.Field_M3_name,
Stuff((SELECT DISTINCT ', ' + [Option_value] FROM Value_list from2
WHERE COALESCE(from2.Option_name,'') + '|' + COALESCE(from2.Field_M3_name,'') = from1.keys FOR XML PATH(''),TYPE)
.value('text()[1]','nvarchar(max)'),1,2,N'') AS "Option values",
Stuff((SELECT DISTINCT ', ' + CAST([Sorting] AS VARCHAR) FROM Value_list from2
WHERE COALESCE(from2.Option_name,'') + '|' + COALESCE(from2.Field_M3_name,'') = from1.keys FOR XML PATH(''),TYPE)
.value('text()[1]','nvarchar(max)'),1,2,N'') AS "Sorting"
FROM ((SELECT DISTINCT COALESCE(Option_name,'') + '|' + COALESCE(Field_M3_name,'') AS keys, Option_name, Field_M3_name FROM Value_list)
-- WHERE
) from1
ORDER BY keys
Note that we have solved all possible NULL case issues that I can think of and also we fixed an error that we got for numeric values (field Sorting).

Use ampersand in CAST in SQL

The following code snippet on SQL server 2005 fails on the ampersand '&':
select cast('<name>Spolsky & Atwood</name>' as xml)
Does anyone know a workaround?
Longer explanation, I need to update some data in an XML column, and I'm using a search & replace type hack by casting the XML value to a varchar, doing the replace and updating the XML column with this cast.
select cast('<name>Spolsky & Atwood</name>' as xml)
A literal ampersand inside an XML tag is not allowed by the XML standard, and such a document will fail to parse by any XML parser.
An XMLSerializer() will output the ampersand HTML-encoded.
The following code:
using System.Xml.Serialization;
namespace xml
{
public class MyData
{
public string name = "Spolsky & Atwood";
}
class Program
{
static void Main(string[] args)
{
new XmlSerializer(typeof(MyData)).Serialize(System.Console.Out, new MyData());
}
}
}
will output the following:
<?xml version="1.0" encoding="utf-8"?>
<MyData
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<name>Spolsky &amp; Atwood</name>
</MyData>
, with an &amp; instead of &.
It's not valid XML. Use &amp;:
select cast('<name>Spolsky &amp; Atwood</name>' as xml)
You'd need to XML escape the text, too.
So let's backtrack and assume you're building that string as:
SELECT '<name>' + MyColumn + '</name>' FROM MyTable
you'd want to do something more like:
SELECT '<name>' + REPLACE( MyColumn, '&', '&amp;' ) + '</name>' FROM MyTable
Of course, you probably should cater for the other entities thus:
SELECT '<name>' + REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( MyColumn, '&', '&amp;' ), '''', '&apos;' ), '"', '&quot;' ), '<', '&lt;' ), '>', '&gt;' ) + '</name>' FROM MyTable
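The effect of escaping before wrapping the text in tags can be checked with any XML parser. Below is a small Python sketch using only the standard library; the helper names are hypothetical, and the five entities are the same ones the nested REPLACE above handles:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def wrap_in_name(value):
    """Escape the five predefined XML entities, then wrap the text in <name>."""
    # escape() handles &, < and > itself (ampersand first); quotes are passed as extras.
    escaped = escape(value, {'"': "&quot;", "'": "&apos;"})
    return "<name>" + escaped + "</name>"

def parses(document):
    """Return True if the string is well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

raw = "<name>Spolsky & Atwood</name>"        # what the failing CAST receives
safe = wrap_in_name("Spolsky & Atwood")      # ampersand escaped first

print(parses(raw), parses(safe))
print(ET.fromstring(safe).text)
```

Note that the ampersand is replaced first, as in the innermost REPLACE of the SQL version; doing it last would re-escape the entities produced by the other replacements.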
When working with XML in SQL you're a lot safer using built-in functions instead of converting it manually.
The following code will build a proper SQL XML variable that looks like your desired output based on a raw string:
DECLARE @ExampleString nvarchar(40)
, @ExampleXml xml
SELECT @ExampleString = N'Spolsky & Atwood'
SELECT @ExampleXml =
(
SELECT @ExampleString AS 'name'
FOR XML PATH (''), TYPE
)
SELECT @ExampleString , @ExampleXml
As John and Quassnoi state, & on its own is not valid. This is because the ampersand character is the start of a character entity, used to specify characters that cannot be represented literally. There are two forms of entity: one specifies the character by name (e.g., &amp; or &quot;), and one specifies the character by its code (I believe it's the code position within the Unicode character set, but I'm not sure; e.g., &#34; should represent a double quote).
Thus, to include a literal & in an HTML document, you must specify its entity: &amp;. Other common ones you may encounter are &lt; for <, &gt; for >, and &quot; for ".