Are SQL strings null terminated? - sql

I was learning how to use the len() function. When I measured the length of a cell containing 12 characters, it returned 12. Now I am wondering: aren't SQL strings null terminated? If they were, len() should have returned 13, not 12.
Please help me out.
Thanks

Well, first: the len function does not depend on null termination. Programming languages that do not use null termination ALSO have a len function, and it works.
Thus, a len function in SQL will give you the length of the string AS THE SERVER STORES IT - what do you care how that works?
Actually it will likely not be null terminated, as this would make it hard to split a string over multiple database pages. And even if it were, this would be seriously implementation dependent (and you don't say which product you mean; the SQL language says nothing about how the server internally stores strings).
So, in the end, your question is totally irrelevant. All that is relevant is that the len function implementation is compatible with the internal storage.
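To see that no terminator is counted, here is a minimal sketch assuming SQL Server (T-SQL); the question does not name a product, and the variable names are made up for illustration. LEN counts characters, DATALENGTH counts stored bytes, and neither includes an extra byte for a terminator.

```sql
-- Hypothetical sample values: the same 12-character text in two types.
DECLARE @narrow VARCHAR(20)  = 'Hello World!';   -- 1 byte per character here
DECLARE @wide   NVARCHAR(20) = N'Hello World!';  -- 2 bytes per character

SELECT LEN(@narrow)        AS len_varchar,     -- 12
       DATALENGTH(@narrow) AS bytes_varchar,   -- 12
       LEN(@wide)          AS len_nvarchar,    -- 12
       DATALENGTH(@wide)   AS bytes_nvarchar;  -- 24
```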

In SQL Server, LEN will ignore trailing spaces (http://msdn.microsoft.com/en-us/library/ms187403.aspx) - here's a modified example from that link:
PRINT 'Testing with ANSI_PADDING ON'
SET ANSI_PADDING ON ;
GO
CREATE TABLE t1
(
charcol CHAR(16) NULL
,varcharcol VARCHAR(16) NULL
,varbinarycol VARBINARY(8)
) ;
GO
INSERT INTO t1
VALUES ('No blanks', 'No blanks', 0x00ee) ;
INSERT INTO t1
VALUES ('Trailing blank ', 'Trailing blank ', 0x00ee00) ;
SELECT 'CHAR' = '>' + charcol + '<'
,'VARCHAR' = '>' + varcharcol + '<'
,varbinarycol
,LEN(charcol)
,LEN(varcharcol)
,DATALENGTH(charcol)
,DATALENGTH(varcharcol)
FROM t1 ;
GO
PRINT 'Testing with ANSI_PADDING OFF' ;
SET ANSI_PADDING OFF ;
GO
CREATE TABLE t2
(
charcol CHAR(16) NULL
,varcharcol VARCHAR(16) NULL
,varbinarycol VARBINARY(8)
) ;
GO
INSERT INTO t2
VALUES ('No blanks', 'No blanks', 0x00ee) ;
INSERT INTO t2
VALUES ('Trailing blank ', 'Trailing blank ', 0x00ee00) ;
SELECT 'CHAR' = '>' + charcol + '<'
,'VARCHAR' = '>' + varcharcol + '<'
,varbinarycol
,LEN(charcol)
,LEN(varcharcol)
,DATALENGTH(charcol)
,DATALENGTH(varcharcol)
FROM t2 ;
GO
DROP TABLE t1
DROP TABLE t2

If we're talking pure SQL, there's no NUL terminator for you to worry about. If we're talking interfacing to SQL from other languages (e.g. C), then the answer depends on the language in question.
There are a couple of relevant points worth remembering:
There are two character types in SQL: CHAR(N) and VARCHAR(N). The former is always the same length (N) and padded with spaces; the latter is variable-length (up to N chars).
In Transact-SQL, LEN returns the length of the string excluding trailing spaces.

In some SQL implementations, strings are zero-terminated.
In others, they have a leading length byte/word.
This of course does not matter for the len() function.
But it does matter if you want to insert a \0 into the string.
Usually varchar has a leading length byte.
But char is \0 terminated.
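One way to check this on SQL Server is to embed a NUL yourself and look at the stored bytes. This is only a sketch: the variable name is made up, and it says nothing about how other products store strings.

```sql
-- Build a 5-character value with CHAR(0) in the middle.
DECLARE @s VARCHAR(10) = 'ab' + CHAR(0) + 'cd';

SELECT DATALENGTH(@s)            AS stored_bytes,  -- 5: the NUL counts as data
       CAST(@s AS VARBINARY(10)) AS raw_bytes;     -- 0x6162006364
```

The 00 byte sits in the middle of the value and nothing is appended after 'cd'. Some client tools may stop displaying text at the NUL, which is why the sketch shows the raw bytes instead of the string.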


� IN SQL Server database

In my database I have this char �. I want to locate the rows that contain it with a query:
Select *
from Sometable
where somecolumn like '%�%'
This gets me no result.
I think it is ANSI encoding.
Use the N prefix, as below:
where col like N'%�%'
Why you need the N prefix:
Prefix Unicode character string constants with the letter N. Without the N prefix, the string is converted to the default code page of the database. This default code page may not recognize certain characters.
Thanks to Martin Smith: earlier I tested with only one character and it worked, but as Martin pointed out, it returns all characters.
The query below works and returns only the intended row:
select * from #demo where id like N'%�%'
COLLATE Latin1_General_100_BIN
Demo:
create table #demo
(
id nvarchar(max)
)
insert into #demo
values
(N'ﬗ'),
( N'�')
To learn more about Unicode, please see the links below:
http://kunststube.net/encoding/
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
This is the Unicode replacement character symbol.
It could match any of 2,048 invalid code points in the UCS-2 encoding (or the single character U+FFFD for the symbol itself).
You can use a range and a binary collate clause to match them all (demo).
WITH T(N)
AS
(
SELECT TOP 65536 NCHAR(ROW_NUMBER() OVER (ORDER BY @@SPID))
FROM master..spt_values v1,
master..spt_values v2
)
SELECT N
FROM T
WHERE N LIKE '%[' + NCHAR(65533) + NCHAR(55296) + '-' + NCHAR(57343) + ']%' COLLATE Latin1_General_100_BIN
You can use ASCII to find out the ASCII code for that char:
Select ascii('�')
And use CHAR to retrieve the char from that code and combine it in a LIKE expression
Select * from Sometable
where somecolumn like '%'+CHAR(63)+'%'
Note that the collation you use can affect the result. It also depends on the encoding your application uses to send the data (UTF-8, Unicode, etc.), and whether you store it as VARCHAR or NVARCHAR has the last say on what you see.
There's more here in this similar question
EDIT
@Mark
Try this simple test:
create table sometable(somecolumn nvarchar(100) not null)
GO
insert into sometable
values
('12345')
,('123�45')
,('12345')
GO
select * from sometable
where somecolumn like '%'+CHAR(63)+'%'
GO
This only means that the character was stored as a "?" in this test.
When you see a �, it means the app you are viewing it in isn't quite sure what to print.
It also means the OP probably needs to find out which character it is by using a query.
Also note that a string displayed as ��� can be formed by 3 different characters.
CHAR(63) was just an example, but you are right: in the ASCII table this is the standard question mark.
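To find out which character is actually stored, one option is to list the code point at each position of the value. The sketch below is an assumption-laden illustration: @s is a made-up sample that embeds U+FFFD via NCHAR(65533), and in practice you would load it from the column you are investigating.

```sql
-- Hypothetical sample value; replace with a value selected from your table.
DECLARE @s NVARCHAR(100) = N'123' + NCHAR(65533) + N'45';

-- Recursive CTE that yields one row per character position.
WITH Positions(pos) AS
(
    SELECT 1
    UNION ALL
    SELECT pos + 1 FROM Positions WHERE pos < LEN(@s)
)
SELECT pos,
       SUBSTRING(@s, pos, 1)          AS ch,
       UNICODE(SUBSTRING(@s, pos, 1)) AS code_point  -- 65533 at position 4
FROM Positions;
```

A code point of 63 would mean a literal question mark was stored, while 65533 or a value between 55296 and 57343 points at the replacement character or a surrogate.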
EDIT
@Bridge
I don't have time right now to dig into it, but the test below didn't work:
Select ascii('�'), CHAR(ascii('�')), UNICODE(N'�'), CHAR(UNICODE(N'�'))
GO
create table sometable(somecolumn nvarchar(100) not null)
GO
insert into sometable
values
('12345')
,('123�45')
,('12345')
,('12'+NCHAR(UNICODE(N'�'))+'345')
GO
select * from sometable
where somecolumn like '%'+CHAR(63)+'%'
select * from sometable
where somecolumn like '%'+NCHAR(UNICODE(N'�'))+'%'
GO

Understanding the Syntax for COALESCE

I am trying to understand the below syntax. Can I get some help with this.
DECLARE @StringList VARCHAR(2500);
SELECT COALESCE(@StringList + ',','') + CAST(apID as VARCHAR) AS ApIdList FROM testTable
As a result you will get every apID from testTable as a VARCHAR.
COALESCE
returns the second parameter if the first parameter is NULL. In this line @StringList is always NULL:
COALESCE(@StringList + ',','')
So NULL + ',' = NULL, and you will get an empty string ('').
Then the empty string + CAST(apID as VARCHAR) gives you apID as a VARCHAR.
Coalesce returns the first non-null element provided in the list supplied. See - https://msdn.microsoft.com/en-us/library/ms190349.aspx.
In your case, if @StringList is not null, then its contents, followed by a comma, will be prepended to apID for each row in testTable.
Your code is returning ApID as a string for all rows in the table. Why? Because @StringList is NULL so the first expression evaluates to '' and the second to a string representation of the ApId in some row in the table.
I caution you about the conversion to VARCHAR with no length. Don't do this! The default length varies by context, and you can introduce very hard-to-debug errors without a length.
A related expression is more common, I think:
SELECT @StringList = COALESCE(@StringList + ',', '') + CAST(apID as VARCHAR(8000))
FROM testTable;
This does string concatenation, so all the values of apID are concatenated together in a comma-delimited string.
What this is doing is looping on the result set to assign the variable. This type of assignment of a variable across multiple rows is discouraged. I don't think that SQL Server guarantees that it actually works (i.e. is a documented feature), but it appears to work in practice across all versions.
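On the caution above about VARCHAR with no length, here is a small sketch of how the default differs by context in SQL Server. The variable name and sample values are made up for illustration.

```sql
-- In a variable declaration, VARCHAR with no length means VARCHAR(1).
DECLARE @v VARCHAR = 'abcdef';
SELECT @v AS declared_without_length;   -- 'a' (silently truncated)

-- In CAST/CONVERT, VARCHAR with no length means VARCHAR(30).
SELECT LEN(CAST(REPLICATE('x', 40) AS VARCHAR)) AS cast_without_length;  -- 30
```

Neither truncation raises an error, which is what makes this kind of bug hard to spot.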

Create rule to restrict special characters in table in sql server

I want to create a rule to restrict special characters to be entered into a column.
I have tried the following, but it didn't work.
CREATE RULE rule_spchar
AS
@make LIKE '%[^[^*|\":<>[]{}`\( );#&$]+$]%'
I don't know what I am doing wrong here. Any help would be appreciated.
You can create a check constraint on this column and allow only numbers and letters to be inserted into it, see below:
Check Constraint to only Allow Numbers & Alphabets
ALTER TABLE Table_Name
ADD CONSTRAINT ck_No_Special_Characters
CHECK (Column_Name NOT LIKE '%[^A-Z0-9]%')
Check Constraint to only Allow Numbers
ALTER TABLE Table_Name
ADD CONSTRAINT ck_Only_Numbers
CHECK (Column_Name NOT LIKE '%[^0-9]%')
Check Constraint to only Allow Alphabets
ALTER TABLE Table_Name
ADD CONSTRAINT ck_Only_Alphabets
CHECK (Column_Name NOT LIKE '%[^A-Z]%')
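A minimal sketch of the first constraint in action, assuming SQL Server. The temp table #items and its column are hypothetical names used only for this test.

```sql
-- Hypothetical table with an unnamed inline check constraint.
CREATE TABLE #items
(
    code VARCHAR(20) NOT NULL CHECK (code NOT LIKE '%[^A-Z0-9]%')
);

INSERT INTO #items (code) VALUES ('ABC123');       -- accepted

BEGIN TRY
    INSERT INTO #items (code) VALUES ('ABC#123');  -- rejected: '#' is outside the allowed set
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();                         -- reports the constraint violation
END CATCH;

DROP TABLE #items;
```

Which letters the range A-Z matches depends on the column's collation. Under a case-insensitive collation it also accepts lowercase letters, so it is worth testing against the collation you actually use.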
It's important to remember Microsoft's plans for the features you're using or intending to use. CREATE RULE is a deprecated feature that won't be around for long. Consider using CHECK CONSTRAINT instead.
Also, since the character exclusion class doesn't actually operate like a RegEx, trying to exclude brackets [] is impossible this way without multiple calls to LIKE. So collating to an accent-insensitive collation and using an alphanumeric inclusive filter will be more successful. More work required for non-latin alphabets.
M.Ali's NOT LIKE '%[^A-Z0-9 ]%' should serve well.
M.Ali's answer represents the best practice for the solution you describe. That being said, I read your question differently (i.e., what is wrong with the way you're implementing the LIKE comparison).
You are not properly escaping wildcard characters.
The expression 'AB' LIKE '%[AB]%' is true. The expression 'ZB' LIKE '%[^AB]%' is also true, since that statement is the equivalent of 'Z' LIKE '[^AB]' OR 'B' LIKE '[^AB]'. Instead, you should use 'YZ' NOT LIKE '%[^AB]%', which is the equivalent of 'Y' NOT LIKE '%[^AB]%' AND 'Z' NOT LIKE '%[^AB]%'.
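The three forms can be compared side by side. This is just a sketch with made-up sample strings, assuming SQL Server 2008 or later for the VALUES table constructor.

```sql
SELECT v.s,
       CASE WHEN v.s LIKE '%[AB]%'      THEN 1 ELSE 0 END AS has_a_or_b,
       CASE WHEN v.s LIKE '%[^AB]%'     THEN 1 ELSE 0 END AS has_other_char,
       CASE WHEN v.s NOT LIKE '%[^AB]%' THEN 1 ELSE 0 END AS only_a_or_b
FROM (VALUES ('AB'), ('ZB'), ('YZ')) AS v(s);
-- 'AB': 1, 0, 1
-- 'ZB': 1, 1, 0
-- 'YZ': 0, 1, 0
```

Only the NOT LIKE form answers the question "does the value consist solely of allowed characters", which is what a validation rule needs.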
You didn't escape the single quote or invisible characters. Take a look at the ASCII characters. You would be better served implementing a solution like M.Ali's and adding any characters you do not wish to exclude.
The following script demonstrates the formation of a complex wildcard statement that consists of special characters.
-- Create sample data
-- Experiment testing various characters
DECLARE @temp TABLE (id INT NOT NULL, string1 varchar(10) NOT NULL)
INSERT INTO @temp
(id,string1)
SELECT 1, '12]34'
UNION
SELECT 2, '12[34'
UNION
SELECT 3, '12_34'
UNION
SELECT 4, '12%34'
UNION
SELECT 5, '12]34'
SET NOCOUNT ON
DECLARE @SQL_Wildcard_Characters VARCHAR(512),
@Count_SQL_Wildcard_Characters INT,
@Other_Special_Characters VARCHAR(255),
@Character_Position INT,
@Escape_Character CHAR(1),
@Complete_Wildcard_Expression VARCHAR(1024)
SET @Character_Position = 1
-- Note these need to be escaped:
SET @SQL_Wildcard_Characters = '[]^%_'
-- Choose an escape character.
SET @Escape_Character = '~'
-- I added the single quote (') ASCII 39 and the space ( ) ASCII 32.
-- You could also add the actual characters, but this approach may make it easier to read.
SET @Other_Special_Characters = '*|\":<>{}`\();#&$' + CHAR(39) + CHAR(32)
-- Quick loop to escape the @SQL_Wildcard_Characters
SET @Count_SQL_Wildcard_Characters = LEN(@SQL_Wildcard_Characters)
WHILE @Character_Position < 2*@Count_SQL_Wildcard_Characters
BEGIN
SET @SQL_Wildcard_Characters = STUFF(@SQL_Wildcard_Characters,@Character_Position,0,@Escape_Character)
SET @Character_Position = @Character_Position + 2
END
-- Concatenate the respective strings
SET @Complete_Wildcard_Expression = @SQL_Wildcard_Characters+@Other_Special_Characters
-- Shows how the statement works for a match
SELECT ID, string1, @Complete_Wildcard_Expression AS [expression]
FROM @temp
WHERE string1 LIKE '%['+@Complete_Wildcard_Expression+']%' ESCAPE @Escape_Character
-- Shows how the statement works for a non-match
SELECT ID, string1, @Complete_Wildcard_Expression AS [expression]
FROM @temp
WHERE string1 NOT LIKE '%[^'+@Complete_Wildcard_Expression+']%' ESCAPE @Escape_Character
CREATE FUNCTION udf_checkspecial_characters(@String varchar(MAX))
RETURNS INT AS
BEGIN
DECLARE @Result INT;
SELECT @Result=(CASE WHEN @String COLLATE Latin1_General_BIN LIKE '%[(<~!#/#$%^&>)]%' THEN 1 ELSE 0 END);
RETURN @Result;
END

Concatenate sql values to a variable

On a SQL Server 2008 I'm trying to get a comma separated list of all selected values into a variable.
SELECT field
FROM table
returns:
+-------+
| field |
+-------+
| foo |
+-------+
| bar |
+-------+
I'd like to get:
"foo, bar, "
I tried:
DECLARE @foo NVARCHAR(MAX)
SET @foo = ''
SELECT @foo = @foo + field + ','
FROM TABLE
PRINT @foo
Which returns nothing. What am I doing wrong?
You'll need to replace the NULLs
SELECT @foo = @foo + ISNULL(field + ',', '')
FROM TABLE
or remove them
SELECT @foo = @foo + field + ','
FROM TABLE
WHERE field IS NOT NULL
That happens if you have even a SINGLE field in the table that is NULL. In SQL Server, NULL + <any> = NULL. Either omit them
SELECT @foo = @foo + field + ','
FROM TABLE
WHERE field is not null
Or work around them
SELECT @foo = @foo + isnull(field + ',', '')
FROM TABLE
You can write the whole thing without the leading SET statement, which is more common. The query below returns "foo,bar" with no trailing comma
DECLARE @foo NVARCHAR(MAX)
SELECT @foo = isnull(@foo + ',', '') + field
FROM TABLE
WHERE field is not null
PRINT @foo
Don't forget to use LTRIM and RTRIM around @foo (when the data type is char/varchar) in the concatenation; otherwise it will not give the expected results in SQL 2008 R2.
As per the comment Lukasz Szozda made on one of the answers here, you should not use your indicated method to aggregate string values in SQL Server, as this is not supported functionality. While this tends to work when no order clause is used (and even if no exception to this tendency has ever been documented), Microsoft does not guarantee that this will work, and there's always a chance it could stop working in the future. SQL is a declarative language; you cannot assume that behaviour that is not explicitly defined as being the correct behaviour for interpreting a given statement will continue working.
Instead, see the examples below, or see this page for a review of valid ways to achieve the same result, and their respective performance: Optimal way to concatenate/aggregate strings
Doing it in a valid way, whichever way you end up using, still has the same considerations as in the other answers here. You either need to exclude NULL values from your result set or be explicit about how you want them to be added to the resulting string.
Further, you should probably use some kind of explicit ordering so that this code is deterministic - it can cause all sorts of problems down the line if code like this can produce a different result when running on the same data, which may happen without an explicit ordering specified.
--Null values treated as empty strings
SET @Foo =
STUFF /*Stuff is used to remove the separator from the start of the string*/
( (SELECT N','/*separator*/ + ISNULL(RTRIM(t.Field), '' /*Use an empty string in the place of NULL values*/) /*Thing to List*/
FROM TABLE t
ORDER BY t.SomeUniqueColumn ASC /*Make the query deterministic*/
FOR XML PATH, TYPE).value(N'.[1]',N'varchar(max)')
,1
,1 /*Length of separator*/
,N'');
--Null values excluded from result
SET @Foo =
STUFF /*Stuff is used to remove the separator from the start of the string*/
( (SELECT N','/*separator*/ + RTRIM(t.Field) /*Thing to List*/
FROM TABLE t
WHERE t.Field IS NOT NULL
ORDER BY t.SomeUniqueColumn ASC /*Make the query deterministic*/
FOR XML PATH, TYPE).value(N'.[1]',N'varchar(max)')
,1
,1 /*Length of separator*/
,N'');
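For completeness: on SQL Server 2017 and later (so not on the 2008 instance in the question), the built-in STRING_AGG aggregate does the same job without the XML workaround. The sketch below swaps in that function; the table variable @t and its sample rows are made up for illustration.

```sql
-- Hypothetical sample data, including a NULL to show how it is handled.
DECLARE @t TABLE (id INT PRIMARY KEY, field NVARCHAR(50) NULL);
INSERT INTO @t (id, field) VALUES (1, N'foo'), (2, NULL), (3, N'bar');

DECLARE @foo NVARCHAR(MAX);

-- STRING_AGG skips NULL inputs; WITHIN GROUP makes the order deterministic.
SELECT @foo = STRING_AGG(CAST(field AS NVARCHAR(MAX)), N',')
              WITHIN GROUP (ORDER BY id)
FROM @t;

PRINT @foo;  -- foo,bar
```

The cast to NVARCHAR(MAX) is there so that a long result is not limited by the size of the input type.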

Insert a trailing space into a SQL Server VARCHAR column

I'm trying to insert trailing spaces into a VARCHAR(50) column and the SQL insert seems to be cutting them off. Here's my code:
create table #temp (field varchar(10));
insert into #temp select ' ';
select LEN(field) from #temp;
Unfortunately, this returns a length of zero, meaning the ' ' was inserted as a ''. I need a blank space to be inserted for this column - any ideas?
Use DATALENGTH, not LEN, because LEN does not count trailing spaces.
Take this single-space string, for example:
SELECT LEN(' ') AS len,
DATALENGTH(' ') AS datalength
Results:
len   datalength
---   ----------
0     1
From http://msdn.microsoft.com/en-us/library/ms190329.aspx:
LEN Returns the number of characters of the specified string expression, excluding trailing blanks.
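Applied to the table from the question, a quick check along these lines should confirm that the space really is stored. This is a sketch that assumes ANSI_PADDING is ON, the usual default for connections; the temp table name #space_demo is made up to avoid clashing with the original #temp.

```sql
CREATE TABLE #space_demo (field VARCHAR(10));
INSERT INTO #space_demo (field) VALUES (' ');

SELECT LEN(field)        AS len_chars,     -- 0: trailing blanks are not counted
       DATALENGTH(field) AS stored_bytes,  -- 1: the space is in the column
       '>' + field + '<' AS visible        -- '> <'
FROM #space_demo;

DROP TABLE #space_demo;
```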
You should also be aware that SQL Server follows the ANSI/ISO SQL-92 rule of padding the character strings used in comparisons so that their lengths match before comparing them. So you may want to use the LIKE predicate for comparisons [1].
[1]
How SQL Server Compares Strings with Trailing Spaces
http://support.microsoft.com/kb/316626
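The difference between the two comparisons can be seen with a pair of literals. This is a minimal sketch with made-up values; it only shows the case where the trailing spaces are on the pattern side of LIKE.

```sql
SELECT CASE WHEN 'abc' = 'abc   '    THEN 'equal' ELSE 'different' END AS equals_comparison,  -- equal
       CASE WHEN 'abc' LIKE 'abc   ' THEN 'match' ELSE 'no match'  END AS like_comparison;    -- no match
```

With =, the shorter string is padded before comparing, so the trailing spaces do not matter. With LIKE, every character in the pattern is significant, including its trailing spaces.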