Delete blank spaces in a string column SQL Server - sql

I have a SQL table with the following values:
'test1 ', 'test2 '.
I need to delete all blank spaces in a string.
It looks easy, but TRIM, LTRIM, RTRIM and REPLACE(column,' ','') do not work.
The LEN() function counts that space as a character.
The length of the value 'test1 ' is 6.
How can I select that column without that blank space?
I need the value 'test1'.

The minimal reproducible example is not provided.
Please try the following solution.
SQL
USE tempdb;
GO
DROP FUNCTION IF EXISTS dbo.udf_tokenize;
GO
/*
1. All invisible TAB, Carriage Return, and Line Feed characters will be replaced with spaces.
2. Then leading and trailing spaces are removed from the value.
3. Further, contiguous occurrences of more than one space will be replaced with a single space.
*/
CREATE FUNCTION dbo.udf_tokenize(@input VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
RETURN (SELECT CAST('<r><![CDATA[' + @input + ' ' + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)'));
END
GO
-- DDL and sample data population, start
DECLARE @mockTbl TABLE (ID INT IDENTITY(1,1), col_1 VARCHAR(100), col_2 VARCHAR(100));
INSERT INTO @mockTbl (col_1, col_2)
VALUES (' FL ', ' Miami')
, (' FL ', ' Fort Lauderdale ')
, (' NY ', ' New York ')
, (' NY ', '')
, (' NY ', NULL);
-- DDL and sample data population, end
-- before
SELECT * FROM @mockTbl;
-- remove invisible chars
UPDATE @mockTbl
SET col_1 = dbo.udf_tokenize(col_1)
, col_2 = dbo.udf_tokenize(col_2);
-- after
SELECT *, LEN(col_2) AS [col_2_len] FROM @mockTbl;

As discovered in the comments, the character at the end of your value isn't a space, it's a carriage return. LTRIM and RTRIM don't remove such characters, and TRIM only removes spaces (character 32) by default.
If you want to remove other characters, you can use TRIM (SQL Server 2017 and later), but you need to tell it which characters to remove, using the TRIM({Characters} FROM {String}) syntax. The example below removes leading and trailing spaces (' '), carriage returns (CHAR(13)) and line feeds (CHAR(10)):
CREATE TABLE dbo.YourTable (SomeString varchar(50));
GO
INSERT INTO dbo.YourTable (SomeString)
VALUES('Trailing Space '),
('Trailing Line Break' + CHAR(10)),
('Trailing CRLF' + CHAR(13) + CHAR(10)),
('Trailing CRLF and spaces ' + CHAR(13) + CHAR(10) + ' ');
GO
DECLARE @TrimCharacters varchar(10) = ' ' + CHAR(13) + CHAR(10);
SELECT SomeString,
LEN(SomeString) AS Len,
DATALENGTH(SomeString) AS DataLength,
TRIM(SomeString) AS Trimmed,
DATALENGTH(TRIM(SomeString)) AS TrimmedDataLength,
TRIM(@TrimCharacters FROM SomeString) AS WellTrimmed,
DATALENGTH(TRIM(@TrimCharacters FROM SomeString)) AS WellTrimmedDataLength
FROM dbo.YourTable;
GO
DROP TABLE dbo.YourTable;
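To find out which character is actually sitting at the end of a value, it can help to look at its character code. A minimal sketch, assuming a hypothetical variable @value that stands in for the problem column:

```sql
-- @value is a hypothetical stand-in for the problem column.
DECLARE @value varchar(10) = 'test1' + CHAR(13);

SELECT LEN(@value) AS Len,                       -- 6: LEN only ignores trailing spaces
       DATALENGTH(@value) AS DataLength,         -- 6 bytes for varchar
       ASCII(RIGHT(@value, 1)) AS LastCharCode;  -- 13 = carriage return
```

A code of 32 would mean an ordinary space; 9, 10 or 13 point to a tab, line feed or carriage return respectively.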

You can use the script below to check whether each character of the string falls within the ASCII range 32 to 126 (the printable characters: space, digits, letters and punctuation).
Any character outside that range is removed. Note that the space itself (32) is kept.
SQL:
DECLARE @TestString varchar(10) = 'test1 '
DECLARE @Result nvarchar(max)
SET @Result = ''
DECLARE @character nvarchar(1)
DECLARE @characterposition int
SET @characterposition = 1
WHILE @characterposition <= LEN(@TestString)
BEGIN
SET @character = SUBSTRING(@TestString , @characterposition, 1)
IF ASCII(@character) >= 32 AND ASCII(@character) <= 126
SET @Result = @Result + @character
SET @characterposition = @characterposition + 1
END
SELECT @Result

Related

Trimming in SQL Server

How would I go about trimming this field '633827-9062-5000-0006 4'
to look like 633827906250000006,
using trim functions?
The field will always be the same length and I need to remove the last character.
Thanks in advance.
Try using replace to remove the dashes and then left to get the 18 first chars.
left(replace(your_string, '-',''), 18)
In your example this would remove the white space and the 4 at the end and retain the first 18 characters.
DECLARE @value varchar(50) = '633827-9062-5000-0006 4';
SELECT REPLACE(RTRIM(LEFT(@value, LEN(@value) - 1)), '-', '')
To remove the last 2 characters (space and '4'), you could use SUBSTRING and REPLACE as in the following examples:
declare @s varchar(50);
declare @t varchar(50);
set @s = '633827-9062-5000-0006 4';
set @t = substring(@s, 1, 21);
print '[' + @t + ']';
set @t = substring(@s, 1, len(@s) - 2);
print '[' + @t + ']';
set @t = replace(substring(@s, 1, len(@s) - 2), '-', '');
print '[' + @t + ']';
In the first PRINT statement, I am hard-coding the length, and in the second, I calculate the final string length. If you wanted to remove the dashes as well, the third assignment does this as well.
If the string length could ever be less than 2 characters, SUBSTRING will complain with 'Invalid length parameter passed to the substring function.' You could use a CASE expression to protect against that.
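A guard of that kind might look like the sketch below. The short input value and the choice to return an empty string for it are assumptions made for illustration.

```sql
-- @s is a hypothetical short input that would break the unguarded version.
DECLARE @s varchar(50) = '4';

SELECT CASE
           WHEN LEN(@s) >= 2 THEN SUBSTRING(@s, 1, LEN(@s) - 2)
           ELSE ''   -- assumption: too-short values become an empty string
       END AS Trimmed;
```

The SUBSTRING branch is only reached when the computed length is non-negative, so the error is not raised for short values.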
declare @s varchar(50);
set @s = '633827-9062-5000-0006 4';
select replace(substring(@s,0,charindex(' ',@s)),'-','')

How to modify data retrieved from select in sql?

Suppose I have some unwanted characters in a column, e.g. the name column in the customers table contains data like '<'. Is there any way to replace characters like '<' with a blank while retrieving the data with a SELECT statement? This is to prevent XSS scripts showing up because of old data that contains such characters.
e.g:
select *
from customers
returns
Id  Name     Age  city       salary
--  -------  ---  ---------  ------
1   <hari    32   Ahmedabad  4000
2   Khilan   25   Delhi      5678
3   kaushik  23   Kota       234
I want <hari to be displayed as hari when this data is retrieved using a SELECT statement. How can I achieve this?
Something like...
SELECT REPLACE(REPLACE(a.name,'<', ''), '>','')
FROM ...
It may be better to write a function to remove special characters. The function below removes any character other than a-z, A-Z, 0-9 and space (whether case matters depends on the collation). You can add more if needed.
For example, if you want to retain the period (.), then use '[^a-zA-Z0-9 .]'.
Function:
CREATE FUNCTION ufn_RemoveSpecialCharacters
(
@String VARCHAR(500),
@Exclude VARCHAR(100),
@CollapseSpaces BIT
)
RETURNS VARCHAR(500)
AS
BEGIN
DECLARE @StartString INT,
@EndString INT,
@FinalString VARCHAR(500),
@CurrentString CHAR(1),
@PreviousString CHAR(1)
SET @StartString = 1
SET @EndString = LEN(ISNULL(@String, ''))
WHILE @StartString <= @EndString
BEGIN
SET @CurrentString = SUBSTRING(@String, @StartString, 1)
SET @PreviousString = SUBSTRING(@String, @StartString-1, 1)
IF @CurrentString LIKE ISNULL(@Exclude,'[^a-zA-Z0-9 ]')
BEGIN
SET @CurrentString = ''
IF @CollapseSpaces = 1
SET @FinalString = CASE WHEN @PreviousString = CHAR(32) THEN ISNULL(@FinalString, '') ELSE ISNULL(@FinalString, '')+' ' END
END
ELSE
BEGIN
SET @FinalString = ISNULL(@FinalString, '') + @CurrentString
IF @CollapseSpaces = 1
BEGIN
SET @FinalString = REPLACE(@FinalString,'  ',' ')
END
END
SET @StartString = @StartString + 1
END
--PRINT @String
RETURN LTRIM(RTRIM(@FinalString))
END
GO
Usage:
Does not collapse Spaces
SELECT dbo.ufn_RemoveSpecialCharacters('This #$%string has#$% special #$% characters and spaces))', '[^a-zA-Z0-9 ]', 0)
Collapses multiple Spaces
SELECT dbo.ufn_RemoveSpecialCharacters('This #$%string has#$% special #$% characters and spaces))', '[^a-zA-Z0-9 ]', 1)
Here is an example:
SELECT Replace([Column Name], ' ', '') AS C
FROM Contacts
WHERE Replace([Column Name], ' ', '') LIKE 'whatever'
Hope this was helpful

How to get rid of double quote from column's value?

Here is the table, each column value is wrapped with double quotes (").
Name Number Address Phone1 Fax Value Status
"Test" "10000000" "AB" "5555" "555" "555" "Active"
How to remove double quote from each column? I tried this for each column:-
UPDATE Table
SET Name = substring(Name,1,len(Name)-1)
where substring(Name,len(Name),1) = '"'
but looking for more reliable solution. This fails if any column has trailing white space
Just use REPLACE?
...
SET Name = REPLACE(Name,'"', '')
...
UPDATE Table
SET Name = REPLACE(Name, '"', '')
WHERE CHARINDEX('"', Name) <> 0
create table #t
(
Name varchar(100)
)
insert into #t(Name)values('"deded"')
Select * from #t
update #t Set Name = Coalesce(REPLACE(Name, '"', ''), '')
Select * from #t
drop table #t
Quick and Dirty, but it will work :-)
You could expand this and write it as a stored procedure that takes a table name, the character you want to replace and the character to replace it with, executes the generated string, etc.
DECLARE
@TABLENAME VARCHAR(50)
SELECT @TABLENAME = 'Locations'
SELECT 'Update ' + @TABLENAME + ' set ' + column_Name + ' = REPLACE(' + column_Name + ',''"'','''')'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TABLENAME
and data_Type in ('varchar')

SQL: problem word count with len()

I am trying to count the words of a text stored in a table column. For that I am using the following query:
SELECT LEN(ExtractedText) -
LEN(REPLACE(ExtractedText, ' ', '')) + 1 from EDDSDBO.Document where ID='100'
I receive a wrong result that is much too high.
On the other hand, if I copy the text directly into the statement, it works, i.e.
SELECT LEN('blablabla text') - LEN(REPLACE('blablabla text', ' ', '')) + 1
The datatype is nvarchar(max) since the text is very long. I have already tried converting the column to text or ntext and using datalength() instead of len(). Nevertheless I get the same result: it works with a string literal but not with the table column.
You're counting spaces not words. That will typically yield an approximate answer.
e.g.
' this string will give an incorrect result '
Try this approach: http://www.sql-server-helper.com/functions/count-words.aspx
CREATE FUNCTION [dbo].[WordCount] ( @InputString VARCHAR(4000) )
RETURNS INT
AS
BEGIN
DECLARE @Index INT
DECLARE @Char CHAR(1)
DECLARE @PrevChar CHAR(1)
DECLARE @WordCount INT
SET @Index = 1
SET @WordCount = 0
WHILE @Index <= LEN(@InputString)
BEGIN
SET @Char = SUBSTRING(@InputString, @Index, 1)
SET @PrevChar = CASE WHEN @Index = 1 THEN ' '
ELSE SUBSTRING(@InputString, @Index - 1, 1)
END
IF @PrevChar = ' ' AND @Char != ' '
SET @WordCount = @WordCount + 1
SET @Index = @Index + 1
END
RETURN @WordCount
END
GO
usage
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance against expenses incurred through illness of the insured.'
SELECT [dbo].[WordCount] ( @String )
Leading spaces, trailing spaces, and two or more spaces between neighbouring words: these are the likely causes of the wrong results you are getting.
The functions LTRIM() and RTRIM() can help you eliminate the first two issues. As for the third one, you can use REPLACE(ExtractedText, '  ', ' ') to replace double spaces with single ones, but I'm not sure whether you also have triple ones (in which case you'd need to repeat the replacing).
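To see why the replacement may need repeating, here is a small sketch with a hypothetical test value containing a run of three spaces. One pass still leaves a double space behind:

```sql
-- 'a' and 'b' separated by three spaces
DECLARE @s varchar(20) = 'a' + REPLICATE(' ', 3) + 'b';

SELECT LEN(REPLACE(@s, '  ', ' ')) AS AfterOnePass,                        -- 4: two spaces remain
       LEN(REPLACE(REPLACE(@s, '  ', ' '), '  ', ' ')) AS AfterTwoPasses;  -- 3: 'a b'
```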
UPDATE
Here's a UDF that uses CTEs and ranking to eliminate extra spaces and then counts the remaining ones to return the quantity as the number of words:
CREATE FUNCTION fnCountWords (@Str varchar(max))
RETURNS int
AS BEGIN
DECLARE @xml xml, @res int;
SET @Str = RTRIM(LTRIM(@Str));
WITH split AS (
SELECT
idx = number,
chr = SUBSTRING(@Str, number, 1)
FROM master..spt_values
WHERE type = 'P'
AND number BETWEEN 1 AND LEN(@Str)
),
ranked AS (
SELECT
idx,
chr,
rnk = idx - ROW_NUMBER() OVER (PARTITION BY chr ORDER BY idx)
FROM split
)
SELECT @res = COUNT(DISTINCT rnk) + 1
FROM ranked
WHERE chr = ' ';
RETURN @res;
END
With this function your query will simply look like this (a scalar UDF has to be called with its schema name):
SELECT dbo.fnCountWords(ExtractedText)
FROM EDDSDBO.Document
WHERE ID='100'
UPDATE 2
The function uses one of the system tables, master..spt_values, as a tally table. The particular subset used contains only values from 0 to 2047. This means the function will not work correctly for inputs longer than 2047 characters (after trimming both leading and trailing spaces), as @t-clausen.dk has correctly noted in his comment. Therefore, a custom tally table should be used if longer input strings are possible.
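One possible way to build such a tally on the fly, rather than relying on master..spt_values, is to number the rows of a cross join. This is only a sketch: the choice of sys.all_objects as a row source and the test value in @Str are assumptions, not part of the original function.

```sql
-- Hypothetical test value that is longer than 2047 characters.
DECLARE @Str varchar(max) = REPLICATE(CAST('word ' AS varchar(max)), 1000);

WITH tally AS (
    -- sys.all_objects cross-joined with itself typically yields millions of rows;
    -- TOP keeps only as many as the string is long.
    SELECT TOP (LEN(@Str))
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
SELECT number AS idx,
       SUBSTRING(@Str, number, 1) AS chr
FROM tally;
```

The resulting idx/chr rows play the same role as the split CTE in the function, without the 2047-character ceiling.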
Replace each space with something that never occurs in your text, like ' $!' (or pick another value).
Then replace all '$! ' and '$!' with nothing; this way you never have more than 1 space after a word. Then use your current script. I have defined a word as a space followed by a non-space.
This is an example
DECLARE @T TABLE(COL1 NVARCHAR(2000), ID INT)
INSERT @T VALUES('A B C D', 100)
SELECT LEN(C) - LEN(REPLACE(C,' ', '')) COUNT FROM (
SELECT REPLACE(REPLACE(REPLACE(' ' + COL1, ' ', ' $!'), '$! ',''), '$!', '') C
FROM @T ) A
Here is a recursive solution
DECLARE @T TABLE(COL1 NVARCHAR(2000), ID INT)
INSERT @T VALUES('A B C D', 100)
INSERT @T VALUES('have a nice day with 7 words', 100)
;WITH CTE AS
(
SELECT 1 words, col1 c, col1 FROM @t WHERE id = 100
UNION ALL
SELECT words +1, right(c, len(c) - patindex('% [^ ]%', c)), col1 FROM cte
WHERE patindex('% [^ ]%', c) > 0
)
SELECT words, col1 FROM cte WHERE patindex('% [^ ]%', c) = 0
You should declare the column using the varchar data type, like:
create table emp(ename varchar(22));
insert into emp values('amit');
select ename,len(ename) from emp;
output : 4

SQL method to replace repeating blanks with single blanks

Is there a more elegant way of doing this? I want to replace repeating blanks with single blanks.
declare @i int
set @i=0
while @i <= 20
begin
update myTable
set myTextColumn = replace(myTextColumn, '  ', ' ')
set @i=@i+1
end
(It's SQL Server 2000, but I would prefer generic SQL.)
This works:
UPDATE myTable
SET myTextColumn =
REPLACE(
REPLACE(
REPLACE(myTextColumn
,'  ',' '+CHAR(1)) -- CHAR(1) is unlikely to appear
,CHAR(1)+' ','')
,CHAR(1),'')
WHERE myTextColumn LIKE '%  %'
Entirely set-based; no loops.
So we replace any two spaces with an unusual character and a space. If we call the unusual character X, 5 spaces become: ' X X ' and 6 spaces become ' X X X'. Then we replace 'X ' with the empty string. So 5 spaces become ' ' and 6 spaces become ' X'. Then, in case there was an even number of spaces, we remove any remaining 'X's, leaving a single space.
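The walk-through above can be traced step by step. The sketch below uses '|' as the marker instead of CHAR(1) so that the intermediate values are visible; it assumes the input never contains '|', and the test value is hypothetical.

```sql
-- 'a' and 'b' separated by five spaces
DECLARE @s varchar(50) = 'a' + REPLICATE(' ', 5) + 'b';

SELECT REPLACE(@s, '  ', ' |') AS Step1,                                       -- a | | b
       REPLACE(REPLACE(@s, '  ', ' |'), '| ', '') AS Step2,                    -- a b
       REPLACE(REPLACE(REPLACE(@s, '  ', ' |'), '| ', ''), '|', '') AS Step3;  -- a b
```

With an even run such as six spaces, Step2 would still contain one marker ('a |b'), which is what the third REPLACE cleans up.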
Here is a simple set based way that will collapse multiple spaces into a single space by applying three replaces.
DECLARE @myTable TABLE (myTextColumn VARCHAR(50))
INSERT INTO @myTable VALUES ('0Space')
INSERT INTO @myTable VALUES (' 1 Spaces 1 Spaces. ')
INSERT INTO @myTable VALUES ('  2  Spaces  2  Spaces.  ')
INSERT INTO @myTable VALUES ('   3   Spaces   3   Spaces.   ')
INSERT INTO @myTable VALUES ('    4    Spaces    4    Spaces.    ')
INSERT INTO @myTable VALUES ('     5     Spaces     5     Spaces.     ')
INSERT INTO @myTable VALUES ('      6      Spaces      6      Spaces.      ')
select replace(
replace(
replace(
LTrim(RTrim(myTextColumn)), ---Trim the field
'  ',' |'), ---Mark double spaces
'| ',''), ---Delete double spaces offset by 1
'|','') ---Tidy up
AS SingleSpaceTextColumn
from @myTable
Your Update statement can now be set based:
update @myTable
set myTextColumn = replace(
replace(
replace(
LTrim(RTrim(myTextColumn)),
'  ',' |'),
'| ',''),
'|','')
Use an appropriate WHERE clause to limit the UPDATE to the rows that need updating, i.e. those that contain double spaces.
Example:
where 1<=Patindex('%  %', myTextColumn)
I have found an external write up on this method: REPLACE Multiple Spaces with One
select
string = replace(
replace(
replace(' select single spaces',' ','<>')
,'><','')
,'<>',' ')
Replace duplicate spaces with a single space in T-SQL
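The '<>' variant can be traced the same way. This is a sketch with a hypothetical test value; it assumes the text itself contains no '<' or '>' characters, since those would be affected by the second and third REPLACE.

```sql
-- 'a' and 'b' separated by three spaces
DECLARE @s varchar(50) = 'a' + REPLICATE(' ', 3) + 'b';

SELECT REPLACE(@s, ' ', '<>') AS Step1,                                        -- a<><><>b
       REPLACE(REPLACE(@s, ' ', '<>'), '><', '') AS Step2,                     -- a<>b
       REPLACE(REPLACE(REPLACE(@s, ' ', '<>'), '><', ''), '<>', ' ') AS Step3; -- a b
```

Every space becomes '<>', so a run of spaces becomes a run of '<>' pairs; removing each inner '><' leaves exactly one '<>' per run, which is then turned back into a single space.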
SELECT 'starting...' --sets @@rowcount
WHILE @@rowcount <> 0
update myTable
set myTextColumn = replace(myTextColumn, '  ', ' ')
where myTextColumn like '%  %'
Not very SET Based but a simple WHILE would do the trick.
CREATE TABLE #myTable (myTextColumn VARCHAR(32))
INSERT INTO #myTable VALUES ('NoSpace')
INSERT INTO #myTable VALUES ('One Space')
INSERT INTO #myTable VALUES ('Two  Spaces')
INSERT INTO #myTable VALUES ('Multiple   Spaces   .')
WHILE EXISTS (SELECT * FROM #myTable WHERE myTextColumn LIKE '%  %')
UPDATE #myTable
SET myTextColumn = REPLACE(myTextColumn, '  ', ' ')
WHERE myTextColumn LIKE '%  %'
SELECT * FROM #myTable
DROP TABLE #myTable
Step through the characters one by one, and maintain a record of the previous character. If the current character is a space and the previous character was also a space, remove it with STUFF.
CREATE FUNCTION [dbo].[fnRemoveExtraSpaces] (@Number AS varchar(1000))
Returns Varchar(1000)
As
Begin
Declare @n int -- Current position in the string
Declare @old char(1) -- Previous character kept
Set @n = 1
--Begin Loop of field value
While @n <=Len (@Number)
BEGIN
If Substring(@Number, @n, 1) = ' ' AND @old = ' '
BEGIN
Select @Number = Stuff( @Number , @n , 1 , '' )
END
Else
BEGIN
SET @old = Substring(@Number, @n, 1)
Set @n = @n + 1
END
END
Return @Number
END
GO
select [dbo].[fnRemoveExtraSpaces]('xxx xxx xxx xxx')
Here is the simplest solution :)
update myTable
set myTextColumn = replace(replace(replace(LTrim(RTrim(myTextColumn )),' ','<>'),'><',''),'<>',' ')
create table blank(
field_blank char(100))
insert into blank values('yyy yyyy')
insert into blank values('xxxx xxxx')
insert into blank values ('xxx xxx')
insert into blank values ('zzzzzz zzzzz')
update blank
set field_blank = substring(field_blank,1,charindex(' ',field_blank)-1) + ' ' + ltrim(substring(field_blank,charindex(' ',field_blank) + 1,len(field_blank)))
where CHARINDEX (' ' , rtrim(field_blank)) > 1
select * from blank
For me the above examples almost did the trick, but I needed something more stable and independent of the table, the column or a set number of iterations. So this is my modification of most of the above queries.
CREATE FUNCTION udfReplaceAll
(
@OriginalText NVARCHAR(MAX),
@OldText NVARCHAR(MAX),
@NewText NVARCHAR(MAX)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
-- Note: this loops forever if @NewText itself contains @OldText.
WHILE (@OriginalText LIKE '%' + @OldText + '%')
BEGIN
SET @OriginalText = REPLACE(@OriginalText,@OldText,@NewText)
END
RETURN @OriginalText
END
GO
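A usage sketch, assuming the function above has been created in the dbo schema; the test value is hypothetical:

```sql
-- Assumes dbo.udfReplaceAll from above has already been created.
DECLARE @s nvarchar(max) = N'a' + REPLICATE(N' ', 7) + N'b';

SELECT dbo.udfReplaceAll(@s, N'  ', N' ') AS Collapsed;  -- a b
```

Because @OldText is used inside a LIKE pattern, search strings containing wildcard characters such as '%', '_' or '[' may not behave as plain text.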
Let's say your data looks like this:
Table name: userdata. Fields: id, comment, status
id, "I love -- -- - -spaces -- - my INDIA" , "Active"
id, "I love -- -- - -spaces -- - my INDIA" , "Active"
id, "I love -- -- - -spaces -- - my INDIA" , "Active"
id, "I love -- -- - -spaces -- - my INDIA" , "Active"
So just do this:
update userdata set comment=REPLACE(REPLACE(comment," ","-SPACEHERE-"),"-SPACEHERE"," ");
I haven't tested it, but I think this will work.
Try this:
UPDATE Ships
SET name = REPLACE(REPLACE(REPLACE(name, '  ', ' ' + CHAR(1)), CHAR(1) + ' ', ''), CHAR(1), '')
WHERE name LIKE '%  %'
REPLACE(REPLACE(REPLACE(myTextColumn,'  ',' %'),'% ',''),'%','')
The statement above worked very well for replacing multiple spaces with a single space. Optionally add LTRIM and RTRIM to remove spaces at the beginning and end.
Got it from here: http://burnignorance.com/database-tips-and-tricks/remove-multiple-spaces-from-a-string-using-sql-server/
WHILE
(SELECT count(myIDcolumn)
from myTable where myTextColumn like '%  %') > 0
BEGIN
UPDATE myTable
SET myTextColumn = REPLACE(myTextColumn ,'  ',' ')
END
Try it:
CREATE OR REPLACE FUNCTION REM_SPACES (TEXTO VARCHAR(2000))
RETURNS VARCHAR(2000)
LANGUAGE SQL
READS SQL DATA
BEGIN
SET TEXTO = UPPER(LTRIM(RTRIM(TEXTO)));
WHILE LOCATE('  ',TEXTO,1) >= 1 DO
SET TEXTO = REPLACE(TEXTO,'  ',' ');
END WHILE;
RETURN TEXTO;
END
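Update myTable set myTextColumn = replace(myTextColumn, '  ', ' ');
The above query replaces every double blank with a single blank.
But a single run makes only one pass, so runs of three or more blanks are shortened rather than fully collapsed; repeat it until no double blanks remain.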