In the example below, I want to find the index of the first character that is not a space, tab, newline, etc.
I am not able to do that because SQL's CHARINDEX() function needs to be given the character to search for,
but in my string that character varies.
Declare @str varchar(100)
set @str=' test String'
In the above case I want the CHARINDEX of 't' (the first character of the string).
set @str=' String test'
In the above case I want the CHARINDEX of 'S' (the first character of the string).
Can anyone please suggest a solution?
The best way would be to come up with some kind of regex. The string can also contain carriage return + line feed (line break) and tab characters, which won't be handled correctly unless you do something like this:
DECLARE @str VARCHAR(100)
SET @str=CHAR(9)+' '+CHAR(13)+CHAR(10)+'test String'
SELECT CHARINDEX(LTRIM(REPLACE(REPLACE(REPLACE(@str,CHAR(13),' '),CHAR(10),' '),CHAR(9),' ')), @str);
SELECT SUBSTRING(@str, CHARINDEX(LTRIM(REPLACE(REPLACE(REPLACE(@str,CHAR(13),' '),CHAR(10),' '),CHAR(9),' ')), @str), 1)
The characters are as follows:
CHAR(13) = carriage return
CHAR(10) = linefeed
CHAR(13) + CHAR(10) = standard newline characters
CHAR(9) = TAB
Use a case-sensitive collation if the search must be case sensitive.
Select CHARINDEX ( 'S',@str COLLATE Latin1_General_CS_AS, 1 )
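As a small sketch of the difference (the string literal and the explicit case-insensitive collation are chosen here purely for illustration), the same search can be run under both kinds of collation:

```sql
DECLARE @str varchar(100) = ' test String';

SELECT
    -- case-insensitive: the lowercase 's' in 'test' is found first
    CHARINDEX('S', @str COLLATE Latin1_General_CI_AS) AS CaseInsensitive, -- 4
    -- case-sensitive: only the uppercase 'S' in 'String' matches
    CHARINDEX('S', @str COLLATE Latin1_General_CS_AS) AS CaseSensitive;   -- 7
```

Note that this still requires knowing which character to look for, so it only helps when the target character is fixed.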
You could use PATINDEX to find the position of a character that is not one of a specific subset of excluded characters:
PATINDEX('%[^list of excluded characters]%', @str)
In your case, the excluded character list would consist of CHAR(32) (space), CHAR(9) (tab), CHAR(13) (carriage return), CHAR(10) (linefeed) and whatever else you mean by "etc.". Here is an example:
DECLARE @str varchar(100);
SET @str = '
test string';
SELECT PATINDEX('%[^' + CHAR(32) + CHAR(9) + CHAR(13) + CHAR(10) + ']%', @str);
The @str in the above example begins with a newline (CHAR(13) + CHAR(10)) followed by two spaces. Therefore, the output of the SELECT statement would be this:
----------
4
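Because whitespace inside a string literal is easy to lose or miscount, here is a minimal sketch of the same technique with the leading characters built explicitly, so the position can be checked by hand. The variable name @whitespace is introduced only for this illustration.

```sql
-- CR (1), LF (2), space (3), space (4), then 't' at position 5
DECLARE @str varchar(100) = CHAR(13) + CHAR(10) + '  test string';
DECLARE @whitespace varchar(10) = CHAR(32) + CHAR(9) + CHAR(13) + CHAR(10);

SELECT
    PATINDEX('%[^' + @whitespace + ']%', @str) AS FirstNonWhitespacePos,              -- 5
    SUBSTRING(@str, PATINDEX('%[^' + @whitespace + ']%', @str), 1) AS FirstCharacter; -- t
```

If the string consists only of excluded characters, PATINDEX returns 0, so callers may want to handle that case separately.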
How can I check whether any of the special characters !@#$%^&*()_-+ exist in a string?
I tried
SELECT PATINDEX('!@#$%^&*()_-+', 'test-');
SELECT PATINDEX('[!@#$%^&*()_-+]', 'test-');
SELECT PATINDEX('%[!@#$%^&*()_-+]%', 'test-');
but they all return 0 when the result should be 5. Any help?
The - is a special character in a LIKE or PATINDEX() pattern. Inside a bracketed character list, if it is anywhere other than the first position, it denotes a range of characters, such as all digits being represented by [0-9].
You can do what you want by moving the hyphen to the first position:
PATINDEX('%[-!@#$%^&*()_+]%', 'test-')
Unfortunately, PATINDEX() patterns don't support an escape character. You can also express this logic as a LIKE and CASE:
(CASE WHEN 'test-' LIKE '%[-!@#$%^&*()_+]%' ESCAPE '$' THEN 1 ELSE 0 END)
Or using a "not" pattern:
(CASE WHEN 'test-' NOT LIKE '%[^0-9a-zA-Z]%' THEN 0 ELSE 1 END)
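A quick way to convince yourself that the hyphen-first pattern behaves as intended is to run it against a few inputs. The sample strings below are made up for illustration.

```sql
SELECT
    PATINDEX('%[-!@#$%^&*()_+]%', 'test-') AS HyphenAtEnd, -- 5
    PATINDEX('%[-!@#$%^&*()_+]%', 'a+b')   AS PlusSign,    -- 2
    PATINDEX('%[-!@#$%^&*()_+]%', 'plain') AS NoMatch;     -- 0
```

Inside the brackets, % and _ are treated as literal characters, and ^ only negates the list when it is the first character.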
You can use negation:
SELECT PATINDEX('%[^a-Z]%', 'test-');
This will find a character NOT in the range a-Z.
SELECT PATINDEX('%[-+!_@()*^#$%&]%', 'test-');
This solves my issue: it returns 5, the position of -.
Apparently order matters.
DECLARE @myString VARCHAR(100) ='test-'
IF (@myString LIKE '%[^a-zA-Z0-9]%')
PRINT 'Contains "special" characters'
ELSE
PRINT 'Does not contain "special" characters'
select patindex('%[' + char(45) + ']%', 'test-');
However, the '-' symbol does not work as a literal inside a range of values, so replace it with another symbol first.
The next example finds the first printable character in @string:
declare @string varchar(max) = char(8) + char(9) + char(10) + char(11) + char(12) + char(13) + '-test string-';
select patindex('%[' + char(33) + '-' + char(255) + ']%', replace(@string, '-', '#'))
I am trying to search in a string like Dhaka is the capital of Bangladesh, which contains six words. If my search text is cap (the start of the word capital), it should give me the starting index of the search text in the string (14 here). If the search text occurs in the string but not at the start of any word, it should give me 0. Please take a look at the test cases for a better understanding.
What I tried
DECLARE @SearchText VARCHAR(20),
@Str VARCHAR(MAX),
@Result INT
SET @Str = 'Dhaka is the capital of Bangladesh'
SET @SearchText = 'cap'
SET @Result = CASE WHEN @Str LIKE @SearchText + '%'
OR @Str LIKE '% ' + @SearchText + '%'
THEN CHARINDEX(@SearchText, @Str)
ELSE 0 END
PRINT @Result -- print 14 here
In my case, @Str has to be generated by another SQL function. Here, @Str would be generated 3 times, which is costly (I think). So, is there any way to generate @Str only once? [Is that possible using PATINDEX?]
Note: the CASE condition appears in the WHERE clause of my original query, so it is not possible to store the @Str value in a variable and then use it in the WHERE clause.
Test Case
Search Text: Dhaka, Result: 1
Search Text: tal, Result: 0
Search Text: Mirpur, Result: 0
Search Text: isthe, Result: 0
Search Text: is the, Result: 7
Search Text: Dhaka Capital, Result: 0
Simply add a leading space to the strings to ensure that you always find only the beginning of a word:
DECLARE @SearchText VARCHAR(20),
@Str VARCHAR(MAX),
@Result INT
SET @Str = 'Dhaka is the capital of Bangladesh'
SET @SearchText = 'Dhaka Capital'
SET @Result = CHARINDEX(' ' + @SearchText, ' ' + @Str)
PRINT @Result -- prints 0 for 'Dhaka Capital'; prints 14 for 'cap'
I have tested the above query against your test cases and it seems to work.
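For reference, the leading-space approach can be checked against all of the question's test cases in one query. This is only a sketch: the derived table t and its column alias are introduced for illustration, and only a space is treated as a word boundary.

```sql
DECLARE @Str VARCHAR(MAX) = 'Dhaka is the capital of Bangladesh';

-- Prefixing both strings with a space means a match can only start
-- at the beginning of a word (or at the beginning of the string).
SELECT t.SearchText,
       CHARINDEX(' ' + t.SearchText, ' ' + @Str) AS Result
FROM (VALUES ('Dhaka'),          -- 1
             ('tal'),            -- 0
             ('Mirpur'),         -- 0
             ('isthe'),          -- 0
             ('is the'),         -- 7
             ('Dhaka Capital')   -- 0
     ) AS t(SearchText);
```

Because the extra space shifts both the search text and the string by one position, the returned index is already the position of the word in the original string.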
To compute the function only once per row in a SELECT, make it a table-valued function. Or, if that's impossible for some reason, use CROSS APPLY:
SELECT .. a, b
FROM ..
CROSS APPLY (SELECT my_scalar_fn(a,b) as Str) arg
WHERE CASE WHEN arg.Str LIKE SearchText + '%'
OR arg.Str LIKE '% ' + SearchText + '%'
THEN CHARINDEX(SearchText, arg.Str)
ELSE 0 END
I have a large table of data where some of my columns contain line breaks. I would like to remove them and replace them with some spaces instead.
Can anybody tell me how to do this in SQL Server?
Thanks in advance
SELECT REPLACE(REPLACE(@str, CHAR(13), ''), CHAR(10), '')
This should work, depending on how the line breaks are encoded:
update t
set col = replace(col, '
', ' ')
where col like '%
%';
That is, in SQL Server, a string can contain a new line character.
@Gordon's answer should work, but in case you're not sure how your line breaks are encoded, you can use the ASCII function to return the character value. For example:
declare @entry varchar(50) =
'Before break
after break'
declare @max int = len(@entry)
; with CTE as (
select 1 as id
, substring(@entry, 1, 1) as chrctr
, ascii(substring(@entry, 1, 1)) as code
union all
select id + 1
, substring(@entry, ID + 1, 1)
, ascii(substring(@entry, ID + 1, 1))
from CTE
where ID < @max) -- stop at the last character instead of reading one position past the end
select chrctr, code from cte
print replace(replace(@entry, char(13) , ' '), char(10) , ' ')
Depending on where your text is coming from, there are different encodings for a line break. In my test string I put the most common ones.
First I replace all CHAR(10) (line feed) with CHAR(13) (carriage return), then collapse every doubled CR into one CR, and finally replace all CRs with the wanted replacement (you want a blank; I use a dot for better visibility):
Attention: switch the output to "text", otherwise you won't see any line breaks...
DECLARE @text VARCHAR(100)='test single 10' + CHAR(10) + 'test 13 and 10' + CHAR(13) + CHAR(10) + 'test single 13' + CHAR(13) + 'end of test';
SELECT @text
DECLARE @ReplChar CHAR='.';
SELECT REPLACE(REPLACE(REPLACE(@text,CHAR(10),CHAR(13)),CHAR(13)+CHAR(13),CHAR(13)),CHAR(13),@ReplChar);
I had the same issue: a column whose values contain line breaks. I used the query
update `your_table_name` set your_column_name = REPLACE(your_column_name,'\n','')
And this resolved my issue :)
Basically, '\n' is the character for the Enter key or a line break, and in this query I replaced it with an empty string (which is what I wanted).
Keep Learning :)
zain
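The backtick quoting and the '\n' escape in the query above are MySQL conventions. T-SQL string literals do not interpret backslash escapes, so in SQL Server the line-break characters have to be built with CHAR(). A hedged equivalent, reusing the placeholder table and column names from the query above, might look like this:

```sql
-- Placeholder names; replace with your own table and column.
UPDATE your_table_name
SET your_column_name =
    REPLACE(REPLACE(your_column_name, CHAR(13), ''), CHAR(10), '')
WHERE your_column_name LIKE '%' + CHAR(13) + '%'
   OR your_column_name LIKE '%' + CHAR(10) + '%';
```

Use ' ' instead of '' as the replacement if the line breaks should become spaces rather than disappear.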
I have data with leading and trailing spaces in the string. When storing that data in the database, I want to trim the spaces in the query itself before storing it.
Normal spaces are trimmed properly by the RTRIM and LTRIM functions, but if a string contains tab characters, they are not trimmed from the input string.
Can anyone help me trim leading and trailing tab characters from the string?
Replace the ASCII code for tab (9):
replace(@str, char(9), '')
To only remove the outer tabs, first change them to something that won't exist in your data (I use a series of four spaces in this example), then rtrim/ltrim, then convert that same sequence back to tabs:
replace(ltrim(rtrim(replace(@str, char(9), '    '))),'    ', char(9));
Try this:
DECLARE @InputString nvarchar(50) = CHAR(9) + CHAR(9) + ' 123'+ 'abc ' + CHAR(9);
SELECT @InputString AS InputString
,REVERSE(RIGHT(REVERSE(RIGHT(@InputString, LEN(@InputString) - PATINDEX('%[^'+CHAR(9)+']%', @InputString) + 1)), LEN(REVERSE(RIGHT(@InputString, LEN(@InputString) - PATINDEX('%[^'+CHAR(9)+']%', @InputString) + 1))) - PATINDEX('%[^'+CHAR(9)+']%', REVERSE(RIGHT(@InputString, LEN(@InputString) - PATINDEX('%[^'+CHAR(9)+']%', @InputString) + 1))) + 1)) AS OutputString
;
Maybe you should refactor it as a function. Note that it may work only on SQL Server 2008 and above. You can replace CHAR(9) with any character you want to trim.
I'm dumbfounded that this question has not been asked meaningfully already. How does one go about creating an equivalent function in SQL like LTRIM or RTRIM for carriage returns and line feeds ONLY at the start or end of a string.
Obviously REPLACE(REPLACE(@MyString,char(10),''),char(13),'') removes ALL carriage returns and new line feeds. Which is NOT what I'm looking for. I just want to remove leading or trailing ones.
Find the first character that is not CHAR(13) or CHAR(10) and subtract its position from the string's length.
LTRIM()
SELECT RIGHT(@MyString,LEN(@MyString)-PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',@MyString)+1)
RTRIM()
SELECT LEFT(@MyString,LEN(@MyString)-PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',REVERSE(@MyString))+1)
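The two expressions above can also be combined into a single SUBSTRING call. The following is only a sketch of that idea, with a made-up test string. The variable names @Pattern, @First and @LastFromEnd are introduced here for illustration, and DATALENGTH is used instead of LEN on the assumption of a single-byte VARCHAR collation, because LEN ignores trailing spaces.

```sql
DECLARE @MyString VARCHAR(100) =
    CHAR(13) + CHAR(10) + 'line one' + CHAR(13) + CHAR(10) + 'line two' + CHAR(13) + CHAR(10);
DECLARE @Pattern VARCHAR(10) = '%[^' + CHAR(13) + CHAR(10) + ']%';

-- Position of the first character that is not CR/LF, counted from the start and from the end
DECLARE @First INT = PATINDEX(@Pattern, @MyString);
DECLARE @LastFromEnd INT = PATINDEX(@Pattern, REVERSE(@MyString));

SELECT CASE
           WHEN @First = 0 THEN ''  -- string is empty or contains only CR/LF
           ELSE SUBSTRING(@MyString, @First,
                          DATALENGTH(@MyString) - @First - @LastFromEnd + 2)
       END AS Trimmed;
```

For this test string, 22 characters long with @First = 3 and @LastFromEnd = 3, the result is the 18-character middle part, with the inner line break preserved.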
The following functions are enhanced versions of the trim functions. They are copied from sqlauthority.com.
These functions remove trailing spaces, leading spaces, white space, tabs, carriage returns, line feeds etc.
Trim Left
CREATE FUNCTION dbo.LTrimX(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @trimchars VARCHAR(10)
SET @trimchars = CHAR(9)+CHAR(10)+CHAR(13)+CHAR(32)
IF @str LIKE '[' + @trimchars + ']%' SET @str = SUBSTRING(@str, PATINDEX('%[^' + @trimchars + ']%', @str), LEN(@str))
RETURN @str
END
Trim Right
CREATE FUNCTION dbo.RTrimX(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @trimchars VARCHAR(10)
SET @trimchars = CHAR(9)+CHAR(10)+CHAR(13)+CHAR(32)
IF @str LIKE '%[' + @trimchars + ']'
SET @str = REVERSE(dbo.LTrimX(REVERSE(@str)))
RETURN @str
END
Trim both Left and Right
CREATE FUNCTION dbo.TrimX(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX)
AS
BEGIN
RETURN dbo.LTrimX(dbo.RTrimX(@str))
END
Using function
SELECT dbo.TRIMX(@MyString)
If you do use these functions you might also consider changing from varchar to nvarchar to support more encodings.
In SQL Server 2017 you can use the TRIM function to remove specific characters from beginning and end, in one go:
WITH testdata(str) AS (
SELECT CHAR(13) + CHAR(10) + ' test ' + CHAR(13) + CHAR(10)
)
SELECT
str,
TRIM(CHAR(13) + CHAR(10) + CHAR(9) + ' ' FROM str) AS [trim cr/lf/tab/space],
TRIM(CHAR(13) + CHAR(10) FROM str) AS [trim cr/lf],
TRIM(' ' FROM str) AS [trim space]
FROM testdata
Note that the last example (trim space) does nothing as expected since the spaces are in the middle.
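One way to check the effect of each TRIM call is to compare byte lengths before and after. This is a minimal sketch that assumes SQL Server 2017 or later and a single-byte VARCHAR collation. The variable @s is introduced for illustration and holds the same test value as above.

```sql
DECLARE @s VARCHAR(50) = CHAR(13) + CHAR(10) + ' test ' + CHAR(13) + CHAR(10);

SELECT
    DATALENGTH(@s) AS Original,                                                     -- 10
    DATALENGTH(TRIM(CHAR(13) + CHAR(10) + CHAR(9) + ' ' FROM @s)) AS TrimAll,       -- 4
    DATALENGTH(TRIM(CHAR(13) + CHAR(10) FROM @s)) AS TrimCrLf,                      -- 6
    DATALENGTH(TRIM(' ' FROM @s)) AS TrimSpace;                                     -- 10
```

The last value is unchanged because the spaces are shielded from both ends of the string by the CR/LF pairs.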
Here's an example you may run:
I decided to cast the results as an Xml value, so when you click on it, you will be able to view the Carriage Returns.
DECLARE @CRLF Char(2) = (CHAR(0x0D) + CHAR(0x0A))
DECLARE @String VarChar(MAX) = @CRLF + @CRLF + ' Hello' + @CRLF + 'World ' + @CRLF + @CRLF
--Unmodified String:
SELECT CAST(@String as Xml)[Unmodified]
--Remove Trailing Whitespace (including Spaces).
SELECT CAST(LEFT(@String, LEN(REPLACE(@String, @CRLF, '  '))) as Xml)[RemoveTrailingWhitespace]
--Remove Leading Whitespace (including Spaces).
SELECT CAST(RIGHT(@String, LEN(REVERSE(REPLACE(@String, @CRLF, '  ')))) as Xml)[RemoveLeadingWhitespace]
--Remove Leading & Trailing Whitespace (including Spaces).
SELECT CAST(SUBSTRING(@String, LEN(REPLACE(@String, ' ', '_')) - LEN(REVERSE(REPLACE(@String, @CRLF, '  '))) + 1, LEN(LTRIM(RTRIM(REPLACE(@String, @CRLF, '  '))))) as Xml)[RemoveAllWhitespace]
--Remove Only Leading and Trailing CR/LF's (while still preserving all other Whitespace - including Spaces). - 04/06/2016 - MCR.
SELECT CAST(SUBSTRING(@String, PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',@String), LEN(REPLACE(@String, ' ', '_')) - PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%',@String) + 1 - PATINDEX('%[^'+CHAR(13)+CHAR(10)+']%', REVERSE(@String)) + 1) as Xml)[RemoveLeadingAndTrailingCRLFsOnly]
Remember to remove the Cast-to-Xml, as this was done just as a Proof-of-Concept to show it works.
How is this better than the currently Accepted Answer?
At first glance this may appear to use more Functions than the Accepted Answer.
However, this is not the case.
If you combine both approaches listed in the Accepted Answer (to remove both Trailing and Leading whitespace), you will either have to make two passes updating the Record, or copy all of one Logic into the other (everywhere #String is listed), which would cause way more function calls and become even more difficult to read.
I was stuck using Microsoft SQL Server 2008 R2, so, basing my functions on @sqluser's answer, I came up with the below. This will return an empty string if the string only contains the characters to be trimmed.
The bit that threw me was the pattern for PATINDEX must be included between % characters, which for a while I was thinking of as the same wildcard in a LIKE statement but which I now believe is just the syntax to denote a pattern, though I may be wrong!
CREATE FUNCTION [dbo].[ExtendedLTRIM](@string_to_trim VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @tab CHAR(1) = CHAR(9);
DECLARE @line_feed CHAR(1) = CHAR(10);
DECLARE @carriage_return CHAR(1) = CHAR(13);
DECLARE @space CHAR(1) = CHAR(32);
DECLARE @characters_to_trim VARCHAR(10)
SET @characters_to_trim = @tab + @line_feed + @carriage_return + @space
IF @string_to_trim LIKE '[' + @characters_to_trim + ']%'
BEGIN
DECLARE @first_non_trim_character INT = PATINDEX('%[^' + @characters_to_trim + ']%', @string_to_trim);
IF @first_non_trim_character = 0 RETURN '';
RETURN SUBSTRING(@string_to_trim, @first_non_trim_character, 8000)
END
RETURN @string_to_trim
END
GO
To trim characters from a pre-defined list you'll want to create the following UDF (should work in 2008R2 and above).
It handles both sides in a single pass and doesn't care if it's a CRLF, LFCR (yep, seen that abomination more than once), bare LF or a bunch of spaces.
It is easy to extend, e.g. by adding parameters to do LTRIM/RTRIM only, or a full purge (that last bit is simpler to do in 2017 by incorporating STRING_AGG, but perfectly doable in 2008R2); as a matter of fact this is a simplified version of something I use to do all those things. If anybody is interested then let me know and I'll update:
CREATE FUNCTION fnTrimHarder
(
@String VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE
@Start INT,
@Len INT,
-- List of invalid characters, built with + because CONCAT is not available before SQL Server 2012
@Chars CHAR(4) =
CHAR(9) -- TAB
+ CHAR(10) -- LF
+ CHAR(13) -- CR
+ ' ', -- space
@Return VARCHAR(MAX) = '';
IF @String NOT LIKE '%[^' + @Chars + ']%' -- If string contains only invalid characters
OR COALESCE(@String, '') = '' -- Optional addition for NULL handling
RETURN @Return
ELSE
BEGIN -- Create a "table" of characters with ordinals, calculate the start of string and its length, then return the substring
WITH CTE AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1
FROM CTE
WHERE n < LEN(@String)
)
SELECT
@Start = MIN(n),
@Len = 1 + MAX(n) - MIN(n)
FROM CTE
WHERE SUBSTRING(@String, n, 1) NOT LIKE '[' + @Chars + ']';
SET @Return = SUBSTRING(@String, @Start, @Len)
END
RETURN @Return
END
GO