Search count of words within a string using SQL - sql

Database: SQL Server
I have a table column named page_text which contains the following value:
" I Love stackoverflow.com because i can post questions and get
answers at any time. "
Using a SQL query, I want to count how many times I occurs in it. For this string it should return 4.
I should be able to search for anything.

declare @search varchar(10) = 'I'
select len(replace(PageText, @search, @search + '#'))
- len(PageText) as [Count] from YourTable

Based on this code: http://www.sql-server-helper.com/functions/count-character.aspx
you can create a function:
CREATE FUNCTION [dbo].[ufn_CountSpecificWords] ( @pInput VARCHAR(1000), @pWord VARCHAR(1000) )
RETURNS INT
BEGIN
-- Characters removed divided by the length of the padded word gives the number of matches.
-- DATALENGTH is used instead of LEN because LEN ignores trailing spaces.
RETURN (DATALENGTH(@pInput) - DATALENGTH(REPLACE(@pInput, ' ' + @pWord + ' ', ''))) / (DATALENGTH(@pWord) + 2)
END
GO
This, however, assumes that your strings are stored with a leading and trailing space, and that any other separators such as ',' and '.' have been replaced with spaces.
You also need to rephrase your question to say whether you want the count of whole words or just the occurrences of a substring.

SELECT (LEN(@Text) - LEN(REPLACE(@Text, @SearchString, ''))) / LEN(@SearchString)
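LEN ignores trailing spaces, so a LEN-based count can be off when the text or the search string ends in a space. Here is a minimal sketch of the same idea using DATALENGTH, assuming varchar data and a case-insensitive default collation. The sample string is the one from the question and the variable names are only illustrative.

```sql
DECLARE @Text varchar(200) = ' I Love stackoverflow.com because i can post questions and get answers at any time. ';
DECLARE @SearchString varchar(10) = 'I';

-- Bytes removed by REPLACE, divided by the bytes in the search string,
-- gives the number of non-overlapping occurrences.
-- NULLIF avoids a divide-by-zero error when the search string is empty.
SELECT (DATALENGTH(@Text) - DATALENGTH(REPLACE(@Text, @SearchString, '')))
       / NULLIF(DATALENGTH(@SearchString), 0) AS Occurrences;
```

Under a case-insensitive collation this counts both I and i, which is what the question's expected result of 4 implies. Under a case-sensitive collation only the upper-case I would be counted.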

Related

How to create a function to split date and time from a string in SQL?

How can I remove the value before the first '_' and show the date and time in one row in a T-SQL function?
Below is a sample:
Declare @inputstring as varchar(50) = 'Studio9_20230126_203052' ;
select value from STRING_SPLIT( @inputstring ,'_')
Output Required: 2023-01-26 20:30:52.000
If we can safely assume that the value is always in the format {Some String}_{yyyyMMdd}_{hhmmss} then you can use STUFF a few times, firstly to remove the leading string up to the first underscore (_) character (using CHARINDEX to find that character), and then to inject 2 colon (:) characters. Finally you can REPLACE the remaining underscore with a space ( ), and then use TRY_CONVERT to attempt to convert the value to a datetime2(0).
DECLARE @inputstring varchar(50) = 'Studio9_20230126_203052';
SELECT TRY_CONVERT(datetime2(0),REPLACE(STUFF(STUFF(STUFF(@inputstring,1,CHARINDEX('_',@inputstring),''),14,0,':'),12,0,':'),'_',' '));
Note that this doesn't give the value you state you want in your question (2023-01-26 20:05:52.000) , but I assume this is a typographical error, and that the 05 for minutes should be 30.
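To make the nested STUFF calls easier to follow, here is a sketch that exposes each intermediate value as its own column. The Step1 to Step4 aliases are made up for illustration; the logic is the same as the one-liner above.

```sql
DECLARE @inputstring varchar(50) = 'Studio9_20230126_203052';

SELECT s1.Step1, s2.Step2, s3.Step3, s4.Step4,
       TRY_CONVERT(datetime2(0), s4.Step4) AS Result
FROM (VALUES (STUFF(@inputstring, 1, CHARINDEX('_', @inputstring), ''))) AS s1(Step1) -- '20230126_203052'   (prefix removed)
CROSS APPLY (VALUES (STUFF(s1.Step1, 14, 0, ':'))) AS s2(Step2)                      -- '20230126_2030:52'  (colon before seconds)
CROSS APPLY (VALUES (STUFF(s2.Step2, 12, 0, ':'))) AS s3(Step3)                      -- '20230126_20:30:52' (colon before minutes)
CROSS APPLY (VALUES (REPLACE(s3.Step3, '_', ' '))) AS s4(Step4);                     -- '20230126 20:30:52' (underscore to space)
```

The colons are inserted from right to left so that the first insertion does not shift the position used by the second.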
Creating function
CREATE FUNCTION [dbo].[convert_to_date] (@inputstring NVARCHAR(MAX))
RETURNS DATETIME AS
BEGIN
DECLARE @finalString varchar(50), @out varchar(100)
SET @finalString = REPLACE ( (SUBSTRING (@inputstring, CHARINDEX('_', @inputstring)+1 , LEN(@inputstring))), '_', ' ')
--SELECT @finalString
SET @out = LEFT (@finalString, 4) + '-'
+ SUBSTRING(@finalString, 5, 2) + '-'
+ SUBSTRING(@finalString, 7, 2) + ' '
+ SUBSTRING(@finalString, 10, 2) + ':'
+ SUBSTRING(@finalString, 12, 2) + ':'
+ SUBSTRING(@finalString, 14, 2) + '.000'
RETURN @out
END
END
Select Query
SELECT dbo.[convert_to_date] ('Studio54541659_20230126_203052')
Output
2023-01-26 20:30:52.000
This will tolerate "somestring" in the format of "somestring_YYYYMMDD_HHMISS" being variable in length.
Declare @inputstring as varchar(50) = 'Studio9_20230126_203052' ;
SELECT DateAndTime = CONVERT(DATETIME,STUFF(STUFF(STUFF(v2.DT,14,0,':'),12,0,':'),9,1,' '))
,Identifier = LEFT(@inputstring,v1.Pos1-1) --Included this because I know how people are :D --Comment out if not wanted.
,Original = @inputstring --Original string just for checking. Comment out when happy.
FROM (VALUES(CHARINDEX('_',@inputstring)))v1(Pos1) --Position of first Underscore
CROSS APPLY (VALUES(SUBSTRING(@inputstring,v1.Pos1+1,50)))v2(DT) --String after first Underscore
;
You end up with a DATETIME datatype. Comment out what you don't want for columns in the return.
I'll let you have some of the fun by converting it into an iTVF (inline Table Valued Function). Remember that any function that contains a "BEGIN" is ultimately going to be a part of a performance issue so make sure it's an iTVF :D
EDIT: Crud... I've gotta remember to scroll down. @Lamu already posted the same thing but it's probably better and faster if you just want the time and not the identifier I included.
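For anyone who would rather not do the conversion themselves, here is one possible iTVF sketch of the query above. The function name dbo.ParseIdentifierDateTime is hypothetical, and two things differ from the original query as deliberate assumptions: TRY_CONVERT (SQL Server 2012 and later) is used so that malformed input yields NULL instead of an error, and NULLIF guards against strings with no underscore.

```sql
CREATE FUNCTION dbo.ParseIdentifierDateTime (@inputstring varchar(50))
RETURNS TABLE
AS
RETURN
    SELECT DateAndTime = TRY_CONVERT(datetime, STUFF(STUFF(STUFF(v2.DT, 14, 0, ':'), 12, 0, ':'), 9, 1, ' '))
          ,Identifier  = LEFT(@inputstring, v1.Pos1 - 1)
    -- NULLIF turns "no underscore found" (0) into NULL, so both columns come back NULL
    FROM (VALUES (NULLIF(CHARINDEX('_', @inputstring), 0))) v1(Pos1)
    CROSS APPLY (VALUES (SUBSTRING(@inputstring, v1.Pos1 + 1, 50))) v2(DT);
GO

-- Usage: call it like a table, or CROSS APPLY it against a column
SELECT f.DateAndTime, f.Identifier
FROM dbo.ParseIdentifierDateTime('Studio9_20230126_203052') AS f;
```

Because the body is a single SELECT with no BEGIN/END, the optimizer can inline it into the calling query.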

Identify Hidden Characters

In my SQL tables I have text containing hidden characters which are only visible when I copy and paste it into Notepad++.
How can I find the rows which have hidden characters using SQL Server queries?
I have tried comparing the lengths using datalength and len
it did not work.
DATALENGTH(name) AS BinaryLength != LEN(name)
I want the row which has hidden characters.
This assumes the problem is caused by control characters, some of which are invisible. The same range also includes tabs and newlines. Here is an example to illustrate the problem and how to make the affected rows show up.
--DROP TABLE #SillyTemp
DECLARE @InvisibleChar1 NCHAR(1) = NCHAR(28), @InvisibleChar2 NCHAR(1) = NCHAR(30), @NonControlChar NCHAR(1) = NCHAR(33);
DECLARE @InputString NVARCHAR(500) = N'Some |' + @InvisibleChar1 +'| random string |' + @InvisibleChar2 + '|' + '; Thank god Finally a normal character |' + @NonControlChar + '|';
SELECT @InputString AS OhNoInvisibleCharacters
DECLARE @ControlCharRange NVARCHAR(50) = N'%[' + NCHAR(1) + '-' + NCHAR(31) + ']%';
CREATE TABLE #SillyTemp
(
input nvarchar(500)
)
INSERT INTO #SillyTemp(input)
VALUES (@InputString),(N'A normal string')
SELECT @ControlCharRange;
SELECT input FROM #SillyTemp WHERE input LIKE @ControlCharRange;
This produces 3 result sets. The first is the string with the invisible characters in it, like so:
Some || random string ||; Thank god Finally a normal character |!|
Note that they are actually invisible inside SQL Server, although Stack Overflow shows them. The output in SQL Server is simply:
Some || random string ||; Thank god Finally a normal character |!|
But these characters still have a corresponding (N)CHAR(X) value. (N)CHAR(0) is the NULL character and is highly unlikely to be in a string; in my setup it also causes problems when building a range, so the range starts at 1. (N)CHAR(32) is the ' ' space character.
The way the [X-Y] string operator works is also based on the (N)CHAR numbers. Therefore we can make a range of [NCHAR(1)-NCHAR(31)].
The last select goes through the temporary table, one which has invisible characters. Since we're looking for any NCHARS between 1 and 31, only those with invisible characters (and often invalid characters or tabs/newlines) satisfy the where condition. Thus only they get returned. In this case only the 'faulty' string gets returned in my select statement.
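Once the offending rows are found, it usually helps to know which character it is and where it sits. Here is a small sketch along the same lines, using PATINDEX for the position and UNICODE for the code point. The sample rows are made up, and the explicit binary collation (Latin1_General_BIN2) is my own assumption to make sure the range is evaluated by code point rather than by the column's linguistic sort order.

```sql
DECLARE @ControlCharRange nvarchar(50) = N'%[' + NCHAR(1) + N'-' + NCHAR(31) + N']%';

SELECT t.input
      ,p.Pos AS FirstControlCharPosition
      ,UNICODE(SUBSTRING(t.input, p.Pos, 1)) AS FirstControlCharCode
FROM (VALUES (N'Some |' + NCHAR(28) + N'| random string')   -- invisible separator character
            ,(N'Tab' + NCHAR(9) + N'separated')             -- tab character
            ,(N'A normal string')) AS t(input)
-- PATINDEX returns 0 when nothing in the range is present
CROSS APPLY (VALUES (PATINDEX(@ControlCharRange, t.input COLLATE Latin1_General_BIN2))) AS p(Pos)
WHERE p.Pos > 0;
```

Only the first two sample rows should come back, each with the position and code of its first control character. Looking the code up in an ASCII table tells you whether it is something harmless like a tab (9) or line feed (10).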

Check a word starting with specific string [SQL Server]

I am trying to search in a string like Dhaka is the capital of Bangladesh, which contains six words. If my search text is cap (the start of capital), it should give me the starting index of the search text in the string (14 here). If the search text occurs in the string but not at the start of any word, it should give me 0. Please take a look at the test cases for a better understanding.
What I tried
DECLARE @SearchText VARCHAR(20),
@Str VARCHAR(MAX),
@Result INT
SET @Str = 'Dhaka is the capital of Bangladesh'
SET @SearchText = 'cap'
SET @Result = CASE WHEN @Str LIKE @SearchText + '%'
OR @Str LIKE '% ' + @SearchText + '%'
THEN CHARINDEX(@SearchText, @Str)
ELSE 0 END
PRINT @Result -- print 14 here
In my case, I need to generate @Str with another SQL function. Here, @Str has to be generated 3 times, which is costly (I think). So, is there any way to generate @Str only once? [Is that possible by using PATINDEX?]
Note: The CASE condition appears in the WHERE clause of my original query, so it is not possible to set the @Str value in a variable and then use it in the WHERE clause.
Test Case
Search Text: Dhaka, Result: 1
Search Text: tal, Result: 0
Search Text: Mirpur, Result: 0
Search Text: isthe, Result: 0
Search Text: is the, Result: 7
Search Text: Dhaka Capital, Result: 0
Simply add a leading space to the strings to ensure that you always find only the beginning of a word:
DECLARE @SearchText VARCHAR(20),
@Str VARCHAR(MAX),
@Result INT
SET @Str = 'Dhaka is the capital of Bangladesh'
SET @SearchText = 'Dhaka Capital'
SET @Result = CHARINDEX(' ' + @SearchText, ' ' + @Str)
PRINT @Result -- prints 0 here, as the test case expects
I have tested the above query against your test cases and it seems to work.
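For reference, here is a sketch that runs every test case from the question in one go, next to the expected value. The Expected column is copied from the question's test cases plus the cap example; the derived table and its aliases are only for illustration.

```sql
DECLARE @Str varchar(max) = 'Dhaka is the capital of Bangladesh';

SELECT t.SearchText
      ,t.Expected
       -- The leading space on both sides means a match can only begin at the start of a word,
       -- and the position found in the padded string equals the word's position in the original.
      ,CHARINDEX(' ' + t.SearchText, ' ' + @Str) AS Result
FROM (VALUES ('cap', 14)
            ,('Dhaka', 1)
            ,('tal', 0)
            ,('Mirpur', 0)
            ,('isthe', 0)
            ,('is the', 7)
            ,('Dhaka Capital', 0)) AS t(SearchText, Expected);
```

Note that @Str appears only once in the expression, which also addresses the concern about generating it several times.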
To compute the function only once per row in the SELECT, make it a table-valued function. Or, if that is impossible for some reason, use CROSS APPLY:
SELECT .. a, b,
FROM ..
CROSS APPLY (SELECT my_scalar_fn(a,b) as Str) arg
WHERE CASE WHEN arg.Str LIKE SearchText + '%'
OR arg.Str LIKE '% ' + SearchText + '%'
THEN CHARINDEX(SearchText, arg.Str)
ELSE 0 END > 0

Trim String After Keyword

I have a column that contains status changes, but I don't want to return the whole string. Is there any way to return just a part of a string after a certain keyword? Every value of the column is in the format of From X to Y where X and Y could be a single word or multiple words. I've looked at the substring and trim functions, but those seem to require knowledge of how many spaces you want to keep.
Edit: I want to keep part Y from the status and get rid of 'From X to'.
You can use a combination of Charindex and Substring and Len to do it.
Try this:
select SUBSTRING(field,charindex('keyword',field), LEN('keyword'))
So this will find Flop and extract it wherever it is in the field
select SUBSTRING('bullflop',charindex('flop','bullflop'), LEN('flop'))
EDIT:
To get the remainder then just set LEN to the field LEN(field)
declare @field varchar(200)
set @field = 'this is bullflop and other such junk'
select SUBSTRING(@field,charindex('flop',@field), LEN(@field) )
EDIT 2:
Now I understand, here is a quick and dirty version...
declare @field varchar(200)
set @field = 'From X to Y'
select Replace(SUBSTRING(@field,charindex('to ',@field), LEN(@field) ), 'to ','')
Returns:
Y
EDIT 3:
Cory is right, this is cleaner.
declare @field varchar(200) = 'From X to Y'
declare @keyword varchar(200) = 'to '
select SUBSTRING(@field,charindex(@keyword,@field) + LEN(@keyword), LEN(@field) )
Other answers are fine, but I like the STUFF() function and it doesn't seem to be well-known, so here's another option:
DECLARE @field VARCHAR(50) = 'From Authorized to Auth Not Needed'
,@keyword VARCHAR(50) = ' to '
SELECT STUFF(@field,1,CHARINDEX(@keyword,@field)+LEN(@keyword),'')
STUFF() is like SUBSTRING() and REPLACE() combined, you feed it a string, a start position and a length, and can replace that with anything or in your case, nothing ''.
From MSDN:
STUFF ( character_expression , start , length , replaceWith_expression )
You can combine a few string functions to do what you want:
DECLARE @Field varchar(100) = 'From A to Z'
DECLARE @Keyword varchar(100) = 'to'
-- Method 1 (Find the keyword, then take the remainder of the string)
SELECT LTRIM(SUBSTRING(@Field,
CHARINDEX(@Keyword, @Field, 0) + LEN(@Keyword), LEN(@Field)))
EDIT:
-- Method 2 (Take from the right the characters up to the keyword)
SELECT RIGHT(@Field, LEN(@Field) - CHARINDEX(@Keyword, @Field, 0) - LEN(@Keyword))
Produces:
'Z'
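A few of the snippets above depend on how LEN treats the trailing space in the keyword, and searching for 'to' alone can match inside a word such as Auto. Here is a hedged sketch that searches for ' to ' with both spaces and measures the keyword with DATALENGTH, assuming varchar data. The StatusText column and the sample values are made up for illustration.

```sql
DECLARE @keyword varchar(10) = ' to ';

SELECT s.StatusText
      ,CASE WHEN k.Pos > 0
            -- DATALENGTH counts the trailing space in the keyword; LEN would not.
            THEN SUBSTRING(s.StatusText, k.Pos + DATALENGTH(@keyword), LEN(s.StatusText))
            ELSE s.StatusText   -- keyword not found: return the value unchanged
       END AS NewStatus
FROM (VALUES ('From Authorized to Auth Not Needed')
            ,('From X to Y')
            ,('Unchanged')) AS s(StatusText)
CROSS APPLY (VALUES (CHARINDEX(@keyword, s.StatusText))) AS k(Pos);
```

This still takes the first ' to ' it finds, so it will cut in the wrong place if the X part itself contains the word to.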

How can I remove leading and trailing quotes in SQL Server?

I have a table in a SQL Server database with an NTEXT column. This column may contain data that is enclosed with double quotes. When I query for this column, I want to remove these leading and trailing quotes.
For example:
"this is a test message"
should become
this is a test message
I know of the LTRIM and RTRIM functions but these work only for spaces. Any suggestions on which functions I can use to achieve this?
I have just tested this code in MS SQL 2008 and validated it.
Remove left-most quote:
UPDATE MyTable
SET FieldName = SUBSTRING(FieldName, 2, LEN(FieldName))
WHERE LEFT(FieldName, 1) = '"'
Remove right-most quote: (Revised to avoid error from implicit type conversion to int)
UPDATE MyTable
SET FieldName = SUBSTRING(FieldName, 1, LEN(FieldName)-1)
WHERE RIGHT(FieldName, 1) = '"'
I thought this is a simpler script if you want to remove all quotes
UPDATE Table_Name
SET col_name = REPLACE(col_name, '"', '')
You can simply use the REPLACE function in SQL Server, like this:
select REPLACE('"this is a test message"','"','')
Note: the second parameter here is a double quote inside two single quotes, and the third parameter is simply two single quotes (an empty string). The idea is to replace the double quotes with nothing.
Very simple and easy to execute!
My solution is to use the difference between the column value's length and the length of the same value with the double quotes replaced with spaces and trimmed, in order to calculate the start and length values passed to the SUBSTRING function.
The advantage of doing it this way is that you can remove any leading or trailing character even if it occurs multiple times whilst leaving any characters that are contained within the text.
Here is my answer with some test data:
SELECT
x AS before
,SUBSTRING(x
,LEN(x) - (LEN(LTRIM(REPLACE(x, '"', ' ')) + '|') - 1) + 1 --start_pos
,LEN(LTRIM(REPLACE(x, '"', ' '))) --length
) AS after
FROM
(
SELECT 'test' AS x UNION ALL
SELECT '"' AS x UNION ALL
SELECT '"test' AS x UNION ALL
SELECT 'test"' AS x UNION ALL
SELECT '"test"' AS x UNION ALL
SELECT '""test' AS x UNION ALL
SELECT 'test""' AS x UNION ALL
SELECT '""test""' AS x UNION ALL
SELECT '"te"st"' AS x UNION ALL
SELECT 'te"st' AS x
) a
Which produces the following results:
before      after
-----------------
test        test
"
"test       test
test"       test
"test"      test
""test      test
test""      test
""test""    test
"te"st"     te"st
te"st       te"st
One thing to note is that when getting the length I only need to use LTRIM and not LTRIM and RTRIM combined; this is because the LEN function does not count trailing spaces.
I know this is an older question post, but my daughter came to me with the question, and referenced this page as having possible answers. Given that she's hunting an answer for this, it's a safe assumption others might still be as well.
All are great approaches, and as with everything there are about as many ways to skin a cat as there are cats to skin.
If you're looking for a left trim and a right trim of a character or string, and your trailing character/string is uniform in length, here's my suggestion:
SELECT SUBSTRING(ColName,VAR, LEN(ColName)-VAR)
Or in this question...
SELECT SUBSTRING('"this is a test message"',2, LEN('"this is a test message"')-2)
With this, you simply adjust the SUBSTRING starting point (2), and LEN position (-2) to whatever value you need to remove from your string.
It's non-iterative and doesn't require explicit case testing and above all it's inline all of which make for a cleaner execution plan.
The following script removes quotation marks only from around the column value if table is called [Messages] and the column is called [Description].
-- If the content is in the form of "anything" (LIKE '"%"')
-- Then take the whole text without the first and last characters
-- (from the 2nd character and the LEN([Description]) - 2th character)
UPDATE [Messages]
SET [Description] = SUBSTRING([Description], 2, LEN([Description]) - 2)
WHERE [Description] LIKE '"%"'
You can use following query which worked for me-
For updating-
UPDATE table SET colName= REPLACE(LTRIM(RTRIM(REPLACE(colName, '"', ''))), '', '"') WHERE...
For selecting-
SELECT REPLACE(LTRIM(RTRIM(REPLACE(colName, '"', ''))), '', '"') FROM TableName
You could replace the quotes with an empty string...
SELECT AllRemoved = REPLACE(CAST(MyColumn AS varchar(max)), '"', ''),
LeadingAndTrailingRemoved = CASE
WHEN MyColumn like '"%"' THEN SUBSTRING(MyColumn, 2, LEN(CAST(MyColumn AS nvarchar(max)))-2)
ELSE MyColumn
END
FROM MyTable
Some UDFs for re-usability.
Left Trimming by character (any number)
CREATE FUNCTION [dbo].[LTRIMCHAR] (@Input NVARCHAR(max), @TrimChar CHAR(1) = ',')
RETURNS NVARCHAR(max)
AS
BEGIN
RETURN REPLACE(REPLACE(LTRIM(REPLACE(REPLACE(@Input,' ','¦'), @TrimChar, ' ')), ' ', @TrimChar),'¦',' ')
END
Right Trimming by character (any number)
CREATE FUNCTION [dbo].[RTRIMCHAR] (@Input NVARCHAR(max), @TrimChar CHAR(1) = ',')
RETURNS NVARCHAR(max)
AS
BEGIN
RETURN REPLACE(REPLACE(RTRIM(REPLACE(REPLACE(@Input,' ','¦'), @TrimChar, ' ')), ' ', @TrimChar),'¦',' ')
END
Note the dummy character '¦' (Alt+0166) cannot be present in the data (you may wish to test your input string, first, if unsure or use a different character).
To remove both quotes you could do this:
SUBSTRING(fieldName, 2, LEN(fieldName) - 2)
You can either assign or project the resulting value.
You can use TRIM('"' FROM '"this "is" a test"') which returns: this "is" a test
CREATE FUNCTION dbo.TRIM(@String VARCHAR(MAX), @Char varchar(5))
RETURNS VARCHAR(MAX)
BEGIN
RETURN SUBSTRING(@String,PATINDEX('%[^' + @Char + ' ]%',@String)
,(DATALENGTH(@String)+2 - (PATINDEX('%[^' + @Char + ' ]%'
,REVERSE(@String)) + PATINDEX('%[^' + @Char + ' ]%',@String)
)))
END
GO
Select dbo.TRIM('"this is a test message"','"')
Reference : http://raresql.com/2013/05/20/sql-server-trim-how-to-remove-leading-and-trailing-charactersspaces-from-string/
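If you are on SQL Server 2017 or later, the built-in TRIM with a character list does the same job without a user-defined function. Here is a minimal sketch, with sample values made up for illustration:

```sql
SELECT t.x AS before
       -- Removes every leading and trailing double quote, but leaves interior ones alone
      ,TRIM('"' FROM t.x) AS after
FROM (VALUES ('"this is a test message"')
            ,('"this "is" a test"')
            ,('""test""')
            ,('te"st')) AS t(x);
```

For an NTEXT column you would most likely need to CAST it to nvarchar(max) first, since many string functions do not accept NTEXT directly.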
I use this:
UPDATE DataImport
SET PRIO =
CASE WHEN LEN(PRIO) < 2
THEN
(CASE PRIO WHEN '""' THEN '' ELSE PRIO END)
ELSE REPLACE(PRIO, '"' + SUBSTRING(PRIO, 2, LEN(PRIO) - 2) + '"',
SUBSTRING(PRIO, 2, LEN(PRIO) - 2))
END
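Try this:
-- nvarchar without a length defaults to 30 characters inside CAST, so use nvarchar(max) to avoid truncation
SELECT left(right(cast(SampleText as nvarchar(max)),LEN(cast(SampleText as nvarchar(max)))-1),LEN(cast(SampleText as nvarchar(max)))-2)
FROM TableName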