How to identify and redact all instances of a matching pattern in T-SQL

How to identify and redact all instances of a matching pattern in T-SQL - sql

I have a requirement to run a function over certain fields to identify and redact any numbers which are 5 digits or longer, ensuring all but the last 4 digits are replaced with *
For example: "Some text with 12345 and 1234 and 12345678" would become "Some text with *2345 and 1234 and ****5678"
I've used PATINDEX to identify the the starting character of the pattern:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', TEST_TEXT)
I can recursively call that to get the starting character of all the occurrences, but I'm struggling with the actual redaction.
Does anyone have any pointers on how this can be done? I know to use REPLACE to insert the *s where they need to be, it's just the identification of what I should actually be replacing I'm struggling with.
Could do it on a program, but I need it to be T-SQL (can be a function if needed).
Any tips greatly appreciated!

You can do this using the built in functions of SQL Server. All of which used in this example are present in SQL Server 2008 and higher.
DECLARE #String VARCHAR(500) = 'Example Input: 1234567890, 1234, 12345, 123456, 1234567, 123asd456'
DECLARE #StartPos INT = 1, #EndPos INT = 1;
DECLARE #Input VARCHAR(500) = ISNULL(#String, '') + ' '; --Sets input field and adds a control character at the end to make the loop easier.
DECLARE #OutputString VARCHAR(500) = ''; --Initalize an empty string to avoid string null errors
WHILE (#StartPOS <> 0)
BEGIN
SET #StartPOS = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', #Input);
IF #StartPOS <> 0
BEGIN
SET #OutputString += SUBSTRING(#Input, 1, #StartPOS - 1); --Seperate all contents before the first occurance of our filter
SET #Input = SUBSTRING(#Input, #StartPOS, 500); --Cut the entire string to the end. Last value must be greater than the original string length to simply cut it all.
SET #EndPos = (PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', #Input)); --First occurance of 4 numbers with a not number behind it.
SET #Input = STUFF(#Input, 1, (#EndPos - 1), REPLICATE('*', (#EndPos - 1))); --#EndPos - 1 gives us the amount of chars we want to replace.
END
END
SET #OutputString += #Input; --Append the last element
SET #OutputString = LEFT(#OutputString, LEN(#OutputString))
SELECT #OutputString;
Which outputs the following:
Example Input: ******7890, 1234, *2345, **3456, ***4567, 123asd456
This entire code could also be made as a function since it only requires an input text.

A dirty solution with recursive CTE
DECLARE
#tags nvarchar(max) = N'Some text with 12345 and 1234 and 12345678',
#c nchar(1) = N' ';
;
WITH Process (s, i)
as
(
SELECT #tags, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', #tags)
UNION ALL
SELECT value, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', value)
FROM
(SELECT SUBSTRING(s,0,i)+'*'+SUBSTRING(s,i+4,len(s)) value
FROM Process
WHERE i >0) calc
-- we surround the value and the string with leading/trailing ,
-- so that cloth isn't a false positive for clothing
)
SELECT * FROM Process
WHERE i=0
I think a better solution it's to add clr function in Ms SQL Server to manage regexp.
sql-clr/RegEx

Here is an option using the DelimitedSplit8K_LEAD which can be found here. https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2 This is an extension of Jeff Moden's splitter that is even a little bit faster than the original. The big advantage this splitter has over most of the others is that it returns the ordinal position of each element. One caveat to this is that I am using a space to split on based on your sample data. If you had numbers crammed in the middle of other characters this will ignore them. That may be good or bad depending on you specific requirements.
declare #Something varchar(100) = 'Some text with 12345 and 1234 and 12345678';
with MyCTE as
(
select x.ItemNumber
, Result = isnull(case when TRY_CONVERT(bigint, x.Item) is not null then isnull(replicate('*', len(convert(varchar(20), TRY_CONVERT(bigint, x.Item))) - 4), '') + right(convert(varchar(20), TRY_CONVERT(bigint, x.Item)), 4) end, x.Item)
from dbo.DelimitedSplit8K_LEAD(#Something, ' ') x
)
select Output = stuff((select ' ' + Result
from MyCTE
order by ItemNumber
FOR XML PATH('')), 1, 1, '')
This produces: Some text with *2345 and 1234 and ****5678

Related

SSMS replace all commas outside of quotation marks in string

I've written the following function in SSMS to replace any commas that are outside of quotation marks with ||||:
CREATE FUNCTION dbo.fixqualifier (#string nvarchar(max))
returns nvarchar(max)
as begin
DECLARE #STRINGTOPAD NVARCHAR(MAX)
DECLARE #position int = 1,#newstring nvarchar(max) ='',#QUOTATIONMODE INT = 0
WHILE(LEN(#string)>0)
BEGIN
SET #STRINGTOPAD = SUBSTRING(#string,0,IIF(#STRING LIKE '%"%',CHARINDEX('"',#string),LEN(#STRING)))
SET #newstring = #newstring + IIF(#QUOTATIONMODE = 1, REPLACE(#STRINGTOPAD,',','||||'),#STRINGTOPAD)
SET #QUOTATIONMODE = IIF(#QUOTATIONMODE = 1,0,1)
set #string = SUBSTRING(#string,1+IIF(#STRING LIKE '%"%',CHARINDEX('"',#string),LEN(#STRING)),LEN(#string))
END
return #newstring
end
The idea is for the function to find the first ", replace all ',' before that then switch to quotation mode 1 so it knows to not replace the , until it changes back to quotation mode 0 when it hits the 2nd " and so on.
so for example the string:
qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl
would become:
qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
It works as expected but it's really inefficient when it comes to doing this for several thousand rows.
Is there a better way or doing this or at least speeding the function up.

Do a simple trick by Modulus
DECLARE #VAR VARCHAR(100) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl'
,#OUTPUT VARCHAR(100) = '';
SELECT #OUTPUT = #OUTPUT + CASE WHEN (LEN(#OUTPUT) - LEN(REPLACE(#OUTPUT, '"', ''))) % 2 = 0
THEN REPLACE(VAL, ',', '||||') ELSE VAL END
FROM (
SELECT SUBSTRING(#VAR, NUMBER, 1) VAL
FROM master.dbo.spt_values
WHERE type = 'P'
AND NUMBER BETWEEN 1 AND LEN(#VAR)
) A
PRINT #OUTPUT
Result:
qwer||||tyu||||io||||asd||||"edffs,asdfgh"||||"jjkzx"||||kl
By this LEN(#OUTPUT) - LEN(REPLACE(#OUTPUT, '"', '')) expression, you will get count of ". By taking Modulus of the count %2, if it is zero its even then you can replace commas, otherwise you will keep them.

This uses DelimitedSplit8k and completely avoids any RBAR methods (such as a WHILE or #Variable = #Variable +... (which is a hidden form of RBAR)).
It firstly splits on the quotation, and then on the commas, where the string isn't quoted. Finally it then puts the strings back together again, using the "old" STUFF and FOR XML PATH method:
USE Sandbox;
DECLARE #String varchar(8000) = 'qwer,tyu,io,asd,"edffs,asdfgh","jjkzx",kl';
WITH Splits AS(
SELECT QS.ItemNumber AS QuoteNumber, CS.ItemNumber AS CommaNumber, ISNULL(CS.Item, '"' + QS.Item + '"') AS DelimitedItem
FROM dbo.DelimitedSplit8K(#string,'"') QS
OUTER APPLY (SELECT *
FROM dbo.DelimitedSplit8K(QS.Item,',')
WHERE QS.ItemNumber % 2 = 1) CS
WHERE QS.Item <> ',')
SELECT STUFF((SELECT '||||' + S.DelimitedItem
FROM Splits S
ORDER BY S.QuoteNumber, S.CommaNumber
FOR XML PATH('')),1,1,'') AS DelimitedList;
(Note, DelimitedSplit8K does not accept more than 8,000 characters. If you have more than that, SQL Server is really not the right tool. STRING_SPLIT does not provide the ordinal position, so you would be unable to guarantee the rebuild order with it.)

Extract substring from string if certain characters exists SQL

I have a string:
DECLARE #UserComment AS VARCHAR(1000) = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274'
Is there a way for me to extract everything from the string after 'Inspectionid: ' leaving me just the InspectionID to save into a variable?

Your example doesn't quite work correctly. You defined your variable as varchar(100) but there are more characters in your string than that.
This should work based on your sample data.
DECLARE #UserComment AS VARCHAR(1000) = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274'
select right(#UserComment, case when charindex('Inspectionid: ', #UserComment, 0) > 0 then len(#UserComment) - charindex('Inspectionid: ', #UserComment, 0) - 13 else len(#UserComment) end)

I would do this as:
select stuff(#UserComment, 1, charindex(':Inspectionid: ', #UserComment) + 14, '')
This works even if the string is not found -- although it will return the whole string. To get an empty string in this case:
select stuff(#UserComment, 1, charindex(':Inspectionid: ', #UserComment + ':Inspectionid: ') + 14, '')

Firstly, let me say that your #UserComment variable is not long enough to contain the text you're putting into it. Increase the size of that first.
The SQL below will extract the value:
DECLARE #UserComment AS VARCHAR(1000); SET #UserComment = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274'
DECLARE #pos int
DECLARE #InspectionId int
DECLARE #IdToFind varchar(100)
SET #IdToFind = 'Inspectionid: '
SET #pos = CHARINDEX(#IdToFind, #UserComment)
IF #pos > 0
BEGIN
SET #InspectionId = CAST(SUBSTRING(#UserComment, #pos+LEN(#IdToFind)+1, (LEN(#UserComment) - #pos) + 1) AS INT)
PRINT #InspectionId
END
You could make the above code into a SQL function if necessary.

If the Inspection ID is always 5 digits then the last argument for the Substring function (length) can be 5, i.e.
SELECT SUBSTRING(#UserComment,PATINDEX('%Inspectionid:%',#UserComment)+14,5)
If the Inspection ID varies (but is always at the end - which your question slightly implies), then the last argument can be derived by subtracting the position of 'InspectionID:' from the overall length of the string. Like this:
SELECT SUBSTRING(#UserComment,PATINDEX('%Inspectionid:%',#UserComment)+14,LEN(#usercomment)-(PATINDEX('%Inspectionid:%',#UserComment)+13))

T-SQL How to create function that compares string, checks difference, and do special function

First - sorry for my english. Second - i'm learning t-SQL.
Goal:
I want to get difference between two strings, then check in which column is this difference. If the difference is in first column, do something, if in second column - do something else.
What I'm actually doing
Column 'messages' is a string which contains list of ID. So i am replacing all '#' with ',' and deleting last ',' what gives to me ActualID and BeforeID column. See below:
DECLARE #string VARCHAR(512);
DECLARE #string2 VARCHAR(512);
DECLARE #string3 VARCHAR(512);
SET #string = '41#42#43#44#45#46#47#48#49#50#51#52#53#54#55#56#57#58#59#';
SET #string2 = REPLACE((SELECT messages FROM USERS WHERE userid = 4), '#', ', ' )
SET #string3 = left(#string2, len(#string2) - 1);
SET #string2 = REPLACE(#string, '#', ', ' )
SET #string = left(#string2, len(#string2) - 1);
SELECT #string3 as ActualID, #string as BeforeID
So now, I want compare BeforeID with ActualID. For example:
In BeforeID we have 1, 2, 3 / In ActualID 1, 2, 3, 4
In example above 4 was added. So, if it was added I want to add it to #AddedElements.
If 4, 5, 7 were added then SELECT #AddedElements as AddedElements should return 4, 5, 7 (With comas)
But, that's not all.
If BeforeID = 1, 5, 10, 14 and ActualID = 1, 5, 14 I want, that element which is in BeforeID, but not in AcutalID will be added to #DeletedElements.
So SELECT #DeletedElements as DeletedElements should return 10
Added elements/Deleted elements should be returned once. I mean, full result what I want to Earn should be
SELECT #AddedElements as AddedElements, #DeletedElements as DeletedElements
Is it possible? If, then how to do it?

First of all, I have to start by saying that this is just poor design; but having said that, I've also found myself in all kinds of situations where I couldn't change the way things worked, only try to make them work better in the current configuration. Therefore, I recommend something like this:
1: Create a UDF (User-Defined Function) that can handle splitting the strings and returning them in table-formed data that you can work with:
CREATE FUNCTION [dbo].[UDF_StringDelimiter]
/*********************************************************
** Takes Parameter "LIST" and transforms it for use **
** to select individual values or ranges of values. **
** **
** EX: 'This,is,a,test' = 'This' 'Is' 'A' 'Test' **
*********************************************************/
(
#LIST VARCHAR(8000)
,#DELIMITER VARCHAR(255)
)
RETURNS #TABLE TABLE
(
[RowID] INT IDENTITY
,[Value] VARCHAR(255)
)
WITH SCHEMABINDING
AS
BEGIN
DECLARE
#LISTLENGTH AS SMALLINT
,#LISTCURSOR AS SMALLINT
,#VALUE AS VARCHAR(255)
;
SELECT
#LISTLENGTH = LEN(#LIST) - LEN(REPLACE(#LIST,#DELIMITER,'')) + 1
,#LISTCURSOR = 1
,#VALUE = ''
;
WHILE #LISTCURSOR <= #LISTLENGTH
BEGIN
INSERT INTO #TABLE (Value)
SELECT
CASE
WHEN #LISTCURSOR < #LISTLENGTH
THEN SUBSTRING(#LIST,1,PATINDEX('%' + #DELIMITER + '%',#LIST) - 1)
ELSE SUBSTRING(#LIST,1,LEN(#LIST))
END
;
SET #LIST = STUFF(#LIST,1,PATINDEX('%' + #DELIMITER + '%',#LIST),'')
;
SET #LISTCURSOR = #LISTCURSOR + 1
;
END
;
RETURN
;
END
;
2: Consider dropping the whole "Switching out commas" thing, because it's pointless - the function I've written here takes two arguments: The string itself, and the delimiter (the mini-string that separates the individual strings within the big string, in your case '#') Then you just have to do a couple of quick comparisons to find out what was added and what was deleted.
DECLARE
#AddedElements VARCHAR(255) = ''
,#DeletedElements VARCHAR(255) = ''
,#ActualID VARCHAR(255) = '41#42#43#44#45#46#47#48#49#50#51#52#53#54#55#56#57#58#59#'
,#BeforeID VARCHAR(255) = '41#42#43#44#45#46#47#48#50#51#52#53#54#55#56#57#58#59#60#'
;
SET #AddedElements = #AddedElements +
SUBSTRING(
(
SELECT ', ' + Value
FROM dbo.UDF_StringDelimiter(#ActualID,'#')
WHERE Value NOT IN
(
SELECT Value
FROM dbo.UDF_StringDelimiter(#BeforeID,'#')
)
GROUP BY ', ' + Value
FOR XML PATH('')
)
,3,255)
;
SET #DeletedElements = #DeletedElements +
SUBSTRING(
(
SELECT ', ' + Value
FROM dbo.UDF_StringDelimiter(#BeforeID,'#')
WHERE Value NOT IN
(
SELECT Value
FROM dbo.UDF_StringDelimiter(#ActualID,'#')
)
GROUP BY ', ' + Value
FOR XML PATH('')
)
,3,255)
;
SELECT #AddedElements AS AddedElements,#DeletedElements AS DeletedElements
;
Using this method, if you add a value to #ActualID that does not exist in #BeforeID, it will show up in #AddedElements.
Likewise , if you remove an element from #ActualID that had previously existed in #BeforeID, it will show up in #DeletedElements.
All of this is, of course, assuming that the dynamic string (the one really being compared here) is the #ActualID. I operated with the understanding that #BeforeID is actually a stored value in the DB, and #ActualID is a dynamic string being passed in from...somewhere. If this is wrong, update me and I'll change the tactic appropriately.
Quick note: It's important to me to point out that this is just one way of dealing with a situation like this, and I'm sure there are better ways; but with the information I have, it's the best I could come up with without spending too much time and energy on it.

Extract a number from String in SQL

I have the following string:
"FLEETWOOD DESIGNS 535353110XXXXX" (The X's are actually numbers I just wanted to hide them here)
Does anyone know how can I search through Strings in SQL and extract numbers that are greater then lets say 10 characters long?

This a quite old post but might help anyone else. I was searching for an user defined function in SQL Server to extract only the numbers of a given string, and, surprisingly I could not find exactly what I was looking for.
Let me put here the code of a function to "Extract a number from string in SQL" (valid for SQL Server). This is taken from the fantastic blog of Pinal Dave, I've modified it just to return NULL is a NULL value is passed to the function.
CREATE FUNCTION [dbo].[ExtractInteger](#String VARCHAR(2000))
RETURNS VARCHAR(1000)
AS
BEGIN
DECLARE #Count INT
DECLARE #IntNumbers VARCHAR(1000)
SET #Count = 0
SET #IntNumbers = ''
IF #String IS NULL
RETURN NULL;
WHILE #Count <= LEN(#String)
BEGIN
IF SUBSTRING(#String,#Count,1) >= '0' AND SUBSTRING(#String,#Count,1) <= '9'
BEGIN
SET #IntNumbers = #IntNumbers + SUBSTRING(#String,#Count,1)
END
SET #Count = #Count + 1
END
RETURN #IntNumbers
END
Tests
select '"' + dbo.ExtractInteger('1a2b3c4d5e6f7g8h9i') + '"'
GO
select '"' + dbo.ExtractInteger('abcdefghi') + '"'
GO
select '"' + dbo.ExtractInteger(NULL) + '"'
GO
select '"' + dbo.ExtractInteger('') + '"'
GO
Results
"123456789"
""
NULL
""

You don't mention the DB engine, so we don't know what features are available...
If regexpressions are available then pattern like \d{10,} would match numbers with 10 or more digit.
In mySQL REGEXP can only return true or false (0 or 1) so you'd have to use some ugly hack like
SELECT
LEAST(
INSTR(field,'0'),
INSTR(field,'1'),
INSTR(field,'2'),
INSTR(field,'3'),
INSTR(field,'4'),
INSTR(field,'5'),
INSTR(field,'6'),
INSTR(field,'7'),
INSTR(field,'8'),
INSTR(field,'9')
) AS startPos,
REVERSE(field) AS backward,
LEAST(
INSTR(backward,'0'),
INSTR(backward,'1'),
INSTR(backward,'2'),
INSTR(backward,'3'),
INSTR(backward,'4'),
INSTR(backward,'5'),
INSTR(backward,'6'),
INSTR(backward,'7'),
INSTR(backward,'8'),
INSTR(backward,'9')
) AS endPos,
SUBSTRING(field, startPos, endPos - startPos + 1)
FROM tab
WHERE(field REGEXP '[0-9]{10,}')
but this isn't perfect - it would extract false substring for string like "ABC 9 A 1234567891", not to mention that it is probably so slooooow that it is faster to go througt data by hand.

SUBSTRING('FLEETWOOD DESIGNS 535353110XXXXX', 18, 32)
You could also use LEN() to get the length of the string itself. If you know the serial number length, you can just subtract that from the end index to get your start index of the substring.

It could be done like this
Declare #X varchar(100)
Select #X= 'Here is where15234Numbers'
--
Select #X= SubString(#X,PATINDEX('%[0-9]%',#X),Len(#X))
Select #X= SubString(#X,0,PATINDEX('%[^0-9]%',#X))
--// show result
Select #X

How to count instances of character in SQL Column

I have an sql column that is a string of 100 'Y' or 'N' characters. For example:
YYNYNYYNNNYYNY...
What is the easiest way to get the count of all 'Y' symbols in each row.

This snippet works in the specific situation where you have a boolean: it answers "how many non-Ns are there?".
SELECT LEN(REPLACE(col, 'N', ''))
If, in a different situation, you were actually trying to count the occurrences of a certain character (for example 'Y') in any given string, use this:
SELECT LEN(col) - LEN(REPLACE(col, 'Y', ''))

In SQL Server:
SELECT LEN(REPLACE(myColumn, 'N', ''))
FROM ...

This gave me accurate results every time...
This is in my Stripes field...
Yellow, Yellow, Yellow, Yellow, Yellow, Yellow, Black, Yellow, Yellow, Red, Yellow, Yellow, Yellow, Black
11 Yellows
2 Black
1 Red
SELECT (LEN(Stripes) - LEN(REPLACE(Stripes, 'Red', ''))) / LEN('Red')
FROM t_Contacts

DECLARE #StringToFind VARCHAR(100) = "Text To Count"
SELECT (LEN([Field To Search]) - LEN(REPLACE([Field To Search],#StringToFind,'')))/COALESCE(NULLIF(LEN(#StringToFind), 0), 1) --protect division from zero
FROM [Table To Search]

This will return number of occurance of N
select ColumnName, LEN(ColumnName)- LEN(REPLACE(ColumnName, 'N', ''))
from Table

The easiest way is by using Oracle function:
SELECT REGEXP_COUNT(COLUMN_NAME,'CONDITION') FROM TABLE_NAME

Maybe something like this...
SELECT
LEN(REPLACE(ColumnName, 'N', '')) as NumberOfYs
FROM
SomeTable

Below solution help to find out no of character present from a string with a limitation:
1) using SELECT LEN(REPLACE(myColumn, 'N', '')), but limitation and
wrong output in below condition:
SELECT LEN(REPLACE('YYNYNYYNNNYYNY', 'N', ''));
--8 --Correct
SELECT LEN(REPLACE('123a123a12', 'a', ''));
--8 --Wrong
SELECT LEN(REPLACE('123a123a12', '1', ''));
--7 --Wrong
2) Try with below solution for correct output:
Create a function and also modify as per requirement.
And call function as per below
select dbo.vj_count_char_from_string('123a123a12','2');
--2 --Correct
select dbo.vj_count_char_from_string('123a123a12','a');
--2 --Correct
-- ================================================
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: VIKRAM JAIN
-- Create date: 20 MARCH 2019
-- Description: Count char from string
-- =============================================
create FUNCTION vj_count_char_from_string
(
#string nvarchar(500),
#find_char char(1)
)
RETURNS integer
AS
BEGIN
-- Declare the return variable here
DECLARE #total_char int; DECLARE #position INT;
SET #total_char=0; set #position = 1;
-- Add the T-SQL statements to compute the return value here
if LEN(#string)>0
BEGIN
WHILE #position <= LEN(#string) -1
BEGIN
if SUBSTRING(#string, #position, 1) = #find_char
BEGIN
SET #total_char+= 1;
END
SET #position+= 1;
END
END;
-- Return the result of the function
RETURN #total_char;
END
GO

try this
declare #v varchar(250) = 'test.a,1 ;hheuw-20;'
-- LF ;
select len(replace(#v,';','11'))-len(#v)

If you want to count the number of instances of strings with more than a single character, you can either use the previous solution with regex, or this solution uses STRING_SPLIT, which I believe was introduced in SQL Server 2016. Also you’ll need compatibility level 130 and higher.
ALTER DATABASE [database_name] SET COMPATIBILITY_LEVEL = 130
.
--some data
DECLARE #table TABLE (col varchar(500))
INSERT INTO #table SELECT 'whaCHAR(10)teverCHAR(10)whateverCHAR(10)'
INSERT INTO #table SELECT 'whaCHAR(10)teverwhateverCHAR(10)'
INSERT INTO #table SELECT 'whaCHAR(10)teverCHAR(10)whateverCHAR(10)~'
--string to find
DECLARE #string varchar(100) = 'CHAR(10)'
--select
SELECT
col
, (SELECT COUNT(*) - 1 FROM STRING_SPLIT (REPLACE(REPLACE(col, '~', ''), 'CHAR(10)', '~'), '~')) AS 'NumberOfBreaks'
FROM #table

The second answer provided by nickf is very clever. However, it only works for a character length of the target sub-string of 1 and ignores spaces. Specifically, there were two leading spaces in my data, which SQL helpfully removes (I didn't know this) when all the characters on the right-hand-side are removed. Which meant that
" John Smith"
generated 12 using Nickf's method, whereas:
" Joe Bloggs, John Smith"
generated 10, and
" Joe Bloggs, John Smith, John Smith"
Generated 20.
I've therefore modified the solution slightly to the following, which works for me:
Select (len(replace(Sales_Reps,' ',''))- len(replace((replace(Sales_Reps, ' ','')),'JohnSmith','')))/9 as Count_JS
I'm sure someone can think of a better way of doing it!

You can also Try This
-- DECLARE field because your table type may be text
DECLARE #mmRxClaim nvarchar(MAX)
-- Getting Value from table
SELECT top (1) #mmRxClaim = mRxClaim FROM RxClaim WHERE rxclaimid_PK =362
-- Main String Value
SELECT #mmRxClaim AS MainStringValue
-- Count Multiple Character for this number of space will be number of character
SELECT LEN(#mmRxClaim) - LEN(REPLACE(#mmRxClaim, 'GS', ' ')) AS CountMultipleCharacter
-- Count Single Character for this number of space will be one
SELECT LEN(#mmRxClaim) - LEN(REPLACE(#mmRxClaim, 'G', '')) AS CountSingleCharacter
Output:

If you need to count the char in a string with more then 2 kinds of chars, you can use instead of 'n' - some operator or regex of the chars accept the char you need.
SELECT LEN(REPLACE(col, 'N', ''))

Try this:
SELECT COUNT(DECODE(SUBSTR(UPPER(:main_string),rownum,LENGTH(:search_char)),UPPER(:search_char),1)) search_char_count
FROM DUAL
connect by rownum <= length(:main_string);
It determines the number of single character occurrences as well as the sub-string occurrences in main string.

Here's what I used in Oracle SQL to see if someone was passing a correctly formatted phone number:
WHERE REPLACE(TRANSLATE('555-555-1212','0123456789-','00000000000'),'0','') IS NULL AND
LENGTH(REPLACE(TRANSLATE('555-555-1212','0123456789','0000000000'),'0','')) = 2
The first part checks to see if the phone number has only numbers and the hyphen and the second part checks to see that the phone number has only two hyphens.

for example to calculate the count instances of character (a) in SQL Column ->name is column name
'' ( and in doblequote's is empty i am replace a with nocharecter #'')
select len(name)- len(replace(name,'a','')) from TESTING
select len('YYNYNYYNNNYYNY')- len(replace('YYNYNYYNNNYYNY','y',''))

DECLARE #char NVARCHAR(50);
DECLARE #counter INT = 0;
DECLARE #i INT = 1;
DECLARE #search NVARCHAR(10) = 'Y'
SET #char = N'YYNYNYYNNNYYNY';
WHILE #i <= LEN(#char)
BEGIN
IF SUBSTRING(#char, #i, 1) = #search
SET #counter += 1;
SET #i += 1;
END;
SELECT #counter;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to identify and redact all instances of a matching pattern in T-SQL - sql

Related

SSMS replace all commas outside of quotation marks in string

Extract substring from string if certain characters exists SQL

T-SQL How to create function that compares string, checks difference, and do special function

Extract a number from String in SQL

How to count instances of character in SQL Column

Categories

Resources