SQL replace statement too slow - sql

I have a replace statement that does something like this:
SELECT Distinct Forenames, Surname, dbUSNs.DateOfBirth, Datasetname,
dbUSNs.MoPIGrade, SourceAddress, VRM, URNs
FROM Person
WHERE ( Replace(Replace(Replace(Replace(Replace(Replace(Replace
(Replace(Replace(Replace(Replace(Replace(Replace(Replace
(Replace(Replace(Replace(Replace(Replace(Replace(Replace
(Replace(Replace(Replace(Replace
(Surname,'/',''''),'?',''''),'',''''),'^',''''),'{',''''),'}',''''),
'[',''''),']',''''),';',''''),'$',''''),'=',''''),'*',''''),
'#',''''),'|',''''),'&',''''),'#',''''),'\',''''),'<',''''),
'>',''''),'(',''''),')',''''),'+',''''),',',''''),'.',''''),
' ','''') LIKE 'OREILLY%')
Therefore even though OReilly is passed, O'Reilly will be found. However, this is too slow. Is there a better way of approaching it?

The problem isn't that REPLACE is "too slow", but that wrapping the column in it makes that part of the query non-sargable, meaning that it can't use an index seek on Surname.
Wikipedia: Sargable
Basically you've forced a table scan or index scan, from top to bottom. On top of that you have the overhead of the nested REPLACE calls on every row.
If you want this query to run fast, I would instead do one of the following:
Create an additional column containing a searchable text version of the Surname
Create an indexed, materialized view with those REPLACE functions
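As a minimal sketch of the first option, assuming SQL Server and surnames that fit in 100 characters: a persisted computed column stores the cleaned value, and an index on it lets the prefix LIKE seek instead of scan. The names SurnameSearch and IX_Person_SurnameSearch are hypothetical, and only three characters are stripped here to keep the example short.

```sql
-- Assumption: dbo.Person has a Surname column of a non-MAX character type.
-- SurnameSearch and IX_Person_SurnameSearch are made-up names for illustration.
ALTER TABLE dbo.Person
    ADD SurnameSearch AS
        (CAST(UPPER(REPLACE(REPLACE(REPLACE(Surname, '''', ''), ' ', ''), '-', ''))
              AS varchar(100)))
        PERSISTED;
GO

-- The index is what makes the search cheap; the column alone would still be scanned.
CREATE INDEX IX_Person_SurnameSearch ON dbo.Person (SurnameSearch);
GO

-- The predicate is now a plain prefix match on an indexed column.
SELECT Forenames, Surname
FROM dbo.Person
WHERE SurnameSearch LIKE 'OREILLY%';
```

REPLACE, UPPER and this CAST are deterministic, which is what allows the column to be persisted and indexed; the full list of characters from the question would be added as further nested REPLACE calls.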

If you want to simply remove all special characters it's easier to specify the valid characters and use a function to perform the cleansing.
This shows you how to clean the string to alphanumeric characters and spaces '%[^a-z0-9 ]%'
DECLARE @Temp nvarchar(max) ='O''Rielly la/.das.d,as/.d,a/.da.sdo23eu89038 !£$$'
SELECT @Temp
DECLARE @KeepValues AS VARCHAR(50) = '%[^a-z0-9 ]%'
WHILE PatIndex(@KeepValues, @Temp) > 0
    SET @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
SELECT @Temp
Which would return: ORielly ladasdasdadasdo23eu89038
So you can write a function:
CREATE FUNCTION [dbo].[RemoveNonAlphaCharacters](@Temp VARCHAR(1000))
RETURNS VARCHAR(1000)
AS
BEGIN
    DECLARE @KeepValues AS VARCHAR(50) = '%[^a-z0-9 ]%'
    WHILE PatIndex(@KeepValues, @Temp) > 0
        SET @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
    RETURN @Temp
END
Then simply call it like so:
SELECT *
FROM Person
WHERE [dbo].[RemoveNonAlphaCharacters](Surname) LIKE 'OREILLY%'
If you don't want spaces, just change it to: '%[^a-z0-9]%'

Try this:
Create a function to split:
create function [dbo].[Split](@String varchar(8000), @Delimiter char(1))
returns @temptable TABLE (items varchar(8000))
as
begin
    declare @idx int
    declare @slice varchar(8000)
    select @idx = 1
    if len(@String)<1 or @String is null return
    while @idx != 0
    begin
        set @idx = charindex(@Delimiter,@String)
        if @idx != 0
            set @slice = left(@String,@idx - 1)
        else
            set @slice = @String
        if (len(@slice)>0)
            insert into @temptable(Items) values(@slice)
        set @String = right(@String,len(@String) - @idx)
        if len(@String) = 0 break
    end
    return
end
Use this in your WHERE clause:
WHERE ((REPLACE(Surname, items, '') FROM dbo.Split('/,?,^,{,},[,],;,$,=,*,#,|,&,#,\,<,>,(,),+,.')) LIKE 'OREILLY%')

General approach - yes.
Make another field (NameNormalized)
Run a trigger that sets the field whenever the name is updated.
Then you can run a search on that field (which can have an index).
Basically the whole replace orgy makes the whole thing non-indexable, so the better approach is to store the normalized value and allow fast lookups.
Oh, and evaluate whether the DISTINCT is needed - that is another huge slowdown.
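A rough sketch of that trigger approach, assuming SQL Server, a hypothetical PersonId key column on Person, and normalized surnames that fit in 100 characters. NameNormalized is the field suggested above; the trigger and index names are made up for illustration, and only apostrophes and spaces are stripped to keep it short.

```sql
-- Assumption: dbo.Person has a unique PersonId column (hypothetical).
ALTER TABLE dbo.Person ADD NameNormalized varchar(100) NULL;
GO

CREATE TRIGGER dbo.trg_Person_NormalizeSurname
ON dbo.Person
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Nothing to do unless Surname was part of the statement
    -- (UPDATE(Surname) is also true for inserts).
    IF NOT UPDATE(Surname) RETURN;

    -- Set-based: handles multi-row inserts and updates in one pass.
    UPDATE p
    SET NameNormalized = UPPER(REPLACE(REPLACE(i.Surname, '''', ''), ' ', ''))
    FROM dbo.Person AS p
    JOIN inserted AS i
        ON i.PersonId = p.PersonId;
END;
GO

-- The index on the stored value is what allows the fast lookup.
CREATE INDEX IX_Person_NameNormalized ON dbo.Person (NameNormalized);
GO

SELECT Forenames, Surname
FROM dbo.Person
WHERE NameNormalized LIKE 'OREILLY%';
```

Rows that existed before the trigger was created would still need a one-off backfill of NameNormalized.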

SELECT Distinct Forenames, Surname, dbUSNs.DateOfBirth, Datasetname,
dbUSNs.MoPIGrade, SourceAddress, VRM, URNs
FROM Person
WHERE Surname LIKE 'O[/?^{}[];$=*#|#\<>()+.]R[/?^{}[];$=*#|#\<>()+.]E[/?^{}[];$=*#|#\<>()+.]I[/?^{}[];$=*#|#\<>()+.]L[/?^{}[];$=*#|#\<>()+.]L[/?^{}[];$=*#|#\<>()+.]Y%')

If you want to remove all special characters using just SUBSTRING and WHILE:
DECLARE @str VARCHAR(100), @Len INT, @Pos INT = 1, @char char(1), @results varchar(100)
SET @str = 'O''Rielly la/.das.d,as/.d,a/.da.sdo23eu89038 !£$$'
SET @Len = LEN(@str)
SET @results = ''
-- <= rather than < so the last character is examined too
WHILE @Pos <= @Len
BEGIN
    SET @char = SUBSTRING(@str,@Pos,1)
    IF @char like '[a-z0-9]' or @char = ' '
    BEGIN
        SET @results = @results + @char
    END
    SET @Pos = @Pos + 1
END
select @results

Related

I need help for a specific sql task

I have been given the following task: I have to write a stored procedure with two parameters, @Court int and @ReportId NVARCHAR(400). I have to split the @ReportId parameter by space, convert each piece into int, and use both that piece and the @Court parameter to perform an insert like this:
insert into RPT_Report2court (Reportid, courtnumber)
values (@ReportId, @Court)
for each piece (converted to int) of the @ReportId parameter.
So far I have done the following:
SELECT CAST(value AS int)
FROM STRING_SPLIT(@ReportId, ' ')
but I really don't know how to iterate over these int values and use them for the insert statements. (If the language were C# and not SQL, I would put these int values in a list of ints and simply iterate over that list with foreach, but I don't know how to do that in SQL.)
I think this is what you want:
insert into RPT_Report2court (Reportid, courtnumber)
select cast(value AS int), @Court
from string_split(@ReportId, ' ');
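Wrapped into the stored procedure the question asks for, that could look roughly like the sketch below. It assumes SQL Server 2016 or later (STRING_SPLIT needs database compatibility level 130 or higher), and the procedure name AddReportsToCourt is hypothetical.

```sql
-- Hypothetical procedure name; table and column names are from the question.
CREATE PROCEDURE dbo.AddReportsToCourt
    @Court    int,
    @ReportId nvarchar(400)
AS
BEGIN
    SET NOCOUNT ON;

    -- One INSERT ... SELECT replaces the loop: each split piece becomes a row.
    INSERT INTO RPT_Report2court (Reportid, courtnumber)
    SELECT CAST(value AS int), @Court
    FROM STRING_SPLIT(@ReportId, ' ')
    WHERE value <> '';   -- consecutive spaces yield empty pieces, which would cast to 0
END;
GO

-- Example call: inserts three rows for court 5.
EXEC dbo.AddReportsToCourt @Court = 5, @ReportId = N'101 102 103';
```

A piece that is not numeric would still make the CAST fail; TRY_CAST could be used instead if such input should be skipped rather than raise an error.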
Create a user-defined Split function:
CREATE FUNCTION [dbo].[Split](@String varchar(8000), @Delimiter char(1))
returns @temptable TABLE (SplitValue varchar(8000))
as
begin
    declare @idx int
    declare @slice varchar(8000)
    select @idx = 1
    if len(@String)<1 or @String is null return
    while @idx != 0
    begin
        set @idx = charindex(@Delimiter,@String)
        if @idx != 0
            set @slice = left(@String,@idx - 1)
        else
            set @slice = @String
        if (len(@slice)>0)
            insert into @temptable(SplitValue) values(@slice)
        set @String = right(@String,len(@String) - @idx)
        if len(@String) = 0 break
    end
    return
end
insert into RPT_Report2court (Reportid, courtnumber)
select cast(SplitValue as int), @Court FROM dbo.Split(@ReportId,' ')
Let me know if you have any query .
Thanks .

SSRS Report: get parameter data value and store it into variable in dataset

I have a multi-valued parameter in my report named @Animal which has ('Cat', 'Dog', 'Mouse').
Inside the dataset I need to get 'Cat', 'Dog', 'Mouse' and store them in the @AnimalName table variable.
The "hard-coded" way would be:
DECLARE @AnimalName TABLE (Name nvarchar (10))
INSERT INTO @AnimalName SELECT ('Cat');
INSERT INTO @AnimalName SELECT ('Dog');
INSERT INTO @AnimalName SELECT ('Mouse');
I know that I can use @Animal directly inside my dataset; the reason I'm doing this is that I'm trying to improve my report's performance. Many multi-valued parameters make the report run forever.
Does anyone know the syntax to get the @Animal data values and store them in a table variable @AnimalName inside the dataset?
Thanks heaps!
Pass the comma-delimited string into your stored procedure, and in the stored procedure use a table-valued function to convert your multi-valued parameter into a table.
CREATE PROC GetAllAnimals
@AnimalList nvarchar(max)
AS
DECLARE @Animals TABLE (Animal nvarchar(10))
INSERT INTO @Animals SELECT * FROM dbo.fnGetValueListFromMultiSelect(@AnimalList)
and then use the @Animals table in an inner join in your query.
Functions declared below.
For Integer (or ID) values
CREATE FUNCTION [dbo].[fnGetIdListFromMultiSelect](@String nvarchar(MAX))
RETURNS @Results TABLE ([Id] int)
AS
BEGIN
    DECLARE @Delimiter CHAR(1)
    DECLARE @INDEX INT
    DECLARE @SLICE nvarchar(4000)
    IF @String IS NULL RETURN
    SET @Delimiter = ','
    SET @INDEX = 1
    WHILE @INDEX != 0
    BEGIN
        -- GET THE INDEX OF THE FIRST OCCURRENCE OF THE SPLIT CHARACTER
        SELECT @INDEX = CHARINDEX(@Delimiter,@STRING)
        -- NOW PUSH EVERYTHING TO THE LEFT OF IT INTO THE SLICE VARIABLE
        IF @INDEX != 0
        BEGIN
            SELECT @SLICE = LEFT(@STRING,@INDEX - 1)
            -- CHOP THE ITEM REMOVED OFF THE MAIN STRING
            SELECT @STRING = RIGHT(@STRING,LEN(@STRING) - @INDEX)
        END
        ELSE
            SELECT @SLICE = @STRING
        -- PUT THE ITEM INTO THE RESULTS SET
        INSERT INTO @Results([Id]) VALUES(CAST(@SLICE AS INT))
        -- BREAK OUT IF WE ARE DONE
        IF LEN(@STRING) = 0 BREAK
    END
    RETURN
END
For string values
CREATE FUNCTION [dbo].[fnGetValueListFromMultiSelect](@String nvarchar(MAX))
RETURNS @Results TABLE ([Item] nvarchar(128) Primary Key)
AS
BEGIN
    DECLARE @Delimiter CHAR(1)
    DECLARE @INDEX INT
    DECLARE @SLICE nvarchar(4000)
    SET @Delimiter = ','
    SET @INDEX = 1
    WHILE @INDEX != 0
    BEGIN
        -- GET THE INDEX OF THE FIRST OCCURRENCE OF THE SPLIT CHARACTER
        SELECT @INDEX = CHARINDEX(@Delimiter,@STRING)
        -- NOW PUSH EVERYTHING TO THE LEFT OF IT INTO THE SLICE VARIABLE
        IF @INDEX != 0
        BEGIN
            SELECT @SLICE = LEFT(@STRING,@INDEX - 1)
            -- CHOP THE ITEM REMOVED OFF THE MAIN STRING
            SELECT @STRING = RIGHT(@STRING,LEN(@STRING) - @INDEX)
        END
        ELSE
            SELECT @SLICE = @STRING
        -- PUT THE ITEM INTO THE RESULTS SET
        INSERT INTO @Results([Item]) VALUES(@SLICE)
        -- BREAK OUT IF WE ARE DONE
        IF LEN(@STRING) = 0 BREAK
    END
    RETURN
END

SQL like for comma separated input

I want to write a stored procedure that runs a query for multiple inputs, which arrive as a comma-separated string. Just as we have IN for exact matches, can I have something like IN for LIKE too?
Input:
51094,51096,512584
Attempting to do:
select * from table where column like ('%51094%','%51096%','%512584%')
My query should iterate through each input and get the column which matches the pattern.
I have already tried following:
Contains(Column, '"*51094*" or "*51096*" or "*512584*"')
But can't configure freetext search now.
Source: Is there a combination of "LIKE" and "IN" in SQL?
All the proposed types in: How to use SQL LIKE condition with multiple values in PostgreSQL?
None seems to be working.
Please suggest a simple way.
First explode your input, then OR the LIKE conditions together:
$arr = explode(",", $Input);
column like "%".$arr[0]."%" OR
column like "%".$arr[1]."%" OR
column like "%".$arr[2]."%"
You can use this function; the delimiter does not have to be a comma, any single character works.
ALTER function [dbo].[SplitString] (@String nvarchar(4000), @Delimiter char(1))
Returns @Results Table (Items nvarchar(50))
As
Begin
    Declare @Index int
    Declare @name nvarchar(20)
    Declare @Slice nvarchar(50)
    Select @Index = 1
    If @String Is NULL Return
    While @Index != 0
    Begin
        Select @Index = CharIndex(@Delimiter, @String)
        If @Index <> 0
            Select @Slice = left(@String, @Index - 1)
        else
            Select @Slice = @String
        Insert into @Results(Items) Values (@Slice)
        Select @String = right(@String, Len(@String) - @Index)
        If Len(@String) = 0 break
    End
    Return
End
Looped the items and got it done.
Select * into #temp_inputIds from dbo.Split(@InputIds,',')
DECLARE @ID varchar (50)
DECLARE IDs CURSOR LOCAL FOR select items from #temp_inputIds
OPEN IDs
FETCH NEXT FROM IDs into @ID
WHILE @@FETCH_STATUS = 0
BEGIN
    Select @SQL = 'Select component_id,'+@ID+' as pub_id from component_presentations where CONTENT like ''%' + @ID + '%'''
    FETCH NEXT FROM IDs into @ID
END
CLOSE IDs
DEALLOCATE IDs
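The same result can probably be had without a cursor or dynamic SQL by joining the split values to the table on a LIKE condition. A sketch, assuming SQL Server 2016 or later for the built-in STRING_SPLIT (on older versions a split function like the one above could stand in), with the table and column names taken from the snippet above:

```sql
-- Sample input from the question; in the procedure this would be the parameter.
DECLARE @InputIds varchar(4000) = '51094,51096,512584';

-- Each split value is matched against CONTENT, which gives
-- "LIKE any of these patterns" as a single set-based query.
SELECT cp.component_id,
       ids.value AS pub_id
FROM STRING_SPLIT(@InputIds, ',') AS ids
JOIN component_presentations AS cp
    ON cp.CONTENT LIKE '%' + ids.value + '%';
```

Because each pattern starts with a wildcard, this still scans CONTENT once per input value; it removes the loop, not the scan.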

How to use IN Operator in SQL Server

Here is the table structure:
Create Table Sample(Id INT,Name Varchar(50))
When I run the query like this, I get the values:
Select * FROM Sample WHERE Id IN ('74','77','79','80')
But when I execute the query below, I get an error instead of the records from that table:
DECLARE @s VARCHAR(MAX)
SET @s='74','77','79','80'
Select * FROM Sample WHERE Id IN (@s)
You are doing it the wrong way. Use the following instead:
DECLARE @s VARCHAR(MAX)
DECLARE @d VARCHAR(MAX)
SET @s='74 , 77 , 79 , 80'
set @d = 'select * from arinvoice where arinvoiceid in('+@s+')'
exec (@d)
Here the IN operator is given a collection of integers, not a single string.
You should use a function that returns a result set (it takes a CSV string and returns a table):
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[Splitt] (@String NVARCHAR(4000),
                               @Delimiter CHAR(1))
RETURNS @Results TABLE (
    Items NVARCHAR(4000))
AS
BEGIN
    DECLARE @Index INT
    DECLARE @Slice NVARCHAR(4000)
    SELECT @Index = 1
    IF @String IS NULL
        RETURN
    WHILE @Index != 0
    BEGIN
        SELECT @Index = Charindex(@Delimiter, @String)
        IF @Index <> 0
            SELECT @Slice = LEFT(@String, @Index - 1)
        ELSE
            SELECT @Slice = @String
        IF ( NOT EXISTS (SELECT *
                         FROM @Results
                         WHERE items = @Slice) )
            INSERT INTO @Results
                        (Items)
            VALUES (@Slice)
        SELECT @String = RIGHT(@String, Len(@String) - @Index)
        IF Len(@String) = 0
            BREAK
    END
    RETURN
END
and now you can write :
DECLARE @s VARCHAR(MAX)
SET @s='74,77,79,80'
Select * FROM Sample WHERE Id IN (select items from dbo.Splitt(@s,','))
If you are using ADO.NET, you can avoid the magic string, just use SqlDataRecord.
Or if you are using SQL Server 2008, you can also avoid the magic string by using Table-Valued Parameter
Source: http://www.sommarskog.se/arrays-in-sql-2008.html
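For the table-valued parameter route, a minimal sketch might look like this, assuming SQL Server 2008 or later and the Sample table from the question. The type name IdList and the procedure name GetSamplesByIds are hypothetical.

```sql
-- A user-defined table type carries the list of ids (hypothetical name).
CREATE TYPE dbo.IdList AS TABLE (Id int NOT NULL PRIMARY KEY);
GO

-- Table-valued parameters must be declared READONLY.
CREATE PROCEDURE dbo.GetSamplesByIds
    @Ids dbo.IdList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    SELECT s.Id, s.Name
    FROM Sample AS s
    JOIN @Ids AS i
        ON i.Id = s.Id;
END;
GO

-- Calling it from T-SQL: fill a variable of the table type and pass it in.
DECLARE @Ids dbo.IdList;
INSERT INTO @Ids (Id) VALUES (74), (77), (79), (80);
EXEC dbo.GetSamplesByIds @Ids = @Ids;
```

The ids stay typed as int end to end, so there is no string to parse and no dynamic SQL to build.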

How to split a string in T-SQL?

I have a varchar @a='a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p', which holds |-delimited values. I want to split this variable into an array or a table.
How can I do this?
Use a table valued function like this,
CREATE FUNCTION Splitfn(@String varchar(8000), @Delimiter char(1))
returns @temptable TABLE (items varchar(8000))
as
begin
    declare @idx int
    declare @slice varchar(8000)
    select @idx = 1
    if len(@String)<1 or @String is null return
    while @idx != 0
    begin
        set @idx = charindex(@Delimiter,@String)
        if @idx != 0
            set @slice = left(@String,@idx - 1)
        else
            set @slice = @String
        if (len(@slice)>0)
            insert into @temptable(Items) values(@slice)
        set @String = right(@String,len(@String) - @idx)
        if len(@String) = 0 break
    end
    return
end
and get your variable and use this function like this,
SELECT i.items FROM dbo.Splitfn(@a,'|') AS i
In general, this is such a common question here
I'll give the common answer: Arrays and Lists in SQL Server 2005 and Beyond by Erland Sommarskog
I'd recommend a table of numbers, not a loop, for general use.
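As an illustration of the numbers-table idea, here is a sketch that builds the numbers on the fly from sys.all_objects rather than from a permanent table (a real setup would more likely keep a dedicated, indexed Numbers table). The variable @d and the CTE name Numbers are assumptions made for the example.

```sql
DECLARE @a varchar(8000) = 'a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p';
DECLARE @d char(1) = '|';

-- One row per character position 1 .. LEN(@a) + 1.
WITH Numbers AS (
    SELECT TOP (LEN(@a) + 1)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS o1
    CROSS JOIN sys.all_objects AS o2
)
SELECT SUBSTRING(@a, n, CHARINDEX(@d, @a + @d, n) - n) AS item
FROM Numbers
-- An item starts at n when the character just before it is a delimiter;
-- prefixing @a with a delimiter makes position 1 qualify as well.
WHERE SUBSTRING(@d + @a, n, 1) = @d
ORDER BY n;
```

Every item is located by position in a single set-based query, so there is no loop and no repeated chopping of the string.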
Try this one:
declare @a varchar(10)
set @a = 'a|b|c|'
while len(@a) > 1
begin
    insert into #temp
    select substring(@a,1,patindex('%|%',@a)-1);
    set @a = substring(@a,patindex('%|%',@a)+1,len(@a))
end;
Here's an alternative XML based solution. It seems to have a similar performance as the Splitfn() solution.
This converts varchar a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p into XML <a>a</a><a>b</a><a>c</a><a>d</a><a>e</a><a>f</a><a>g</a><a>h</a><a>i</a><a>j</a><a>k</a><a>l</a><a>m</a><a>n</a><a>o</a><a>p</a> and extracts the value from each XML <a> node.
declare @a varchar(max);
set @a = 'a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p';
declare @xml xml;
-- escape & and < before wrapping each item in <a>...</a>
set @xml
= '<a>'+replace(replace(replace(@a,'&','&amp;'),'<','&lt;'),'|','</a><a>')+'</a>';
SELECT x.n.value('.','VARCHAR(1)') AS singleValue
FROM @xml.nodes('/a') AS x(n)
;