Stripping out the text string before and after # symbol - sql

I'm stuck here and would greatly appreciate any help! Given this string:
R:£30 AT:63 RT:0 D .ADD £400 #63 WK
SQL task:
1 - retrieve 400 (find the # symbol and take characters going left until the £ symbol is reached)
2 - retrieve 63 (find the # symbol and take characters going right until a " " or "W" is found)

Just use CHARINDEX and SUBSTRING. The example below assumes that there can be a £ after the # as well. Basically, I split the string at #. For the second part, I go from # to ' '. For the first part, I reverse it, find the £, then reverse it back.
declare @col varchar(500)
set @col = 'R:£30 AT:63 RT:0 D .ADD £400 #63 WK'
declare @p1 varchar(500), @p2 varchar(500) -- split col into 2 at #
set @p1 = reverse(substring(@col, 1, CHARINDEX('#', @col) - 1)) -- the first part is reversed here
set @p2 = substring(@col, CHARINDEX('#', @col) + 1, LEN(@col))
select @p1 p1, @p2 p2
,ltrim(rtrim(reverse(substring(@p1, 1, CHARINDEX('£', @p1) - 1)))) p1Final -- take everything up to the £ in the reversed text, then reverse it back
-- the left and right trims get rid of the extra spaces
,ltrim(rtrim(substring(@p2, 1, CHARINDEX(' ', @p2)))) p2Final -- this one should be self-explanatory if you get the first one :)

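You can also use the PATINDEX function, which can look for a pattern.
declare @col varchar(50)
set @col = 'R:£30 AT:63 RT:0 D .ADD £400 #63 WK'
--400
select substring(@col, patindex('% £% #%', @col) + 2, charindex('#', @col) - (patindex('% £% #%', @col) + 3))
--63
-- take from just after # up to the next space; the appended ' ' covers a value at the very end of the string
select substring(@col, charindex('#', @col) + 1, charindex(' ', @col + ' ', charindex('#', @col)) - charindex('#', @col) - 1)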

Not sure about the efficiency, but this should get you started.
For 1: I reversed the string, manipulated it, and then reversed the result back.
declare @s varchar(500) = 'R:£30 AT:63 RT:0 D .ADD £23 £400 #63 WK'
declare @sRev varchar(500) = REVERSE(@s)
declare @stemp varchar(500)
declare @ampIndRev int, @AfterAmpIndRev int
set @ampIndRev = CHARINDEX('#', @sRev, 1) -- position of # in the reversed string
set @AfterAmpIndRev = CHARINDEX('£', @sRev, @ampIndRev) -- first £ after it, i.e. the last £ before # in the original
set @stemp = SUBSTRING(@sRev, @ampIndRev + 1, @AfterAmpIndRev - @ampIndRev - 1)
set @stemp = REVERSE(LTRIM(@stemp))
select @stemp
For 2 (I assumed that you need to look for W only if there is no space):
declare @s varchar(500) = 'R:£30 AT:63 RT:0 D .ADD £400 #63 WK'
declare @ampInd int, @AfterAmpInd int
set @ampInd = CHARINDEX('#', @s, 1)
set @AfterAmpInd = CHARINDEX(' ', @s, @ampInd)
if @AfterAmpInd = 0
set @AfterAmpInd = CHARINDEX('W', @s, @ampInd)
select SUBSTRING(@s, @ampInd + 1, @AfterAmpInd - @ampInd - 1)
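The two steps can also be combined into one set-based query that returns NULL instead of raising an error when a marker is missing. This is only a sketch under assumptions: the table variable @samples and the output column names are made up for illustration, and NULLIF is used so that a missing # or £ propagates as NULL through LEFT and SUBSTRING.

```sql
-- Hypothetical sample data; @samples and its column name are illustrative only.
DECLARE @samples TABLE (col varchar(500));
INSERT @samples (col)
VALUES ('R:£30 AT:63 RT:0 D .ADD £400 #63 WK'),
       ('no markers in this row');

SELECT s.col,
       -- text between the last £ before # and the # itself
       amount = LTRIM(RTRIM(REVERSE(LEFT(b.rev, NULLIF(CHARINDEX('£', b.rev), 0) - 1)))),
       -- text after # up to the first space or W (a trailing space is appended as a sentinel)
       code   = LEFT(a.rest, PATINDEX('%[ W]%', a.rest + ' ') - 1)
FROM @samples AS s
CROSS APPLY (SELECT NULLIF(CHARINDEX('#', s.col), 0) AS pos) AS h       -- NULL when there is no #
CROSS APPLY (SELECT REVERSE(LEFT(s.col, h.pos - 1)) AS rev) AS b        -- part before #, reversed
CROSS APPLY (SELECT SUBSTRING(s.col, h.pos + 1, LEN(s.col)) AS rest) AS a; -- part after #
```

For the first row this yields 400 and 63; the second row yields NULL in both columns because it has no #. Whether W also matches a lowercase w depends on the collation in use.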

Related

sql server concating or replacing, which one is better (faster)

I have to generate a very long procedure every time for a reporting system, so I created a template for my procedure and replace the parts that need to change, but I could also do it with CONCAT or +.
For example:
set @query = '... and (
--#InnerQueries
)'
set @query = replace(@query, '--#InnerQueries', @otherValues)
vs
set @query += ' and exists (...)'
if (@xxx is not null)
set @query += 'and not exists (...)'
With the REPLACE approach the code is more readable and maintainable for me, but for the sake of optimization, what about CONCAT and appending the strings together?
With REPLACE: a lot of searching, but less string creation.
With CONCAT: lots of string creation, but no searching.
So, any ideas?
I assume you're talking about using CONCAT or REPLACE to build an SQL statement and then run it. If you'll ultimately process fewer than 100 replacements, I'd go with REPLACE rather than CONCAT because it's more readable.
If, however, you're talking about using CONCAT/REPLACE to create report output data, and you will e.g. be carrying out 100 REPLACE operations per row on a million rows, I'd go the CONCAT route.
update 2:
There could be something missing here:
if I fill the first variable, @sourceText_Replace, with 8000 characters and keep appending to it:
set @sourceText_Replace += '8000 character length'
set @sourceText_Replace += @sourceText_Replace
set @sourceText_Replace += @sourceText_Replace
set @sourceText_Replace += @sourceText_Replace
set @sourceText_Replace += @sourceText_Replace
set @sourceText_Replace += @sourceText_Replace
set @sourceText_Replace += @sourceText_Replace
it works fine, even up to a length of 16384017 characters,
so your guess here is as good as mine.
original answer:
To summarize (and if I didn't make any mistakes):
if you are searching in a long text, don't even think about using REPLACE; it took seconds rather than milliseconds. For CONCAT the text length obviously makes no difference.
In the code below, in the first try (small text) I just used the variables' default values and did not append to them,
but in the second try (long text) I appended the result of the previous loop run.
For the long text I did not bother to run the loop more than 20 times, because it took minutes.
smallText: set @text_Replace =
longText: set @text_Replace +=
Here is the test code:
SET NOCOUNT ON
drop table if exists #tempReplace
drop table if exists #tempConcat
create table #tempReplace
(
[txt] nvarchar(max) not null
)
create table #tempConcat
(
[txt] nvarchar(max) not null
)
declare @sourceText_Replace nvarchar(max) = 'small1 text to replace #textToBeReplaced after param text'
declare @text_Replace nvarchar(max) = @sourceText_Replace
declare @textToSearch nvarchar(max) = '#textToBeReplaced'
declare @textToReplace nvarchar(max) = 'textToBeReplaced'
declare @concat_Start nvarchar(max) = 'small1 text to replace'
declare @concat_End nvarchar(max) = 'after param text'
declare @text_Concat nvarchar(max) = @concat_Start
declare @whileCounter int = 0
declare @maxCounter int = 5
declare @startTime datetime = getdate();
declare @endTime datetime = getdate();
-- REPLACE test
begin
set @startTime = getdate();
while (@whileCounter <= @maxCounter)
begin
--long text
set @text_Replace += replace(@sourceText_Replace, @textToSearch, @textToReplace + convert(nvarchar(10), @whileCounter)) + @textToSearch
--small text
--set @text_Replace = replace(@sourceText_Replace, @textToSearch, @textToReplace + convert(nvarchar(10), @whileCounter)) + @textToSearch
--print @text_Replace
insert into #tempReplace values (@text_Replace)
set @whileCounter += 1
end
set @endTime = getdate();
-- DATEDIFF gives the elapsed time; subtracting DATEPART values would only compare the millisecond components
print 'passedTime ' + convert(nvarchar(20), DATEDIFF(millisecond, @startTime, @endTime))
end
-- CONCAT test
begin
set @whileCounter = 0;
set @startTime = getdate();
while (@whileCounter <= @maxCounter)
begin
set @text_Concat += concat(@concat_Start, @textToReplace + convert(nvarchar(10), @whileCounter), @concat_End) + @textToSearch
--print @text_Concat
insert into #tempConcat values (@text_Concat)
set @whileCounter += 1
end
set @endTime = getdate();
print 'passedTime ' + convert(nvarchar(20), DATEDIFF(millisecond, @startTime, @endTime))
end

Error Handling for numbers of delimiters when extracting substrings

Situation: I have a column where each cell can have up to 5 delimiters. However, it's possible that there are none.
Objective: How do I handle errors such as:
Invalid length parameter passed to the LEFT or SUBSTRING function.
in the case that it cannot find the specified delimiter.
Query:
declare @text VARCHAR(111) = 'abc-def-geeee-ifjf-zzz'
declare @start1 as int
declare @start2 as int
declare @start3 as int
declare @start4 as int
declare @start_index_reverse as int
set @start1 = CHARINDEX('-',@text,1)
set @start2 = CHARINDEX('-',@text,charindex('-',@text,1)+1)
set @start3 = CHARINDEX('-',@text,charindex('-',@text,CHARINDEX('-',@text,1)+1)+1)
set @start4 = CHARINDEX('-',@text,charindex('-',@text,CHARINDEX('-',@text,CHARINDEX('-',@text,1)+1)+1)+1)
set @start_index_reverse = CHARINDEX('-',REVERSE(@text),1)
select
LEFT(@text,@start1-1) AS Frst,
SUBSTRING(@text,@start1+1,@start2-@start1-1) AS Scnd,
SUBSTRING(@text,@start2+1,@start3-@start2-1) AS Third,
SUBSTRING(@text,@start3+1,@start4-@start3-1) AS Frth,
RIGHT(@text,@start_index_reverse-1) AS Lst
In this case my variable includes all of the delimiters, so my query works, but if I removed one '-' it would break.
XML support in SQL Server brings about some unintentional but useful tricks. Converting this string to XML allows for some parsing that is far less messy than native string handling, which is very far from awesome.
DECLARE @test varchar(111) = 'abc-def-ghi-jkl-mnop'; -- try also with 'abc-def'
;WITH n(x) AS
(
SELECT CONVERT(xml, '<x>' + REPLACE(@test, '-', '</x><x>') + '</x>')
)
SELECT
Frst = x.value('/x[1]','varchar(111)'),
Scnd = x.value('/x[2]','varchar(111)'),
Thrd = x.value('/x[3]','varchar(111)'),
Frth = x.value('/x[4]','varchar(111)'),
Ffth = x.value('/x[5]','varchar(111)')
FROM n;
For a table it's almost identical:
DECLARE @foo TABLE ( col varchar(111) );
INSERT @foo(col) VALUES('abc-def-ghi-jkl-mnop'),('abc'),('def-ghi');
;WITH n(x) AS
(
SELECT CONVERT(xml, '<x>' + REPLACE(col, '-', '</x><x>') + '</x>')
FROM @foo
)
SELECT
Frst = x.value('/x[1]','varchar(111)'),
Scnd = x.value('/x[2]','varchar(111)'),
Thrd = x.value('/x[3]','varchar(111)'),
Frth = x.value('/x[4]','varchar(111)'),
Ffth = x.value('/x[5]','varchar(111)')
FROM n;
Add a test before your last SELECT,
then decide how to handle the other case (when one of the start positions is 0).
You can also refer to this link about splitting a string in SQL Server,
which uses a loop and can handle any number of delimiters.
if @start1>0 and @start2>0 and @start3>0 and @start4>0
select LEFT(@text,@start1-1) AS Frst,
SUBSTRING(@text,@start1+1,@start2-@start1-1) AS Scnd,
SUBSTRING(@text,@start2+1,@start3-@start2-1) AS Third,
SUBSTRING(@text,@start3+1,@start4-@start3-1) AS Frth,
RIGHT(@text,@start_index_reverse-1) AS Lst
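Another way to avoid the "Invalid length parameter" error, without XML, is to append sentinel delimiters so that every CHARINDEX call is guaranteed to find one. The following is a minimal sketch assuming at most five parts; the names @padded and @p1 to @p5 are made up for illustration. Missing parts come back as NULL (as do parts that are genuinely empty or blank).

```sql
DECLARE @text varchar(111) = 'abc-def';          -- try also with 'abc-def-geeee-ifjf-zzz'
DECLARE @padded varchar(120) = @text + '-----';  -- five sentinel delimiters

-- Each search starts just after the previous delimiter and always succeeds.
DECLARE @p1 int = CHARINDEX('-', @padded);
DECLARE @p2 int = CHARINDEX('-', @padded, @p1 + 1);
DECLARE @p3 int = CHARINDEX('-', @padded, @p2 + 1);
DECLARE @p4 int = CHARINDEX('-', @padded, @p3 + 1);
DECLARE @p5 int = CHARINDEX('-', @padded, @p4 + 1);

-- NULLIF turns the empty strings produced by the sentinels into NULL.
SELECT Frst = NULLIF(LEFT(@padded, @p1 - 1), ''),
       Scnd = NULLIF(SUBSTRING(@padded, @p1 + 1, @p2 - @p1 - 1), ''),
       Thrd = NULLIF(SUBSTRING(@padded, @p2 + 1, @p3 - @p2 - 1), ''),
       Frth = NULLIF(SUBSTRING(@padded, @p3 + 1, @p4 - @p3 - 1), ''),
       Ffth = NULLIF(SUBSTRING(@padded, @p4 + 1, @p5 - @p4 - 1), '');
```

With 'abc-def' this returns abc, def and three NULLs; every length passed to SUBSTRING is zero or positive, so the error cannot occur.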

update column to remove html tags

I am upgrading my database from one version of an application to another. The later version does not store HTML tags, but the previous one does.
I have a SQL function that removes HTML tags from a single string (from "Best way to strip html tags from a string in sql server?").
However, I need to update all rows of one column. Can anyone please suggest a script that updates all rows, removing the HTML tags?
UDF stands for "user-defined function". Unless you have defined a function with the name "udf_StripHTML", this simply won't work. I think you are referring to this function:
CREATE FUNCTION [dbo].[udf_StripHTML]
(@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT
-- locate the first tag: from '<' to the next '>'
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
WHILE @Start > 0
AND @End > 0
AND @Length > 0
BEGIN
-- cut the tag out, then look for the next one
SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
END
RETURN LTRIM(RTRIM(@HTMLText))
END
GO
To test this function, run:
SELECT dbo.udf_StripHTML('<b>UDF at stackoverflow.com </b><br><br>Stackoverflow.com')
Result Set:
UDF at stackoverflow.com Stackoverflow.com
This function was written by Pinal Dave.
Hope this helps.
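To apply the function to every row of a column, as the question asks, a single UPDATE should be enough. The sketch below assumes dbo.udf_StripHTML has already been created as shown above; the temp table #Comments and the column CommentText are placeholder names standing in for your own table and column.

```sql
-- Placeholder table and data for illustration; use your real table instead.
DROP TABLE IF EXISTS #Comments;
CREATE TABLE #Comments (CommentText varchar(max) NULL);
INSERT #Comments (CommentText)
VALUES ('<b>UDF at stackoverflow.com </b><br><br>Stackoverflow.com'),
       ('plain text without tags');

-- Strip the tags in place; the WHERE clause skips rows that do not appear to contain a tag.
UPDATE #Comments
SET    CommentText = dbo.udf_StripHTML(CommentText)
WHERE  CommentText LIKE '%<%>%';

SELECT CommentText FROM #Comments;
```

On a large table it is probably worth running this inside a transaction or on a backup copy first, since a scalar function is evaluated once per row and the change is not reversible.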

How to separate date from a string?

Hi, I have a string like:
"on 01-15-09 witha factor of 0.8"
I want to separate this string in the following way:
1] Date as 01-15-09
2] Factor of 0.8
NOTE: The string length is not fixed.
So how can we separate the data into the form of #1 and #2?
To get the date you can use PATINDEX().
declare @yourString varchar(100)
set @yourString = 'on 01-15-09 with a factor of 0.8'
select substring(@yourString,
patindex('%[0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', @yourString),
8)
To get "factor of xx" you can do:
select substring(@yourString,
patindex('%with a%', @yourString) + 7,
20)
declare @txt varchar(max)
set @txt = 'on 01-15-09 witha factor of 0.8'
select cast(substring(@txt, patindex('% [0-9][1-9]-%', @txt), 9) as date) [date],
cast(right(@txt, patindex('%_ %', reverse(@txt))) as decimal(9,1)) Factor
Result:
date Factor
---------- ------
2009-01-15 0.8
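Casting the extracted text straight to date depends on the session's language and DATEFORMAT settings. A hedged variation, assuming SQL Server 2012 or later (for TRY_CONVERT) and that the text is always in mm-dd-yy order, is to pass an explicit style so the conversion does not depend on those settings and returns NULL rather than an error when the text is not a valid date. The variable name @pos is made up for illustration.

```sql
DECLARE @txt varchar(100) = 'on 01-15-09 with a factor of 0.8';

-- Position of the first mm-dd-yy shaped token, or 0 if there is none.
DECLARE @pos int = PATINDEX('%[0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', @txt);

SELECT [date] = CASE WHEN @pos > 0
                     THEN TRY_CONVERT(date, SUBSTRING(@txt, @pos, 8), 10) -- style 10 = mm-dd-yy
                END;
```

The two-digit year is interpreted according to the server's two digit year cutoff setting.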

How to remove links from text with SQL

I need to clean up a database by removing links from tables. So for a column entry like this:
Thank you for the important information<br /><a href="http://www.cnn.com">Read More Here</a><br /> This is great.
I need to remove the entire link, so it would end up like this:
Thank you for the important information<br /><br /> This is great.
Is there a way to do this with a single UPDATE statement?
For extra credit, is there a way to remove the HTML semantics from the link, while leaving the content in the text?
Just find the start and end of the anchor tag and replace it with a single space.
declare @StringToFix varchar(500)
set @StringToFix = 'Thank you for the important information<br /><a href="http://www.cnn.com">Read More Here</a><br /> This is great.'
select REPLACE(
@StringToFix
, Substring(@StringToFix
, CHARINDEX('<a href=', @StringToFix) -- Starting Point
-- Length: End Point - Starting Point, plus 4 characters to cover '</a>'
, CHARINDEX('</a>', @StringToFix)
- CHARINDEX('<a href=', @StringToFix) + 4 )
, ' '
) as ResultField
If all the links are written in a very consistent way, then you can just use a regex replace of
'\<a href.*?\</a\>'
with an empty string.
I don't have a SQL Server instance handy, but the query in Oracle would look something like:
update table
set col1 = REGEXP_REPLACE(col1,'\<a href.*?\</a\>', '', 1, 0, 'in');
I want to share my SQL script that removes the a href tags from text but leaves the anchor text.
Source text:
Visit <a href="...">Google</a>, then <a href="...">Bing</a>
Result text:
Visit Google, then Bing
MS SQL code:
declare @str nvarchar(max) = 'Visit <a href="...">Google</a>, then <a href="...">Bing</a>'
declare @aStart int = charindex('<a ', @str)
declare @aStartTagEnd int = charindex('>', @str, @aStart)
DECLARE @result nvarchar(max) = @str;
-- remove all closing tags first
set @result = replace(@result, '</a>', '')
select @result
-- then remove the opening tags one at a time
WHILE (@aStart > 0 and @aStartTagEnd > 0)
BEGIN
declare @rep1 nvarchar(max) = substring(@result, @aStart, @aStartTagEnd + 1 - @aStart)
set @result = replace(@result, @rep1, '')
set @aStart = charindex('<a ', @result)
set @aStartTagEnd = charindex('>', @result, @aStart)
END
select @result