Script loses colour coding after very long string + string issue - sql

So, I have the following script:
declare #v varchar(max) = '', #x int = 0
WHILE #x<=100000
BEGIN
set #x += 1
set #v += cast(#x as varchar(9))
END
select len(#v), #x, #v,
len('{Insert #v result}')
select 1
Please replace {Insert #v result} with #v's result. Could not copy-paste because of character length.
My intention was to get an idea as to how long a list of IDs could get.
I have the following issues:
Query Editor loses colour coding from the line where you copy paste #v. This includes keywords, strings, comments, etc. is this normal?
When copy-pasting the result for #v I only get up to 10957, is this a flaw in my script or a clipboard limitation?

Related

sql Return string between two characters

I want to know a flexible way to extract the string between two '-'. The issue is that '-' may or may not exist in the source string. I have the following code which works fine when '-' exists twice in the source string, marking the start and end of the extracted string. But it throws an error "Invalid length parameter passed to the LEFT or SUBSTRING function" when there is only one '-' or none at all or if the string is blank. Can someone please help? Thanks
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
SELECT SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1) as My_String
Desired Output: ExtractThis
If there is one dash only e.g. 'BLAH90-ExtractThisWOW' then the output should be everything after the first dash i.e. ExtractThisWOW. If there are no dashes then the string will have a blank space instead e.g. 'BLAH90 ExtractThisWOW' and should return everything after the blank space i.e. ExtractThisWOW.
You can try something like this.
When there is no dash, it starts at the space if there is one or take the whole string if not.
Then I look if there is only one dash or 2
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
declare #dash_pos integer = CHARINDEX('-',#string)
SELECT CASE
WHEN #dash_pos = 0 THEN
RIGHT(#string,LEN(#string)-CHARINDEX(' ',#string))
ELSE (
CASE
WHEN #dash_pos = LEN(#string)-CHARINDEX('-',REVERSE(#string))+1
THEN RIGHT(#string,LEN(#string)-#dash_pos)
ELSE SUBSTRING(#string,#dash_pos+1, CHARINDEX('-',#string,#dash_pos+1) -
#dash_pos -1)
END
)
END as My_String
Try this. If there are two dashes, it'll take what is inside. If there is only one or none, it'll keep the original string.
declare #string varchar(100) = 'BLAH-90ExtractThisWOW'
declare #dash_index1 int = case when #string like '%-%' then CHARINDEX('-', #string) else -1 end
declare #dash_index2 int = case when #string like '%-%'then len(#string) - CHARINDEX('-', reverse(#string)) + 1 else -1 end
SELECT case
when #dash_index1 <> #dash_index2 then SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1)
else #string end
as My_String
Take your existing code:
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
SELECT SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1) as My_String
insert one line, like so:
declare #string varchar(100) = 'BLAH90-ExtractThis-WOW'
SET #string = #string + '--'
SELECT SUBSTRING(#string,CHARINDEX('-',#string)+1, CHARINDEX('-',#string,CHARINDEX('-',#string)+1) -CHARINDEX('-',#string)-1) as My_String
and you're done. (If NULL, you will get NULL returned. Also, this will return all data based on the FIRST dash found in the string, regardless of however many dashes are in the string.)

How to identify and redact all instances of a matching pattern in T-SQL

I have a requirement to run a function over certain fields to identify and redact any numbers which are 5 digits or longer, ensuring all but the last 4 digits are replaced with *
For example: "Some text with 12345 and 1234 and 12345678" would become "Some text with *2345 and 1234 and ****5678"
I've used PATINDEX to identify the the starting character of the pattern:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', TEST_TEXT)
I can recursively call that to get the starting character of all the occurrences, but I'm struggling with the actual redaction.
Does anyone have any pointers on how this can be done? I know to use REPLACE to insert the *s where they need to be, it's just the identification of what I should actually be replacing I'm struggling with.
Could do it on a program, but I need it to be T-SQL (can be a function if needed).
Any tips greatly appreciated!
You can do this using the built in functions of SQL Server. All of which used in this example are present in SQL Server 2008 and higher.
DECLARE #String VARCHAR(500) = 'Example Input: 1234567890, 1234, 12345, 123456, 1234567, 123asd456'
DECLARE #StartPos INT = 1, #EndPos INT = 1;
DECLARE #Input VARCHAR(500) = ISNULL(#String, '') + ' '; --Sets input field and adds a control character at the end to make the loop easier.
DECLARE #OutputString VARCHAR(500) = ''; --Initalize an empty string to avoid string null errors
WHILE (#StartPOS <> 0)
BEGIN
SET #StartPOS = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', #Input);
IF #StartPOS <> 0
BEGIN
SET #OutputString += SUBSTRING(#Input, 1, #StartPOS - 1); --Seperate all contents before the first occurance of our filter
SET #Input = SUBSTRING(#Input, #StartPOS, 500); --Cut the entire string to the end. Last value must be greater than the original string length to simply cut it all.
SET #EndPos = (PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', #Input)); --First occurance of 4 numbers with a not number behind it.
SET #Input = STUFF(#Input, 1, (#EndPos - 1), REPLICATE('*', (#EndPos - 1))); --#EndPos - 1 gives us the amount of chars we want to replace.
END
END
SET #OutputString += #Input; --Append the last element
SET #OutputString = LEFT(#OutputString, LEN(#OutputString))
SELECT #OutputString;
Which outputs the following:
Example Input: ******7890, 1234, *2345, **3456, ***4567, 123asd456
This entire code could also be made as a function since it only requires an input text.
A dirty solution with recursive CTE
DECLARE
#tags nvarchar(max) = N'Some text with 12345 and 1234 and 12345678',
#c nchar(1) = N' ';
;
WITH Process (s, i)
as
(
SELECT #tags, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', #tags)
UNION ALL
SELECT value, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', value)
FROM
(SELECT SUBSTRING(s,0,i)+'*'+SUBSTRING(s,i+4,len(s)) value
FROM Process
WHERE i >0) calc
-- we surround the value and the string with leading/trailing ,
-- so that cloth isn't a false positive for clothing
)
SELECT * FROM Process
WHERE i=0
I think a better solution it's to add clr function in Ms SQL Server to manage regexp.
sql-clr/RegEx
Here is an option using the DelimitedSplit8K_LEAD which can be found here. https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2 This is an extension of Jeff Moden's splitter that is even a little bit faster than the original. The big advantage this splitter has over most of the others is that it returns the ordinal position of each element. One caveat to this is that I am using a space to split on based on your sample data. If you had numbers crammed in the middle of other characters this will ignore them. That may be good or bad depending on you specific requirements.
declare #Something varchar(100) = 'Some text with 12345 and 1234 and 12345678';
with MyCTE as
(
select x.ItemNumber
, Result = isnull(case when TRY_CONVERT(bigint, x.Item) is not null then isnull(replicate('*', len(convert(varchar(20), TRY_CONVERT(bigint, x.Item))) - 4), '') + right(convert(varchar(20), TRY_CONVERT(bigint, x.Item)), 4) end, x.Item)
from dbo.DelimitedSplit8K_LEAD(#Something, ' ') x
)
select Output = stuff((select ' ' + Result
from MyCTE
order by ItemNumber
FOR XML PATH('')), 1, 1, '')
This produces: Some text with *2345 and 1234 and ****5678

Trim Only the comma separated Numbers without trimming the character appended Numbers

I have a column '<InstructorID>' which may contain data like "79,instr1,inst2,13" and so on.
The following code gives me result like this "791213"
declare #InstructorID varchar(100)
set #InstructorID= (select InstructorID from CourseSession where CourseSessionNum=262)
WHILE PATINDEX('%[^0-9]%', #InstructorID) > 0
BEGIN
SET #InstructorID = STUFF(#InstructorID, PATINDEX('%[^0-9]%', #InstructorID), 1, '')
END
select #InstructorID
I need the output ti be like this "79,13"
i.e those numbers attached to characters shoud not appear in output.
P.S: I need to achieve this using sql only. Unfortunately i'm unable to use Regex which would have made this task much easier.
I agree with others that your problem would seem to be indicative of a mistake in your data design.
However, accepting that you cannot change the design, the following would allow you to achieve what you are looking for:
DECLARE #InstructorID VARCHAR(100)
DECLARE #Part VARCHAR(100)
DECLARE #Pos INT
DECLARE #Return VARCHAR(100)
SET #InstructorID = '79,instr1,inst2,13'
SET #Return = ''
-- Continue until InstructorID is empty
WHILE (LEN(#InstructorID) > 0)
BEGIN
-- Get the position of the next comma, and set to the end of InstructorID if there are no more
SET #Pos = CHARINDEX(',', #InstructorID)
IF (#Pos = 0)
SET #Pos = LEN(#InstructorID)
-- Get the next part of the text and shorted InstructorID
SET #Part = SUBSTRING(#InstructorID, 1, #Pos)
SET #InstructorID = RIGHT(#InstructorID, LEN(#InstructorID) - #Pos)
-- Check that the part is numeric
IF (ISNUMERIC(#Part) = 1)
SET #Return = #Return + #Part
END
-- Trim trailing comma (if any)
IF (RIGHT(#Return, 1) = ',')
SET #Return = LEFT(#Return, LEN(#Return) - 1)
PRINT #Return
Essentially, this loops through the #InstructorID, extracting parts of text between commas.
If the part is numeric then it adds it to the output text. I am PRINTing the text but you could SELECT it or use it however you wish.
Obviously, where I have SET #InstructorID = xyz, you should change this to your SELECT statement.
This code can be placed into a UDF if preferred, although as I say, your data format seems less than ideal.

Function returning 2 different results - T-SQL

I have used this site before for help with various things in the past, and in this instance, I couldn't find anything in the search box, so apologies if this exists elsewhere.
In sql server 2005, I have several stored procedures that change various bits of code, and recently we have created a function that adds spaces into a defined string. So in theory, I pass a string into it, and I get a result as blocks of 4. When I run this manually, and define the actual text, it splits fine (I get #### 0000 012 returned) but when I execute the function within the SP, I get #### 0012 0012. Is there any reason why?
I have set a print command to the string before it gets passed into my function, and it prints "####0000012 " and the print after is "#### 0012 0012"
Below is the function code, with no declares:
set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
GO
ALTER function [dbo].[udf_addspaces](#string varchar(255),#lengthbetween int)
returns varchar(100)
as
BEGIN
declare #i int, #stringlen float, #output varchar(255), #outputofloop varchar(4)
set #stringlen = LEN(#string)/#lengthbetween
set #output =''
set #i = 0
while #i <= #stringlen
BEGIN
set #outputofloop = left(#string,#lengthbetween)
if #lengthbetween < LEN(#string)
BEGIN
set #string = right(#string,LEN(#string)-#lengthbetween)
END
set #output = #output + #outputofloop +' '
set #i = #i+1
END
return #output
END
Here is the bit of the SP that executes this:
set #Consignment2 = (#Consignment) + rtrim(#Check14)
print #Consignment2
set #Consignment2 = dbo.udf_addspaces(#Consignment2,4)
print #Consignment2
Here are the lines it prints: (Note: #### replaces a 4 digit number, removed for security reasons)
####0000012
#### 0012 0012
Regards,
Luke M
Even though you've defined stringlen as a float, it will be an integer value, because the two values you're dividing are ints.
There's a difference between a char(14) mentioned in your comments, to a varchar(14). The char(14) is guaranteed to be 14 characters long. The varchar may not be.
I think the body of your function could be more succinctly expressed as this...
declare #result varchar(500)
select #result = ''
select
#result = #result
+ substring(#string, number*#lengthBetween+1, #lengthBetween)
+ ' '
from master..spt_values
where type='p'
and number <= (len(#string)/#lengthBetween)
return rtrim(#result)

Other approach for handling this TSQL text manipulation

I have this following data:
0297144600-4799 0297485500-5599
The 0297485500-5599 based on observation always on position 31 char from the left which this is an easy approach.
But I would like to do is to anticipate just in case the data is like this below which means the position is no longer valid:
0297144600-4799 0297485500-5599 0297485600-5699
As you can see, I guess the first approach will the split by 1 blank space (" ") but due to number of space is unknown (varies) how do I take this approach then? Is there any method to find the space in between and shrink into 1 blank space (" ").
BTW ... it needs to be done in TSQL (Ms SQL 2005) unfortunately cause it's for SSIS :(
I am open with your idea/suggestion.
Thanks
I have updated my answer a bit, now that I know the number pattern will not always match. This code assumes the sequences will begin and end with a number and be separated by any number of spaces.
DECLARE #input nvarchar -- max in parens
DECLARE #pattern nvarchar -- max in parens
DECLARE #answer nvarchar -- max in parens
DECLARE #pos int
SET #input = ' 0297144623423400-4799 5615618131201561561 0297485600-5699 '
-- Make sure our search string has whitespace at the end for our pattern to match
SET #input = #input + ' '
-- Find anything that starts and ends with a number
WHILE PATINDEX('%[0-9]%[0-9] %', #input) > 0
BEGIN
-- Trim off the leading whitespace
SET #input = LTRIM(#input)
-- Find the end of the sequence by finding a space
SET #pos = PATINDEX('% %', #input)
-- Get the result out now that we know where it is
SET #answer = SUBSTRING(#input, 0, #pos)
SELECT [Result] = #answer
-- Remove the result off the front of the string so we can continue parsing
SET #input = SUBSTRING(#input, LEN(#answer) + 1, 8096)
END
Assuming you're processing one line at a time, you can also try this:
DECLARE #InputString nvarchar(max)
SET #InputString = '0297144600-4799 0297485500-5599 0297485600-5699'
BEGIN
WHILE CHARINDEX(' ',#InputString) > 0 -- Checking for double spaces
SET #InputString =
REPLACE(#InputString,' ',' ') -- Replace 2 spaces with 1 space
END
PRINT #InputString
(taken directly from SQLUSA, fnRemoveMultipleSpaces1)