How can I find part of a string between two words? - sql

I want to get a part of text from my field description
Could someone offer some advice?
The whole string is 'Version100][BuildNumber:666][SubBuild:000]' and the build number is what I want to single out (however the number may change).
I have tried SUBSTRING with CHARINDEX but I can't seem to figure it out.
I've been googling for about 30 minutes and I can't seem to work it out.

little long, but you could do this.
DECLARE @Description VARCHAR(MAX)= '[Version100][BuildNumber:666][SubBuild:000]'
SELECT LEFT(STUFF(@Description, 1, PATINDEX('%BuildNumber%', @Description) + 11, '' )
,PATINDEX('%]%', STUFF(@Description, 1, PATINDEX('%BuildNumber%', @Description) + 11, '' )) - 1)

You can try this:
SELECT SUBSTRING([description],CHARINDEX('BuildNumber:',[description])+12,
CHARINDEX(']',[description], CHARINDEX('BuildNumber:',[description]))
-(CHARINDEX('BuildNumber:',[description])+12))
FROM YOURTABLE

I haven't actually tested this, but assuming the string format is always the same, the following should work irrespective of the number of digits in the build number.
SELECT SUBSTRING(description,
CHARINDEX('BuildNumber:', description) + 12, --The position after BuildNumber: (a)
CHARINDEX('][SubBuild', description) - (CHARINDEX('BuildNumber:', description) + 12)) --The distance from (a) to the square brackets before SubBuild
FROM YOURTABLE

Hopefully this query returns your expected result without hard-coding the number of digits in the build number:
DECLARE @Description AS VARCHAR (500) = 'Version100][BuildNumber:6663211][SubBuild:000]';
DECLARE @BeforeString AS VARCHAR (100) = '[BuildNumber:';
DECLARE @BeforeStringPosition AS INT = CHARINDEX(@BeforeString, @Description);
SELECT SUBSTRING(@Description, @BeforeStringPosition + LEN(@BeforeString) , CHARINDEX('][SubBuild', @Description) - @BeforeStringPosition - LEN(@BeforeString));
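
The answers above all follow the same pattern: locate the start marker, skip past it, then measure the distance to the end marker. A minimal T-SQL sketch of that pattern, assuming SQL Server 2008 or later; the variable names (@StartMarker, @EndMarker and so on) are made up for illustration, and the sample string is the bracketed one used in the first answer:

```sql
-- Hypothetical variable names; only the sample string comes from the thread.
DECLARE @Description VARCHAR(500) = '[Version100][BuildNumber:666][SubBuild:000]';
DECLARE @StartMarker VARCHAR(50)  = 'BuildNumber:';
DECLARE @EndMarker   VARCHAR(50)  = ']';

-- Position of the start marker (0 if it is missing).
DECLARE @MarkerPos  INT = CHARINDEX(@StartMarker, @Description);
-- Position of the first character after the start marker.
DECLARE @ValueStart INT = @MarkerPos + LEN(@StartMarker);
-- First end marker at or after the value start (0 if it is missing).
DECLARE @ValueEnd   INT = CHARINDEX(@EndMarker, @Description, @ValueStart);

SELECT CASE
         WHEN @MarkerPos > 0 AND @ValueEnd > 0
           THEN SUBSTRING(@Description, @ValueStart, @ValueEnd - @ValueStart)
         ELSE NULL  -- a marker is missing, so return NULL instead of raising an error
       END AS BuildNumber;  -- 666 for the sample string
```

Searching for the end marker from @ValueStart onward, rather than from the start of the string, is what keeps the earlier ] after Version100 from being picked up.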

Related

in SQL how can I remove the first 3 characters on the left and everything on the right after an specific character

In SQL how can I remove (from displaying on my report no deleting from database) the first 3 characters (CN=) and everything after the comma that is followed by "OU" so that I am left with the name and last name in the same column? for example:
CN=Tom Chess,OU=records,DC=1234564786_data for testing, 1234567
CN=Jack Bauer,OU=records,DC=1234564786_data for testing, 1234567
CN=John Snow,OU=records,DC=1234564786_data for testing, 1234567
CN=Anna Rodriguez,OU=records,DC=1234564786_data for testing, 1234567
Desired display:
Tom Chess
Jack Bauer
John Snow
Anna Rodriguez
I tried playing with TRIM but I don't know how to do it without declaring the position and with names and last names having different lengths I really don't know how to handle that.
Thank you in advance
Update: I wonder about using LOCATE to find the position of the comma and then feeding that to a substring function. I'm not sure whether an approach like that would work, or how to put the syntax together. What do you think? Would it be feasible?
You can try this one SUBSTRING(ColumnName, 4, CHARINDEX(',', ColumnName) - 4)
In Postgres, you could use split_part() assuming no name contains a ,
select substr(split_part(the_column, ',', 1), 4)
from ...
Db2 11.x for LUW:
with tab (str) as (values
' CN = Tom Chess , OU = records,DC=1234564786_data for testing, 1234567'
, 'CN=Jack Bauer,OU=records,DC=1234564786_data for testing, 1234567'
, 'CN=John Snow,OU=records,DC=1234564786_data for testing, 1234567'
, 'CN=Anna Rodriguez,OU=records,DC=1234564786_data for testing, 1234567'
)
select REGEXP_REPLACE(str, '^\s*CN\s*=\s*(.*)\s*,\s*OU\s*=.*', '\1')
from tab;
Note, that such a regex pattern allows an arbitrary number of spaces as in the 1-st record of example above.
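In Oracle 11g, something like this might work. It captures everything between the leading CN= and the first comma. (A bracket expression such as [^CN=] excludes the individual characters C, N and =, so it would cut 'Tom Chess' short at the second C.)
REGEXP_SUBSTR(COLUMN_NAME, '^CN=([^,]+)', 1, 1, NULL, 1)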
I think there has to be a loop to handle this. Here's SQL Server function that will parse this out. (I know the question didn't specify SQL Server, but it's an example of how it can be done.)
select dbo.ScrubFieldValue(value) from table will return what you're looking for
CREATE FUNCTION ScrubFieldValue
(
@Input varchar(8000)
)
RETURNS varchar(8000)
AS
BEGIN
DECLARE @retval varchar(8000)
DECLARE @charidx int
DECLARE @remaining varchar(8000)
DECLARE @current varchar(8000)
DECLARE @currentLength int
select @retval = ''
select @remaining = @Input
select @charidx = CHARINDEX('CN=', @remaining,2)
while(LEN(@remaining) > 0)
BEGIN
--strip current row from remaining
if (@charidx > 0)
BEGIN
select @current = SUBSTRING(@remaining, 1, @charidx - 1)
END
else
BEGIN
select @current = @remaining
END
select @currentLength = LEN(@current)
-- get current name
select @current = SUBSTRING(@current, 4, CHARINDEX(',OU', @current)-4)
select @retval = @retval + @current + ' '
-- strip off current from remaining
select @remaining =substring(@remaining,@currentLength + 1,
LEN(@remaining) - @currentLength)
select @charidx = CHARINDEX('CN=', @remaining,2)
END
RETURN @retval
END
On my version of DB2 for Z/OS CHARINDEX throws a syntax error. Here are two ways to work around that.
SUBSTRING(ColumnName, 4, INSTR(ColumnName,',',1) - 4)
SUBSTRING(ColumnName, 4, LOCATE_IN_STRING(ColumnName,',') - 4)
I should add that the version is V12R1
If input str is wellformed (i.e. looks like your sample data without any additional tokens such as space), you could use something like:
substr(str,locate('CN=', str)+length('CN='), locate(',', str)-length('CN=')-1)
If your Db2 version supports REGEXP, that's a better choice.
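
Since the question asks for the comma that is followed by OU, the SUBSTRING/CHARINDEX idea can be anchored on ',OU=' rather than on the first comma. The following is a rough SQL Server sketch under the assumption that every value starts with CN=; the derived table v and its column dn are made-up names, and the third row is an invented example of a value without an OU part:

```sql
-- Hypothetical derived table; the first two rows are sample values from the question.
SELECT v.dn,
       -- NULLIF turns "not found" (0) into NULL, so SUBSTRING returns NULL
       -- instead of failing on a negative length.
       SUBSTRING(v.dn, 4, NULLIF(CHARINDEX(',OU=', v.dn), 0) - 4) AS display_name
FROM (VALUES
        ('CN=Tom Chess,OU=records,DC=1234564786_data for testing, 1234567'),
        ('CN=Anna Rodriguez,OU=records,DC=1234564786_data for testing, 1234567'),
        ('CN=No Organizational Unit')
     ) AS v(dn);
-- display_name: Tom Chess, Anna Rodriguez, NULL
```

On other platforms the same shape should carry over by swapping CHARINDEX for the local equivalent (for example LOCATE or INSTR, as in the Db2 answers above).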

Joining on numeric part of string

It's been a while...I'd like to get your advice on the most efficient way to join on only the number part of a field that may be prefixed and/or suffixed with up to 2 letters. Here's a simplified snippet of what I'm trying to do:
SELECT a, b, c
FROM table 1 t1
LEFT JOIN table 2 t2 ON t1.PolicyCode = t2.sPolicyID,
Where t2.sPolicyID could begin and/or end with up to 2 letters. Some examples: TG73100, S7286674, 2344506R, etc. We only want to join to just its numeric part in between the letters, i.e. 73100, 7286674 or 2344506 from the examples.
Could someone please advise on a simple way of doing this?
Here is one way:
LEFT JOIN table 2 t2 ON t1.PolicyCode =
LEFT(SUBSTRING(t2.sPolicyID, PATINDEX('%[0-9]%', t2.sPolicyID), 50),
PATINDEX('%[^0-9]%',
SUBSTRING(t2.sPolicyID, PATINDEX('%[0-9]%', t2.sPolicyID), 50)
+ 'a') -1)
To break this down, there are 4 main parts.
1: Find the position of the first number with PATINDEX:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT PATINDEX('%[0-9]%', @spolicyID)
--Returns 3
2: Use SUBSTRING() to cut off everything before the first number:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50)
--Returns 123123xx
If we hardcoded the 3 that we know is returned from the first part, it would look like this:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT SUBSTRING(@spolicyID, 3, 50)
--50 is the number of characters to extract, set to something
--higher than the max string length to be safe
Of course, we don't want to hardcode it since it can change, but that makes seeing the different functions a bit easier.
3: Find the position of the next letter using PATINDEX again:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT PATINDEX('%[^0-9]%', SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50) + 'a')
--Returns 7 since it is looking at 123123xx
--The first x is in the 7th position
Note that we added an a onto the string. This is because if the string had no letters at the end, PATINDEX would return 0, and the resulting length of -1 passed to LEFT in step 4 would throw an error. You could add any letter or letters to the end and it would work; we are just making sure there is at least one. Try removing the + 'a' and using a string like xx123123 to see the error.
If we hardcoded the 123123xx from step 2 it would look like this (again just for easy example):
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT PATINDEX('%[^0-9]%', '123123xx' + 'a')
4: Use LEFT() to return everything before the trailing letters, leaving us with only the numbers in between:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT LEFT(SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50),PATINDEX('%[^0-9]%', SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50) + 'a') -1)
--Need to add `-1` because step 3 PATINDEX returns 7
--as the position of first trailing letter, and
--we want the 6 characters before that
And again hardcoded from step 2 and 3 for easy viewing:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT LEFT('123123xx', 7-1)
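
The four steps can be tried against the sample policy IDs from the question in one statement. This is only a sketch for SQL Server: the derived table p and the CROSS APPLY alias d are made-up names, and CROSS APPLY is used here simply to avoid repeating the step 2 expression, not because the answer requires it.

```sql
-- Sample IDs from the question plus the walkthrough value; p and d are hypothetical aliases.
SELECT p.sPolicyID,
       -- Steps 3 and 4: find the first non-digit (a sentinel 'a' guarantees one exists)
       -- and keep everything before it.
       LEFT(d.FromFirstDigit,
            PATINDEX('%[^0-9]%', d.FromFirstDigit + 'a') - 1) AS NumericPart
FROM (VALUES ('TG73100'), ('S7286674'), ('2344506R'), ('xx123123xx')) AS p(sPolicyID)
CROSS APPLY (
    -- Steps 1 and 2: everything from the first digit onward.
    SELECT SUBSTRING(p.sPolicyID, PATINDEX('%[0-9]%', p.sPolicyID), 50) AS FromFirstDigit
) AS d;
-- NumericPart: 73100, 7286674, 2344506, 123123
```

Because the join condition wraps t2.sPolicyID in functions, an index on that column is unlikely to be used for the join; if that matters, storing the numeric part in a computed column may be worth considering.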

Need to pad zeros left and right for a string value according to decimal format

So if I have a data (varchar) like say 10.1
I need the value as 0000101000000.
means (000010) whole number and (1000000) decimal value.
Its a 13 character string ,numbers coming before decimal point should be in first 6 characters and numbers coming after decimal point should be in last 7 characters
Maybe..?
DECLARE @d decimal(13,7) = 10.1;
SELECT RIGHT('0000000000000' + CONVERT(varchar(13),CONVERT(bigint,(@d * 10000000))),13);
Using my crystal ball here though.
Edit: As, for some reason, the OP is storing a decimal as a varchar (this is a really bad idea on its own), I have added further logic to attempt to convert the value to a decimal first.
As experience has taught many of us, give a user a non-numeric column to store a numeric value in and they'll more than happily store a non-numeric value in it, so I have used TRY_CONVERT and assumed you are using SQL Server 2012+:
DECLARE @d varchar(13) = '10.1';
SELECT RIGHT('0000000000000' + CONVERT(varchar(13),CONVERT(bigint,(TRY_CONVERT(decimal(13,7),@d) * 10000000))),13);
SELECT REPLICATE('0',6-LEN(SUBSTRING(CAST([data] AS VARCHAR), 1,
CHARINDEX('.',CAST([data] AS VARCHAR)) -1)))+SUBSTRING(CAST([data] AS VARCHAR), 1,
CHARINDEX('.',CAST([data] AS VARCHAR)) -1)+
SUBSTRING(CAST([data] AS VARCHAR), CHARINDEX('.',CAST([data] AS VARCHAR)) + 1,
LEN(CAST([data] AS VARCHAR)))+REPLICATE('0',7-LEN(SUBSTRING(CAST([data] AS VARCHAR), CHARINDEX('.',CAST([data] AS VARCHAR)) + 1,
LEN(CAST([data] AS VARCHAR))))) AS Whole
FROM Table1
Output
Whole
0000101000000
Demo
http://sqlfiddle.com/#!18/8649d/16
You can use some math and string operations to do it like below
see live demo
declare @var decimal(10,4)
set @var=10.1
select @var,
right(cast(cast(( floor(@var)+ power(10,7)) as int) as varchar(13)),6)
+
cast(cast(((@var- floor(@var)) * power(10,7)) as int) as varchar(13))
There's a fair amount of string manipulation to be done here. I'll step through what I did.
I used a variable for the base number so I could verify different results:
declare @n decimal(9,3) = 10.1
You need 6 spaces left of the decimal and 7 spaces to the right, so I'm doing all the manipulation on a VARCHAR(13). I didn't create a new variable as a VARCHAR because I'm assuming you want to be able to do this conversion in line on the fly, so I'm using that CAST over and over again.
Start by finding the decimal place.
SELECT CHARINDEX('.',CAST(@n as VARCHAR(13)))
In the sample number, that's a 3, but it could obviously change.
Now, get the portion of the number to the left of the decimal place.
SELECT SUBSTRING(CAST(@n as VARCHAR(13)),1,CHARINDEX('.',CAST(@n as VARCHAR(13)))-1)
Then get the portion to the right of the decimal.
SELECT SUBSTRING(CAST(@n as VARCHAR(13)),CHARINDEX('.',CAST(@n as VARCHAR(13)))+1,LEN(CAST(@n as VARCHAR(13))))
Pad the leading zeroes. Put 6 on, concatenate, and take a RIGHT 6. Accounts for no digits to the left of the decimal.
SELECT RIGHT(REPLICATE(0,6) + SUBSTRING(CAST(@n as VARCHAR(13)),1,CHARINDEX('.',CAST(@n as VARCHAR(13)))-1), 6)
Pad the trailing zeroes. Same idea, but in the other direction.
SELECT LEFT(SUBSTRING(CAST(@n as VARCHAR(13)),CHARINDEX('.',CAST(@n as VARCHAR(13)))+1,LEN(CAST(@n as VARCHAR(13)))) + REPLICATE(0,7),7)
Then put it all together.
SELECT RIGHT(REPLICATE(0,6) + SUBSTRING(CAST(@n as VARCHAR(13)),1,CHARINDEX('.',CAST(@n as VARCHAR(13)))-1), 6)
+
LEFT(SUBSTRING(CAST(@n as VARCHAR(13)),CHARINDEX('.',CAST(@n as VARCHAR(13)))+1,LEN(CAST(@n as VARCHAR(13)))) + REPLICATE(0,7),7)
Results.
0000101000000
declare @var varchar(20) = '10000.112'
SELECT FORMAT (FLOOR(@var), '000000') + left((PARSENAME(@var,1)) + replicate('0',7),7)
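
One way to combine the ideas above is to convert the stored text to a decimal once and then pad the whole and fractional parts separately. This is a sketch only, assuming SQL Server 2012+ (for TRY_CONVERT), non-negative values, and at most 6 whole digits and 7 decimal places; @raw and @d are made-up variable names:

```sql
DECLARE @raw VARCHAR(20) = '10.1';                            -- value as stored in the varchar column
DECLARE @d DECIMAL(13,7) = TRY_CONVERT(DECIMAL(13,7), @raw);  -- NULL if the text is not numeric

SELECT
    -- Whole part, left-padded with zeros to 6 characters.
    RIGHT(REPLICATE('0', 6) + CONVERT(VARCHAR(6), CONVERT(BIGINT, FLOOR(@d))), 6)
    -- Fractional part scaled to 7 digits, left-padded so that .01 becomes 0100000.
  + RIGHT(REPLICATE('0', 7) + CONVERT(VARCHAR(7), CONVERT(BIGINT, (@d - FLOOR(@d)) * 10000000)), 7)
    AS Padded;  -- 0000101000000 for '10.1'
```

If the text cannot be converted, @d is NULL and the whole expression yields NULL, which makes bad rows easy to spot.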

Extract substring from string if certain characters exists SQL

I have a string:
DECLARE @UserComment AS VARCHAR(1000) = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274'
Is there a way for me to extract everything from the string after 'Inspectionid: ' leaving me just the InspectionID to save into a variable?
Your example doesn't quite work correctly. You defined your variable as varchar(100) but there are more characters in your string than that.
This should work based on your sample data.
DECLARE @UserComment AS VARCHAR(1000) = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274'
select right(@UserComment, case when charindex('Inspectionid: ', @UserComment, 0) > 0 then len(@UserComment) - charindex('Inspectionid: ', @UserComment, 0) - 13 else len(@UserComment) end)
I would do this as:
select stuff(@UserComment, 1, charindex(':Inspectionid: ', @UserComment) + 14, '')
This does not raise an error if the string is not found, although in that case CHARINDEX returns 0 and only the first 14 characters are removed. To get an empty string in this case:
select stuff(@UserComment, 1, charindex(':Inspectionid: ', @UserComment + ':Inspectionid: ') + 14, '')
Firstly, let me say that your @UserComment variable is not long enough to contain the text you're putting into it. Increase the size of that first.
The SQL below will extract the value:
DECLARE @UserComment AS VARCHAR(1000); SET @UserComment = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274'
DECLARE @pos int
DECLARE @InspectionId int
DECLARE @IdToFind varchar(100)
SET @IdToFind = 'Inspectionid: '
SET @pos = CHARINDEX(@IdToFind, @UserComment)
IF @pos > 0
BEGIN
SET @InspectionId = CAST(SUBSTRING(@UserComment, @pos+LEN(@IdToFind)+1, (LEN(@UserComment) - @pos) + 1) AS INT)
PRINT @InspectionId
END
You could make the above code into a SQL function if necessary.
If the Inspection ID is always 5 digits then the last argument for the Substring function (length) can be 5, i.e.
SELECT SUBSTRING(@UserComment,PATINDEX('%Inspectionid:%',@UserComment)+14,5)
If the Inspection ID varies (but is always at the end - which your question slightly implies), then the last argument can be derived by subtracting the position of 'InspectionID:' from the overall length of the string. Like this:
SELECT SUBSTRING(@UserComment,PATINDEX('%Inspectionid:%',@UserComment)+14,LEN(@usercomment)-(PATINDEX('%Inspectionid:%',@UserComment)+13))
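
Since the goal is to save the ID into a variable, a variant that avoids counting the space after the colon by hand may be easier to maintain. A minimal sketch, assuming SQL Server 2012+ (for TRY_CONVERT) and that the ID is the last thing in the comment; @Marker is a made-up name:

```sql
DECLARE @UserComment VARCHAR(1000) = 'bjones marked inspection on system UP for site COL01545 as Refused to COD won''t pay upfront :Routeid: 12 :Inspectionid: 55274';
DECLARE @Marker VARCHAR(50) = ':Inspectionid:';   -- no trailing space, so LEN() counts every character
DECLARE @pos INT = CHARINDEX(@Marker, @UserComment);
DECLARE @InspectionId INT;                        -- stays NULL if the marker is missing

IF @pos > 0
    -- Take everything after the marker, trim the surrounding spaces, then convert.
    SET @InspectionId = TRY_CONVERT(INT,
        LTRIM(RTRIM(SUBSTRING(@UserComment, @pos + LEN(@Marker), LEN(@UserComment)))));

SELECT @InspectionId AS InspectionId;  -- 55274 for the sample comment
```

TRY_CONVERT returns NULL rather than raising an error if something other than digits follows the marker.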

Simple Explanation for PATINDEX

I have been reading up on PATINDEX, attempting to understand the what and why. I understand that when using the wildcards it will return an INT indicating where the character(s) appear/start. So:
SELECT PATINDEX('%b%', '123b') -- returns 4
However I am looking to see if someone can explain the reason as to why you would use this in a simple(ish) way. I have read some other forums but it just is not sinking in to be honest.
Are you asking for realistic use-cases? I can think of two, real-life use-cases that I've had at work where PATINDEX() was my best option.
I had to import a text-file and parse it for INSERT INTO later on. But these files sometimes had numbers in this format: 00000-59. If you try CAST('00000-59' AS INT) you'll get an error. So I needed code that would parse 00000-59 to -59 but also 00000159 to 159 etc. The - could be anywhere, or it could simply not be there at all. This is what I did:
DECLARE @my_var VARCHAR(255) = '00000-59', @my_int INT
SET @my_var = STUFF(@my_var, 1, PATINDEX('%[^0]%', @my_var)-1, '')
SET @my_int = CAST(@my_var AS INT)
[^0] in this case means "any character that isn't a 0". So PATINDEX() tells me when the 0's end, regardless of whether that's because of a - or a number.
The second use-case I've had was checking whether an IBAN number was correct. In order to do that, any letters in the IBAN need to be changed to a corresponding number (A=10, B=11, etc...). I did something like this (incomplete but you get the idea):
SET @i = PATINDEX('%[^0-9]%', @IBAN)
WHILE @i <> 0 BEGIN
SET @num = UNICODE(SUBSTRING(@IBAN, @i, 1))-55
SET @IBAN = STUFF(@IBAN, @i, 1, CAST(@num AS VARCHAR(2)))
SET @i = PATINDEX('%[^0-9]%', @IBAN)
END
So again, I'm not concerned with finding exactly the letter A or B etc. I'm just finding anything that isn't a number and converting it.
PATINDEX is roughly equivalent to CHARINDEX except that it returns the position of a pattern instead of a literal substring. Examples:
Check if a string contains at least one digit:
SELECT PATINDEX('%[0-9]%', 'Hello') -- 0
SELECT PATINDEX('%[0-9]%', 'H3110') -- 2
Extract numeric portion from a string:
SELECT SUBSTRING('12345', PATINDEX('%[0-9]%', '12345'), 100) -- 12345
SELECT SUBSTRING('x2345', PATINDEX('%[0-9]%', 'x2345'), 100) -- 2345
SELECT SUBSTRING('xx345', PATINDEX('%[0-9]%', 'xx345'), 100) -- 345
Quoted from PATINDEX (Transact-SQL)
The following example uses % and _ wildcards to find the position at
which the pattern 'en', followed by any one character and 'ure' starts
in the specified string (index starts at 1):
SELECT PATINDEX('%en_ure%', 'please ensure the door is locked');
Here is the result set.
8
You'd use the PATINDEX function when you want to know at which character position a pattern begins in an expression of a valid text or character data type.
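
To make the difference from CHARINDEX concrete, the following side-by-side comparison may help. It is a small sketch for SQL Server; the column aliases are made up for readability, and the sample strings other than '123b' are invented:

```sql
SELECT CHARINDEX('b', '123b')          AS literal_match,         -- 4: plain substring search
       PATINDEX('%b%', '123b')         AS same_as_pattern,       -- 4: same result, pattern syntax
       PATINDEX('%[0-9]%', 'abc7')     AS first_digit,           -- 4: a character class, which CHARINDEX cannot express
       PATINDEX('%[^0-9]%', '2024-01') AS first_non_digit,       -- 5: a negated class finds the hyphen
       PATINDEX('[0-9]%', 'abc7')      AS must_start_with_digit; -- 0: no leading %, so the match must begin at position 1
```

The last column shows why the surrounding % signs matter: without them the pattern has to match from the very start (or to the very end) of the string.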