TSQL stripping number out of a string - sql

I have a string like the one below; assume a "." always exists in the string.
'NAB 12345 Tom Heading abcde@yahoo.com.au Web 20294821. Australia Regular Post'
How do I strip 20294821 from the above string using TSQL?
I tried the below but it only works if the number is the last word in the string
REPLACE(REVERSE( LEFT( REVERSE(Comments), CHARINDEX(' ', REVERSE(Comments))-1 ) ) ,'.','')
-Alan-

You can also do it like this:
DECLARE @Comments AS VARCHAR(255) = 'NAB 12345 Tom Heading abcde@yahoo.com.au Web 20294821. Australia Regular Post'
SELECT REPLACE(@Comments, LEFT(RIGHT(@Comments, LEN(@Comments) - CHARINDEX('Web ', @Comments, 0) - 3), CHARINDEX('.', RIGHT(@Comments, LEN(@Comments) - CHARINDEX('Web ', @Comments, 0) - 3), 0) - 1), '')

As the possible string pattern is not certain, this answer assumes that you want to remove the text between the rightmost dot (.) and the nearest space to its left.
DECLARE @Comments VARCHAR (MAX)
SET @Comments = 'NAB 12345 Tom Heading abcde@yahoo.com.au Web 20294821. Australia Regular Post'
DECLARE @Comments_TrimmedContent VARCHAR (MAX)
DECLARE @Comments_TrimmedAfterDot VARCHAR (MAX)
SELECT @Comments_TrimmedContent = REVERSE( LEFT( REVERSE(@Comments), CHARINDEX('.', REVERSE(@Comments)) ))
SELECT @Comments_TrimmedAfterDot = REVERSE( RIGHT( REVERSE(@Comments), LEN(@Comments) - CHARINDEX('.', REVERSE(@Comments)) - 1 ))
SELECT REVERSE( RIGHT ( REVERSE(@Comments_TrimmedAfterDot), LEN(@Comments_TrimmedAfterDot) - CHARINDEX(' ', REVERSE(@Comments_TrimmedAfterDot)))) + @Comments_TrimmedContent
Output:
NAB 12345 Tom Heading abcde@yahoo.com.au Web. Australia Regular Post

The last statement gives the final answer; I've broken down the steps before it to make the logic easier to follow.
It returns the value between the first occurrence of 'Web ' and the full stop that follows it.
DECLARE @Comments as VARCHAR(255)
SET @Comments = 'NAB 12345 Tom Heading abcde@yahoo.com.au Web 20294821. Australia Regular Post'
SELECT @Comments
-- Assume there is only one occurrence of 'Web' followed by a space
SELECT CHARINDEX('Web ', @Comments)
--Show the substring starting from the number (skip 4 characters for 'Web' and the space)
SELECT SUBSTRING(@Comments, CHARINDEX('Web ', @Comments) + 4, LEN(@Comments))
--Find the full stop after 'Web '
SELECT CHARINDEX('.', SUBSTRING(@Comments, CHARINDEX('Web ', @Comments) + 4, LEN(@Comments)))
--Combine all of the above logic to give the answer
SELECT SUBSTRING(@Comments, CHARINDEX('Web ', @Comments) + 4, CHARINDEX('.', SUBSTRING(@Comments, CHARINDEX('Web ', @Comments) + 4, LEN(@Comments)))-1)
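If the word in front of the number is not always 'Web', another possible anchor is the digit that sits immediately before a dot. The following is only a sketch under stated assumptions: the number is the first run of characters ending in a digit followed by a dot, and a space precedes it. The names @DigitPos, @Head and @Start are introduced here purely for illustration.

```sql
DECLARE @Comments VARCHAR(255) = 'NAB 12345 Tom Heading abcde@yahoo.com.au Web 20294821. Australia Regular Post';

-- Position of the first digit that is immediately followed by a dot
-- (in a PATINDEX pattern the dot is a literal character, not a wildcard).
DECLARE @DigitPos INT = PATINDEX('%[0-9].%', @Comments);

-- Everything up to and including that digit.
DECLARE @Head VARCHAR(255) = LEFT(@Comments, @DigitPos);

-- Walk back to the space in front of the number (assumes such a space exists).
DECLARE @Start INT = @DigitPos - CHARINDEX(' ', REVERSE(@Head)) + 2;

SELECT
    SUBSTRING(@Comments, @Start, @DigitPos - @Start + 1) AS ExtractedNumber, -- 20294821
    STUFF(@Comments, @Start, @DigitPos - @Start + 1, '')  AS WithoutNumber;
```

PATINDEX returns 0 when no digit-dot pair exists, so real data would need a guard for that case before the positions are used.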

Related

Split name into multiple parts in SELECT statement

I cannot seem to find an existing post on splitting a string into the parts I require. I have a database field in SQL Server that contains the "LastName FirstName MI" (no commas just spaces delimiting each part of a person's name). I have the following SQL to get the FirstName and Last, but cannot figure out how to get the Middle Initial or Middle Name.
Ex. Doe John B
SELECT
RTRIM(LEFT([PATIENT_NAME], CHARINDEX(' ', [PATIENT_NAME]))) AS LastName,
SUBSTRING([PATIENT_NAME], CHARINDEX(' ', [PATIENT_NAME]) + 1, LEN([PATIENT_NAME])) AS FirstName
FROM
Clients
Results in:
FirstName = John B
LastName = Doe
How to just return the first name without the middle initial and get the 'B' as middle name from this string in this SELECT statement?
You can either take the rightmost character, or reverse the string and then take the first character.
SELECT RIGHT(LTRIM(RTRIM([Patient_Name])), 1) AS Middle_Initial
SELECT LEFT(REVERSE(LTRIM(RTRIM([Patient_Name]))), 1) AS Middle_Initial
As for removing the MI from the first-name string, I would either take the length of the string and keep the left N-2 chars, or find the spaces with CHARINDEX and take the chars between them. To put it all together:
DECLARE @name VARCHAR(100) = 'Smith David M '
--Clean the string of leading/trailing whitespace
SELECT LTRIM(RTRIM(@name)) AS name_cleaned
--Find the first space to parse out the last name
SELECT CHARINDEX(' ', @name) AS first_space
--Select all chars before the first space
SELECT LEFT(LTRIM(RTRIM(@name)), CHARINDEX(' ', @name)-1) AS last_name
--Find the next space, starting the search one position past the previous space
SELECT CHARINDEX(' ', @name, CHARINDEX(' ', @name) + 1) AS second_space
--Select all chars between the spaces (the extra -1 leaves out the second space itself)
SELECT SUBSTRING(@name, CHARINDEX(' ', @name)+1, CHARINDEX(' ', @name, CHARINDEX(' ', @name) + 1) - CHARINDEX(' ', @name) - 1) AS first_name
--Select the right most char for middle initial
SELECT RIGHT(LTRIM(RTRIM(@name)), 1) AS middle
You can REPLACE the space characters with period characters (.) and use PARSENAME().
Note that this would work for all 3 parts of the name, not just the middle initial.
When using CHARINDEX for the last name, its result (minus one) is the length of the substring. For the FirstName, use it again as the start position of the substring. The trick for the Middle is that CHARINDEX needs a start position just past the first space; this gives you the second space, which is where the middle name begins.
See the example below:
DECLARE @tb TABLE (PATIENT_NAME varchar(250));
INSERT INTO @tb VALUES
('Doe John B')
DECLARE
@LastName INT
, @Middle INT
-- Position of the first space, then of the second space (searched from one past the first)
SELECT
@LastName = CHARINDEX(' ', PATIENT_NAME)
, @Middle = CHARINDEX(' ', PATIENT_NAME, CHARINDEX(' ', PATIENT_NAME) + 1)
FROM @tb
SELECT
SUBSTRING(PATIENT_NAME, 1, @LastName - 1) LastName
, SUBSTRING(PATIENT_NAME, @LastName + 1, @Middle - @LastName - 1) FirstName
, SUBSTRING(PATIENT_NAME, @Middle + 1, LEN(PATIENT_NAME) - @Middle) Middle
FROM @tb
I have declared some variables to make things more readable, but you can do it without them. With more than one row in the table you would compute the positions inline per row, since the variables only hold one row's values.
LEFT and RIGHT are certainly the easier way to take the last name and the middle name, along with helper functions such as REVERSE and TRIM, but I prefer PARSENAME as a simpler and cleaner approach.
Here is an example:
SELECT
PARSENAME(REPLACE(PATIENT_NAME,' ','.'),3) LastName
, PARSENAME(REPLACE(PATIENT_NAME,' ','.'),2) FirstName
, PARSENAME(REPLACE(PATIENT_NAME,' ','.'),1) Middle
FROM Clients
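One caveat with the PARSENAME approach is worth a quick check before relying on it. The sketch below assumes names are delimited by single spaces and contain no periods of their own; the sample names are illustrative only. PARSENAME numbers parts from the right and handles at most four of them, so a two-part name shifts columns and a five-part name yields NULL.

```sql
-- Illustrative values only; PARSENAME counts parts from the RIGHT.
SELECT
    PARSENAME(REPLACE('Doe John B', ' ', '.'), 3) AS ThreeParts_Part3, -- 'Doe'
    PARSENAME(REPLACE('Doe Jill',   ' ', '.'), 3) AS TwoParts_Part3,   -- NULL: there is no third part from the right
    PARSENAME(REPLACE('Doe Jill',   ' ', '.'), 2) AS TwoParts_Part2,   -- 'Doe': the last name lands in the FirstName slot
    PARSENAME(REPLACE('Sally Anne Bella Donna Baxter', ' ', '.'), 1) AS FiveParts_Part1; -- NULL: more than four parts
```

If names without a middle part can occur, counting the spaces first and branching on the count is one way to keep the columns aligned.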
Since the number of elements you must extract from your string is fixed (3), you can use an XML-based split:
DECLARE @clients TABLE (PATIENT_NAME nvarchar(max));
INSERT INTO @clients VALUES
(' Doe John B ')
,(' Doe Jane C ')
,(' Doe Jill ')
;WITH Splitted
AS (
SELECT PATIENT_NAME as ORIGINAL_PATIENT_NAME
,REPLACE(REPLACE(REPLACE(ltrim(rtrim(PATIENT_NAME)),' ','<>'),'><',''),'<>',' ') as PATIENT_NAME
,CAST('<x>' + REPLACE(REPLACE(REPLACE(REPLACE(ltrim(rtrim(PATIENT_NAME)),' ','<>'),'><',''),'<>',' '), ' ', '</x><x>') + '</x>' AS XML) AS Parts
FROM @clients
)
SELECT
ORIGINAL_PATIENT_NAME
,PATIENT_NAME
,Parts.value(N'/x[1]', 'nvarchar(max)') AS LAST_NAME
,Parts.value(N'/x[2]', 'nvarchar(max)') AS FIRST_NAME
,Parts.value(N'/x[3]', 'nvarchar(max)') AS MIDDLE_NAME
FROM Splitted
This works even with irregularly spaced names; a missing third part (as in ' Doe Jill ') comes back as NULL.

Splitting a Full Name into First and Last Name

I have a list of customers whose names are given as full names.
I want to create a function that takes the full name as a parameter and returns the first and last name separately. If this is not possible, I can have two separate functions, one that returns the first name and the other that returns the last name. The full names contain a maximum of three words.
What I want is this:
When a full name is composed of two words, the first one should be the first name and the second one should be the last name.
When a full name is composed of three words, the first and middle words should be the first name while the third word should be the last name.
Example:-
**Full Name**
John Paul White
Peter Smith
Ann Marie Brown
Jack Black
Sam Olaf Turner
Result:-
**First Name** | **Last Name**
John Paul | White
Peter | Smith
Ann Marie | Brown
Jack | Black
Sam Olaf | Turner
I have searched and found solutions that do not work as intended, and would like some advice.
Keeping it short and simple:
DECLARE @t TABLE(Fullname varchar(40))
INSERT @t VALUES('John Paul White'),('Peter Smith'),('Thomas')
SELECT
LEFT(Fullname, LEN(Fullname) - CHARINDEX(' ', REVERSE(FullName))) FirstName,
STUFF(RIGHT(FullName, CHARINDEX(' ', REVERSE(FullName))),1,1,'') LastName
FROM
@t
Result:
FirstName | LastName
John Paul | White
Peter | Smith
Thomas | NULL
If you are certain that your names will only ever be two or three words, with single spaces, then we can rely on the base string functions to extract the first and last name components.
SELECT
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col, 1,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) - 1)
ELSE SUBSTRING(col, 1, CHARINDEX(' ', col) - 1)
END AS first,
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) + 1,
LEN(col) - CHARINDEX(' ', col, CHARINDEX(' ', col)))
ELSE SUBSTRING(col,
CHARINDEX(' ', col) + 1,
LEN(col) - CHARINDEX(' ', col))
END AS last
FROM yourTable;
Yuck, but it seems to work. My feeling is that you should fix your data model at some point. A more ideal place to scrub your name data would be outside the database, e.g. in Java. Or, better yet, fix the source of your data such that you record proper first and last names from the very beginning.
Another option (just for fun) is to use a little XML in concert with a CROSS APPLY.
Example:
Select FirstName = ltrim(reverse(concat(Pos2,' ',Pos3,' ',Pos4,' ',Pos5)))
,LastName = reverse(Pos1)
From YourTable A
Cross Apply (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
From (Select Cast('<x>' + replace(reverse(A.[Full Name]),' ','</x><x>')+'</x>' as xml) as xDim) XMLData
) B
Returns
FirstName | LastName
John Paul | White
Peter | Smith
Ann Marie | Brown
Jack | Black
Sam Olaf | Turner
(empty) | Cher
Sally Anne Bella Donna | Baxter
You're trying to do two things at once...I won't solve for you, but here's the direction I'd take:
1) Check this out for string splitting: https://ole.michelsen.dk/blog/split-string-to-table-using-transact-sql.html. This will allow you to parse the name into a temp table and you can perform your logic on it to create names based on your rules
2) Create this as a table-valued function so that you can return a single row of parsed FirstName, LastName from your parameter. That way you can join to it and include in your results
Have you tried using the PARSENAME function?
Another method for splitting a full name into its first and last name is the PARSENAME string function, as in the following script:
DECLARE @FullName VARCHAR(100)
SET @FullName = 'John White Doe'
SELECT CONCAT(PARSENAME(REPLACE(@FullName, ' ', '.'), 3),' ',PARSENAME(REPLACE(@FullName, ' ', '.'), 2)) AS [FirstName],
PARSENAME(REPLACE(@FullName, ' ', '.'), 1) AS [LastName]
Make it a table-valued function.
This is the code you need to create your function; basically you just need to split off the last name:
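IF OBJECT_ID(N'dbo.ufnParseName', N'TF') IS NOT NULL
DROP FUNCTION dbo.ufnParseName;
GO
CREATE FUNCTION dbo.ufnParseName(@FullName VARCHAR(300))
RETURNS @retParseName TABLE
(
-- Columns returned by the function
FirstName nvarchar(150) NULL,
LastName nvarchar(50) NULL
)
AS
-- Splits a full name into first name (all leading words) and last name (final word).
BEGIN
DECLARE
@FirstName nvarchar(250),
@LastName nvarchar(250);
-- Trim first so a trailing space is not mistaken for the last delimiter
SET @FullName = LTRIM(RTRIM(@FullName));
-- Last name: everything after the final space (the prepended ' ' keeps a one-word name from failing)
SELECT @LastName = RIGHT(@FullName, CHARINDEX(' ', REVERSE(' ' + @FullName)) - 1);
-- First name: everything before it (LEFT rather than REPLACE, so a repeated word is not removed twice)
SELECT @FirstName = RTRIM(LEFT(@FullName, LEN(@FullName) - LEN(@LastName)));
INSERT @retParseName
SELECT @FirstName, @LastName;
RETURN;
END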
You can run it as SELECT * FROM dbo.ufnParseName('M J K');
Why a table-valued function?
It removes the duplication from your SQL queries and keeps them DRY.
You can try the below query. It is written as per your requirement and it only handles full_name with 2 or 3 parts in it.
;WITH cte AS(
SELECT full_name, (LEN(full_name) - LEN(REPLACE(full_name, ' ', '')) + 1) AS size FROM #temp
)
SELECT FirstName =
CASE
WHEN size=3 THEN PARSENAME(REPLACE(full_name, ' ', '.'), 3) + ' ' + PARSENAME(REPLACE(full_name, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(full_name, ' ', '.'), 2)
END,
PARSENAME(REPLACE(full_name, ' ', '.'), 1) AS LastName
FROM cte

How to get the middle word using Substring via Charindex of Second Position?

Basically, I want to get the middle word, using the second occurrence of the same character (in this case, the dash "-").
This is the sample input:
declare @word nvarchar(max)
set @word = 'Technical Materials - Conversion - Team Dashboard'
There are three parts in this sentence, and they are separated by a dash ('-').
The first part is 'Technical Materials', which I am able to get using:
SELECT LTRIM(RTRIM(SUBSTRING(@word, 0, CHARINDEX('-', @word, 0))))
The last part is 'Team Dashboard', which I am able to get using:
SELECT CASE WHEN LEN(@word) - LEN(REPLACE(@word, '-', '')) = 1
THEN NULL
ELSE
RIGHT(@word,CHARINDEX('-', REVERSE(@word))-1)
END
The problem is that I am having a hard time getting the middle part, which is 'Conversion' in this example.
If the format is fixed, you can use PARSENAME to get what you want:
DECLARE @Word AS NVARCHAR(MAX) = 'Technical Materials - Conversion - Team Dashboard'
SELECT PARSENAME(REPLACE(@Word, '-', '.'), 2)
If you want to trim the extra spaces, then:
SELECT LTRIM(RTRIM(PARSENAME(REPLACE(@Word, '-', '.'), 2)))
Try this query:
SELECT
SUBSTRING(@word,
CHARINDEX('-', @word) + 2,
CHARINDEX('-', @word, CHARINDEX('-', @word) + 1) -
CHARINDEX('-', @word) - 3)
The general strategy here is to use SUBSTRING(), which requires the starting position and the length of the middle string in question. We can use CHARINDEX to find both the first and second dash in the string. From these, we can compute the start and length of the middle substring we want.
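The same strategy can be spelled out one step at a time, which makes the arithmetic easier to check. This is a minimal sketch using the sample string from the question; the variable names @first and @second are introduced here for illustration, and the CASE guard for strings with fewer than two dashes is an added assumption about the desired behaviour.

```sql
DECLARE @word NVARCHAR(MAX) = N'Technical Materials - Conversion - Team Dashboard';

DECLARE @first  INT = CHARINDEX('-', @word);              -- 21: first dash
DECLARE @second INT = CHARINDEX('-', @word, @first + 1);  -- 34: second dash, searched from one past the first

-- Take everything strictly between the two dashes, then trim the surrounding spaces.
SELECT CASE
         WHEN @first > 0 AND @second > 0
         THEN LTRIM(RTRIM(SUBSTRING(@word, @first + 1, @second - @first - 1)))
       END AS MiddlePart;  -- Conversion
```

Trimming instead of hard-coding the +2 / -3 offsets means the result does not depend on exactly one space sitting on each side of the dash.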
This will find the text between the first two occurrences of '-':
DECLARE @word nvarchar(max)
SET @word = 'Technical Materials - Conversion - Team Dashboard'
SELECT SUBSTRING(x, 0, charindex('-', x))
FROM (values(stuff(@word, 1, charindex('-', @word), ''))) x(x)
This will find the middle element. If there is an even number of elements, it picks the first of the two middle elements:
DECLARE @word nvarchar(max)
SET @word = 'Technical Materials - Conversion - Team Dashboard'
;WITH CTE(txt, rn, cnt) as
(
SELECT
t.c.value('.', 'VARCHAR(2000)'),
row_number() over (order by (select 1)), count(*) over()
FROM (
SELECT x = CAST('<t>' +
REPLACE(@word, ' - ', '</t><t>') + '</t>' AS XML)
) a
CROSS APPLY x.nodes('/t') t(c)
)
SELECT txt
FROM CTE
WHERE (cnt+1) / 2 = rn

Removing one word in a string (or between two white spaces)

I have this:
Dr. LeBron Jordan
John Bon Jovi
I would like this:
Dr. Jordan
John Jovi
How do I go about it? I think it's regexp_replace.
Thanks for looking.
Any help is much appreciated.
Here's a way using regexp_replace as you mentioned, using several forms of a name for testing. More powerful than nested SUBSTR(), INSTR() but you need to get your head around regular expressions, which will allow you way more pattern matching power for more complex patterns once you learn it:
with tbl as (
select 'Dr. LeBron Jordan' data from dual
union
select 'John Bon Jovi' data from dual
union
select 'Yogi Bear' data from dual
union
select 'Madonna' data from dual
union
select 'Mr. Henry Cabot Henhouse' data from dual )
select regexp_replace(data, '^([^ ]*) .* ([^ ]*)$', '\1 \2') corrected_string from tbl;
CORRECTED_STRING
----------------
Dr. Jordan
John Jovi
Madonna
Mr. Henhouse
Yogi Bear
The regex can be read as:
^ At the start of the string (anchor the pattern to the start)
( Start remembered group 1
[^ ]* Zero or more characters that are not a space
) End remembered group 1
space Where followed by a literal space
. Followed by any character
* Followed by any number of the previous any character
space Followed by another literal space
( Start remembered group 2
[^ ]* Zero or more characters that are not a space
) End remembered group 2
$ Where it occurs at the end of the line (anchored to the end)
Then the '\1 \2' means return remembered group 1, followed by a space, followed by remembered group 2.
If the pattern cannot be found, the original string is returned. This can be seen by surrounding the returned groups with square brackets and running again:
...
select regexp_replace(data, '^([^ ]*) .* ([^ ]*)$', '[\1] [\2]')
corrected_string from tbl;
CORRECTED_STRING
[Dr.] [Jordan]
[John] [Jovi]
Madonna
[Mr.] [Henhouse]
Yogi Bear
If there are only two words, it returns them unchanged ("Lebron Jordan" returns "Lebron Jordan").
If there are three words, it takes out the middle word ("Dr. LeBron Jordan" returns "Dr. Jordan").
DECLARE @firstSpace int = 0
DECLARE @secondSpace int = 0
DECLARE @string nvarchar(50) = 'Dr. Lebron Jordan'
SELECT @string = LTRIM(RTRIM(@string))
SELECT @firstSpace = CHARINDEX(' ', @string, 0)
SELECT @secondSpace = CHARINDEX(' ', @string, @firstSpace + 1)
IF @secondSpace = 0
BEGIN
SELECT @string
END
ELSE
BEGIN
SELECT SUBSTRING(@string, 0, @firstSpace) + SUBSTRING(@string, @secondSpace, (LEN(@string) - @secondSpace) + 1)
END
Try the single statement below in SQL Server:
declare @fullname varchar(200)
select @fullname='John Bon Jovi'
select substring(@fullname,1,charindex(' ',@fullname,1)) + substring(@fullname, charindex(' ',@fullname,charindex(' ',@fullname,1)+1)+1, len(@fullname) - charindex(' ',@fullname,charindex(' ',@fullname,1)))
Try the statement below in Oracle:
select substr(name,1,INSTR(name,' ', 1, 1))
|| substr(name, INSTR(name,' ', 1, 2)+1,length(name) - INSTR(name,' ', 1, 2)) from temp
I tried the same example; please refer to the fiddle link: http://sqlfiddle.com/#!4/74986/31

Strip a date from varchar field in SQL

I have a varchar field such as 'Created by blogs 16/10/2014 #123456 more stuff'.
I have stripped out the leading characters, e.g. stuff(Col, 1, patindex('%[0-9]%', Col)-1, ''), and also replaced the #, so that I am left with 16/10/2014 123456.
I now want to remove the date. The date could be positioned before or after the 123456. The date may be in various formats.
My end goal is to be left with 123456.
Try this. It assumes that the number is prefixed with '#' and followed by a space (or is the last thing in the string):
DECLARE @Str VARCHAR(100)
SET @Str = 'Created by blogs 16/10/2014 #123456 more stuff'
SELECT
SUBSTRING(
SUBSTRING(@Str, CHARINDEX('#', @Str) + 1, 999),
1,
CHARINDEX(' ',
SUBSTRING(@Str + ' ',
CHARINDEX('#', @Str) + 1, 999)) - 1)
Replace @Str with your column name. The ' ' appended to @Str is a bugfix for when the number is last, and the final - 1 keeps the trailing space out of the result.
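If the number may be followed by something other than a space (a comma, a bracket, and so on), one variation is to stop at the first non-digit instead. This is a sketch only, still assuming the number directly follows a '#'; the variable name @Tail is introduced here for illustration.

```sql
DECLARE @Str VARCHAR(100) = 'Created by blogs 16/10/2014 #123456 more stuff';

-- Everything after the '#'.
DECLARE @Tail VARCHAR(100) = SUBSTRING(@Str, CHARINDEX('#', @Str) + 1, LEN(@Str));

-- Keep the leading run of digits: find the first non-digit and cut just before it.
-- The appended ' ' guarantees a non-digit exists when the number is the last thing in the string.
SELECT LEFT(@Tail, PATINDEX('%[^0-9]%', @Tail + ' ') - 1) AS NumberAfterHash;  -- 123456
```

Because it keys on the '#' rather than on the date, the position and format of the date do not matter.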
Try this:
DECLARE @YourString VARCHAR(100)
SET @YourString = 'Created by blogs 16/10/2014 #123456 more stuff'
-- Take the text after '#', then keep everything up to the next space
SELECT LEFT(t.Tail, CHARINDEX(' ', t.Tail + ' ') - 1)
FROM (SELECT SUBSTRING(@YourString, CHARINDEX('#', @YourString) + 1, LEN(@YourString)) AS Tail) AS t