How to split string with inconsistent order format in SQL


I want to split the strings in the 'Scorer' column so that the scorer name is retained but not the score type (i.e. to remove the text within the brackets and the brackets to just leave the scorer name in that field).
Scorer
Ellis J.(Conversion Goal)
Ellis J.(Try)
Ellis J.(Conversion Goal)
Trueman J.(Try)
(Conversion Goal)Brough D.
(Try)McGillvary J.
(Try)McGillvary J.
(Penalty Goal)Brough D.
Ellis J.(Conversion Goal)
It should look like the below.
Scorer
Ellis J.
Ellis J.
Ellis J.
Trueman J.
Brough D.
McGillvary J.
McGillvary J.
Brough D.
Ellis J.

The correct solution would be to fix the database structure by adding another column to the table for the score type. In fact, you should probably have a table for score types and add a foreign key to it from this table.
Assuming you can't change the database structure, this is better done at the presentation layer. Any programming language should enable you to do it quite easily. String manipulation is not SQL's strong suit.
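To illustrate the presentation-layer route, here is a minimal sketch in Python. The function name and the sample rows are only assumptions for illustration; any language with regular expressions would do much the same.

```python
import re

# Matches one parenthesised group such as "(Try)" or "(Conversion Goal)".
_BRACKETED = re.compile(r"\([^()]*\)")


def strip_score_type(scorer: str) -> str:
    """Remove the first "(...)" group and trim surrounding whitespace."""
    return _BRACKETED.sub("", scorer, count=1).strip()


rows = ["Ellis J.(Conversion Goal)", "(Try)McGillvary J.", "no brackets at all"]
print([strip_score_type(row) for row in rows])
```

Rows without a complete pair of brackets are returned unchanged (apart from trimming).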
That being said, it can certainly be done using pure T-SQL - with a simple common table expression to get the brackets indexes using charindex, and a case expression with stuff in the select statement.
First, create and populate a sample table (please save us this step in your future questions):
DECLARE @T AS TABLE
(
    Scorer nvarchar(100)
);
INSERT INTO @T (Scorer) VALUES
('Ellis J.(Conversion Goal)'),
('Ellis J.(Try)'),
('Ellis J.(Conversion Goal)'),
('Trueman J.(Try)'),
('(Conversion Goal)Brough D.'),
('(Try)McGillvary J.'),
('(Try)McGillvary J.'),
('(Penalty Goal)Brough D.'),
('Ellis J.(Conversion Goal)'),
-- Note: I've added some edge cases to the sample data:
('a row with (brackets) in the middle'),
('Just an open bracket (forgot to close '),
('Just a close bracket forgot to open)'),
('no brackets at all'),
('brackets ) in reversed order (');
Then, the CTE:
WITH CTE AS
(
    SELECT Scorer,
        CHARINDEX('(', Scorer) As OpenBrackets,
        CHARINDEX(')', Scorer) As CloseBrackets
    FROM @T
)
The select statement:
SELECT CASE WHEN OpenBrackets > 0 AND CloseBrackets > OpenBrackets
THEN
STUFF(Scorer, OpenBrackets, CloseBrackets - OpenBrackets + 1, '')
ELSE
Scorer
END As Scorer
FROM CTE
Results:
Scorer
Ellis J.
Ellis J.
Ellis J.
Trueman J.
Brough D.
McGillvary J.
McGillvary J.
Brough D.
Ellis J.
a row with in the middle
Just an open bracket (forgot to close
Just a close bracket forgot to open)
no brackets at all
brackets ) in reversed order (
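If you want to check the edge cases without a SQL Server instance at hand, the CASE/STUFF logic above can be mirrored roughly like this in Python. This is only a sketch: the function name is made up, and str.find returns -1 for "not found" where CHARINDEX returns 0.

```python
def remove_first_brackets(scorer: str) -> str:
    """Drop the text from the first '(' to the first ')' when they are in order."""
    open_pos = scorer.find("(")    # -1 when absent (CHARINDEX would give 0)
    close_pos = scorer.find(")")
    # Same guard as the CASE expression: both present, and ')' after '('.
    if open_pos >= 0 and close_pos > open_pos:
        return scorer[:open_pos] + scorer[close_pos + 1:]
    return scorer


for sample in ("Ellis J.(Try)", "(Penalty Goal)Brough D.", "brackets ) in reversed order ("):
    print(repr(remove_first_brackets(sample)))
```

Note that removing brackets from the middle of a string leaves the spaces on both sides in place.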

The query below should work for you:
SELECT LTRIM(RTRIM(REPLACE(Scorer, SUBSTRING(Scorer, CHARINDEX('(', Scorer), CHARINDEX(')', Scorer) - CHARINDEX('(', Scorer) + 1), '')))
FROM <TABLENAME>

These two pieces of information (the name and action) should not be in the same column. You should create a separate column for name and for action. And if the position of the action (before or after the name) is important, you might even need an additional column for that.
When you have migrated your data after that - in other words when you have cleaned up - you could still create a view or a computed column to output the scorer the way you do now, for example
ALTER TABLE my_table ADD scorer AS athlete_name + ' (' + action + ')'

You could try:
SELECT Scorer
,CASE WHEN PATINDEX('%(%)%',Scorer) > 1
THEN LEFT(Scorer, PATINDEX('%(%)%',Scorer)-1)
ELSE RIGHT (Scorer, LEN(Scorer) - CHARINDEX(')',Scorer,1) )
END AS ColumnName
FROM ScoreTable
This should work assuming you only expect one instance of the pattern per row, and it works whether the "()" data is at the front or the back of the value.

You can use this query
with t(str) as
(
select 'Ellis J.(Conversion Goal)' union all
select '(Conversion Goal)Brough D.' union all
select ' (Try)McGillvary J.'
)
select (case when charindex('(', ltrim(str)) = 1 then
substring(str,charindex(')', str)+1,len(str))
else
left(str, charindex('(', str) - 1)
end) as "Scorers"
from t
Scorers
--------------
Ellis J.
Brough D.
McGillvary J.
This combines the substring, charindex and left functions. ltrim guards against spaces before the ( character at the beginning of the string.

Related

Pulling a section of a string between two characters in SQL, and the section of the string around the extracted section

I have a table that includes names and allows for a "nickname" for each name in parentheses.
PersonName
John (Johnny) Hendricks
Zekeraya (Zeke) Smith
Ajamain Sterling (Aljo)
Beth ()) Jackson
I need to extract the nickname, and return a column of nicknames and a column of full names (the full string without the nickname portion in parentheses). I also need the nickname to be null if no nickname exists, and the nickname should only contain letters. So far I have been able to figure out how to get the nickname out using SUBSTRING, but I can't figure out how to create a separate column for just the name.
Select SUBSTRING(PersonName, CHARINDEX('(', PersonName) +1,(((LEN(PersonName))-CHARINDEX(')',REVERSE(PersonName)))-CHARINDEX('(',PersonName)))
as NickName
from dbo.Person
Any help would be appreciated. I'm using MS SQL Server 2019. I'm pretty new at this, as you can tell.

Using your existing substring, one simple way is to use apply.
Assuming your last row is an example of a nickname that should be NULL, you can use an inline if to check its length - presumably a nickname must be longer than 1 character? Adjust this logic as required.
select PersonName, Iif(Len(nn)<2,null,nn) NickName, Trim(Replace(Replace(personName, Concat('(',nn,')') ,''),'  ',' ')) FullName
from Person
cross apply (values(SUBSTRING(PersonName, CHARINDEX('(', PersonName) +1,(((LEN(PersonName))-CHARINDEX(')',REVERSE(PersonName)))-CHARINDEX('(',PersonName))) ))c(nn)

The following code deals correctly with missing parentheses or empty strings.
Note how the first CROSS APPLY feeds into the next:
SELECT
PersonName,
NickName = NULLIF(NickName, ''),
FullName = ISNULL(REPLACE(personName, ' (' + NickName + ')', ''), PersonName)
FROM t
CROSS APPLY (VALUES(
NULLIF(CHARINDEX('(', PersonName), 0))
) v1(opening)
CROSS APPLY (VALUES(
SUBSTRING(
PersonName,
v1.opening + 1,
NULLIF(CHARINDEX(')', PersonName, v1.opening), 0) - v1.opening - 1
)
)) v2(NickName);
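For comparison, the same split can be sketched outside the database. The Python below is a rough illustration only: the function name is made up, and it assumes the nickname runs from the first "(" to the last ")", as the REVERSE-based SUBSTRING in the question does.

```python
from typing import Optional, Tuple


def split_nickname(person_name: str) -> Tuple[Optional[str], str]:
    """Return (nickname or None, full name without the bracketed part)."""
    open_pos = person_name.find("(")
    close_pos = person_name.rfind(")")
    if open_pos < 0 or close_pos < open_pos:
        # No usable pair of parentheses: no nickname, name unchanged.
        return None, person_name.strip()
    nickname = person_name[open_pos + 1:close_pos]
    remainder = person_name[:open_pos] + person_name[close_pos + 1:]
    full_name = " ".join(remainder.split())  # collapse doubled spaces
    # Only letters count as a nickname; anything else becomes None.
    return (nickname if nickname.isalpha() else None), full_name


for name in ("John (Johnny) Hendricks", "Ajamain Sterling (Aljo)", "Beth ()) Jackson"):
    print(split_nickname(name))
```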

Adding comma after first word in column

I have a table called people with a column called NAME that holds last, first, and middle (if it exists) names, e.g. SMITH JOHN J. I'd like to add a comma after the first word so it updates to SMITH, JOHN J.
I tried running this but it blew up:
update people
set name = (CHARINDEX(' ', 0), 0, ',')
I know I'm close but it's eluding me :(

You can use STUFF() along with CHARINDEX() and LEFT() for this:
update people
set name = STUFF(name,1,CHARINDEX(' ',name )-1,LEFT(name ,CHARINDEX(' ',name )-1)+',')
WHERE CHARINDEX(' ',name) > 0
The WHERE clause ensures there is a space in the name so the statement doesn't error; a CASE expression would also work.
You could also use REPLACE() with CHARINDEX() and LEFT():
REPLACE(name,LEFT(name,CHARINDEX(' ',name)-1),LEFT(name,CHARINDEX(' ',name)-1)+',')
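One thing to watch with the REPLACE() variant: REPLACE substitutes every occurrence of the search string, so a name whose first word appears again later would get more than one comma. The intended behaviour (a comma after the first word only) can be sketched in Python as follows; the function name is just for illustration.

```python
def add_comma_after_first_word(name: str) -> str:
    """Insert a comma after the first space-delimited word, if there is one."""
    first, separator, rest = name.partition(" ")
    if not separator:
        # No space at all: leave single-word names untouched.
        return name
    return f"{first},{separator}{rest}"


print(add_comma_after_first_word("SMITH JOHN J"))
```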

Unexpected execution in an update query in SQL

I am getting an 'Unexpected' result with an update query in SQL Server 2012.
This is what I am trying to do.
From a column (IDENTIFIER) composed of an ID, a comma, and a name (e.g. 258967,Sarah Jones), I have to fill two other columns: ID and SELLER_NAME.
The original column has some values with a blank at the end and the rest without it:
'258967,Sarah Jones'
'98745,Richard James '
This is the update query that I am executing:
UPDATE SELLER
SET
IDENTIFIER = LTRIM(RTRIM(IDENTIFIER)),
ID = Left(IDENTIFIER , charindex(',', IDENTIFIER )-1),
SELLER_NAME = UPPER(RIGHT((IDENTIFIER ),LEN(IDENTIFIER )-CHARINDEX(',',IDENTIFIER )));
But I get a wrong result at the end:
IDENTIFIER              ID      SELLER_NAME
258967,Sarah Jones      258967  SARAH JONES
98745,Richard James     98745   ICHARD JAMES
The same happens with all the names that have a blank at the end. At this point I wonder: if I have specified, as the first action, that I want to eliminate all the blanks at the beginning and at the end of IDENTIFIER, why does the system update ID and SELLER_NAME before doing this?
Just to specify: the IDENTIFIER column is part of the SELLER table, which is updated by another person who imports the data from an Excel file. I receive these values and have to normalize the information. I can only read the SELLER table; please take this into account before answering.

Try this. Because there is a space on the right side of the name, RIGHT() ends up cutting one character off the start of the name. You just need RTRIM(IDENTIFIER) inside RIGHT() and that's it:
SELLER_NAME = UPPER(RIGHT((RTRIM(IDENTIFIER)),LEN(IDENTIFIER )-CHARINDEX(',',IDENTIFIER)));
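As for why the original statement misbehaves: every expression in the SET list of a single UPDATE is evaluated against the row as it was before the statement ran, so ID and SELLER_NAME still see the untrimmed IDENTIFIER. On top of that, LEN() in SQL Server does not count trailing spaces, while RIGHT() does. The Python below is only a simulation of that arithmetic; sql_len, sql_right and seller_name are hypothetical helpers, not real APIs.

```python
def sql_len(value: str) -> int:
    """Like SQL Server LEN(): trailing spaces are not counted."""
    return len(value.rstrip(" "))


def sql_right(value: str, count: int) -> str:
    """Like RIGHT(): the last `count` characters (whole string if shorter)."""
    return value[-count:] if count > 0 else ""


def seller_name(identifier: str, trim_first: bool) -> str:
    # The character count always comes from the untrimmed, pre-update value.
    count = sql_len(identifier) - (identifier.index(",") + 1)
    source = identifier.rstrip(" ") if trim_first else identifier
    return sql_right(source, count).upper()


print(repr(seller_name("98745,Richard James ", trim_first=False)))
print(repr(seller_name("98745,Richard James ", trim_first=True)))
```

Without the trim, the count is one short relative to the padded string, so the window slides one character to the right and the leading "R" is lost.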

The design of your tables violates 1NF and is nothing but painful. Instead of doing all this crazy string manipulation you could leverage PARSENAME here quite easily.
with Something(SomeValue) as
(
select '258967,Sarah Jones' union all
select '98745,Richard James '
)
select *
, ltrim(rtrim(PARSENAME(replace(SomeValue, ',', '.'), 2)))
, ltrim(rtrim(PARSENAME(replace(SomeValue, ',', '.'), 1)))
from Something

Instead of using Right(), use SubString().
Here's an example. I've tried to show each step individually to illustrate
; WITH x (identifier) AS (
SELECT '258967,Sarah Jones'
UNION ALL
SELECT '98745,Richard James '
)
SELECT identifier
, CharIndex(',', identifier) As comma
, SubString(identifier, CharIndex(',', identifier) + 1, 1000) As name_only
, LTrim(RTrim(SubString(identifier, CharIndex(',', identifier) + 1, 1000))) As trimmed_name_only
FROM x
Note that the 1000 used should be at least the maximum length of the column definition, e.g. if your IDENTIFIER column is a varchar(2000) then use 2000 instead.

Try trimming IDENTIFIER first, like this:
SELLER_NAME = UPPER(RIGHT(RTRIM(IDENTIFIER), LEN(IDENTIFIER) - CHARINDEX(',', IDENTIFIER)));

TSQL How to get the 2nd number from a string

We have the row below in MS SQL Server:
Got event with: 123.123.123.123, event 34, brown fox
How can we reliably extract the 2nd number, i.e. the 34, in one line of SQL?

Here's one way to do it using SUBSTRING and PATINDEX -- I used a CTE just so it wouldn't look so awful :)
WITH CTE AS (
SELECT
SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data)) data
FROM Test
)
SELECT LEFT(SUBSTRING(Data, PATINDEX('%[0-9]%', Data), 8000),
PATINDEX('%[^0-9]%',
SUBSTRING(Data, PATINDEX('%[0-9]%', Data), 8000) + 'X')-1)
FROM CTE
As commented, CTEs will only work with 2005 and higher. If by chance you're using 2000, then this will work without the CTE:
SELECT LEFT(SUBSTRING(SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data)),
PATINDEX('%[0-9]%', SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data))), 8000),
PATINDEX('%[^0-9]%',
SUBSTRING(SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data)),
PATINDEX('%[0-9]%', SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data))), 8000) + 'X')-1)
FROM Test

Simply replace @s with your column name to apply this to a table. This assumes the number sits between the last comma and the space before it.
declare @s varchar(100) = '123.123.123.123, event 34, brown fox'
select right(first, charindex(' ', reverse(first),1) ) final
from (
select left(@s,len(@s) - charindex(',',reverse(@s),1)) first
--from tableName
) X
Or, if the number is between the first and second commas, try:
select substring(first, charindex(' ',first,1),
charindex(',', first,1)-charindex(' ',first,1)) final
from (
select right(@s,len(@s) - charindex(',',@s,1)-1) first
) X

I've thought of another way that's not been mentioned yet. Presuming the following are true:
Always one comma before the second "part"
It's always the word "event" with the number in the second part
You are using SQL Server 2005+
Then you could use the built in ParseName function meant for parsing the SysName datatype.
--Variable to hold your example
DECLARE @test NVARCHAR(100) -- wide enough for the 52-character sample
SET @test = 'Got event with: 123.123.123.123, event 34, brown fox'
SELECT Ltrim(Rtrim(Replace(Parsename(Replace(Replace(@test, '.', ''), ',', '.'), 2), 'event', '')))
Results:
34
ParseName parses around dots, but we want it to parse around commas. Here's the logic of what I've done:
Remove all existing dots in the string, in this case swap them with empty string.
Swap all commas for dots for ParseName to use
Use ParseName and ask for the second "piece". In your example this gives us the value
" event 34".
Remove the word "event" from the string.
Trim both ends and return the value.
I've no comments on performance vs. the other solutions, and it looks just as messy. Thought I'd throw the idea out there anyway!
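To make the steps above easier to follow, here is the same pipeline sketched in Python. The parsename helper is only a rough stand-in I wrote for illustration: it splits on dots, counts pieces from the right, and (to my understanding of PARSENAME) gives up when there are more than four pieces.

```python
from typing import Optional


def parsename(value: str, piece: int) -> Optional[str]:
    """Rough stand-in for PARSENAME: dot-separated pieces, counted from the right."""
    parts = value.split(".")
    if piece < 1 or piece > len(parts) or len(parts) > 4:
        return None
    return parts[-piece]


def second_number(text: str) -> Optional[str]:
    # 1. Remove existing dots. 2. Swap commas for dots.
    dotted = text.replace(".", "").replace(",", ".")
    # 3. Ask for the second piece from the right, e.g. " event 34".
    piece = parsename(dotted, 2)
    if piece is None:
        return None
    # 4. Remove the word "event". 5. Trim both ends.
    return piece.replace("event", "").strip()


print(second_number("Got event with: 123.123.123.123, event 34, brown fox"))
```

The four-piece limit means this trick only suits strings with at most three commas.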

Postgres: order data by part of string

I have a column name that represents a person's name in the following format:
firstname [middlename] lastname [, Sr.|Jr.]
For example:
John Smith
John J. Smith
John J. Smith, Sr.
How can I order items by lastname?

A correct and faster version could look like this:
SELECT *
FROM tbl
ORDER BY substring(name, '([^[:space:]]+)(?:,|$)')
Or:
ORDER BY substring(name, E'([^\\s]+)(?:,|$)')
Or even:
ORDER BY substring(name, E'([^\\s]+)(,|$)')
Explain
[^[:space:]]+ .. first (and longest) string consisting of one or more non-whitespace characters.
(,|$) .. terminated by a comma or the end of the string.
The last two examples use escape-string syntax and the class-shorthand \s instead of the long form [[:space:]] (which loses the outer level of brackets when inside a character class).
We don't actually have to use non-capturing parentheses (?:) after the part we want to extract, because (quoting the manual):
.. if the pattern contains any parentheses, the portion of the text that
matched the first parenthesized subexpression (the one whose left
parenthesis comes first) is returned.
Test
SELECT substring(name, '([^[:space:]]+)(?:,|$)')
FROM (VALUES
('John Smith')
,('John J. Smith')
,('John J. Smith, Sr.')
,('foo bar Smith, Jr.')
) x(name)
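If you want to experiment with the pattern outside Postgres, a rough equivalent in Python looks like this. It is only a sketch: Python's re engine is not identical to Postgres regular expressions, and the function and sample names are made up.

```python
import re

# One or more non-whitespace characters, followed by a comma or end of string.
_LAST_NAME = re.compile(r"([^\s]+)(?:,|$)")


def last_name(name: str) -> str:
    """Return the first capture group of the leftmost match, or '' if none."""
    match = _LAST_NAME.search(name)
    return match.group(1) if match else ""


names = ["John Smith", "Ann Brown, Jr.", "Bob Adams"]
print(sorted(names, key=last_name))
```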

SELECT *
FROM t
ORDER BY substring(name, E'^.*\\s([^\\s]+)(?=,|$)') ASC
While this should provide the sorting you are looking for, it would be a lot cheaper to store the name in multiple columns and index them based on which parts of the name you need to sort by.

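You should use a functional index for this purpose:
http://www.postgresql.org/docs/7.3/static/indexes-functional.html
In your case, something like the following (note that this only picks out the last name when it is the third space-separated word, so it needs adjusting for names without a middle name):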
CREATE INDEX test1_lastname_col1_idx ON test1 (split_part(col1, ' ', 3));
SELECT * FROM test1 ORDER BY split_part(col1, ' ', 3);
