T-SQL splitting column on a delimiter [duplicate] - sql

I have a column with three groups of data delimited by forward slashes, like so:
AB/1234/10
The column is always formatted the same in every row, with 2 characters, a slash, some number of characters, a slash, and then 2 more characters. I need to split this one column into three. So the above example becomes
Column1 Column2 Column3
AB 1234 10
I'm not quite sure how to go about this. I've been using SELECT SUBSTRING but that isn't quite giving me what I need.
select SUBSTRING(MyColumn, 1, CHARINDEX('/', MyColumn, 1)-1)
FROM MyTable
Will return AB, and that's great. But I can't wrap my mind around how to grab the middle and the end sections. I thought that
select SUBSTRING(MyColumn, 4, CHARINDEX('/', MyColumn, 4))
FROM MyTable
Would work in grabbing the middle, but it returns 1234/10
I hope my question is clear and I would appreciate any advice pointing me in the right direction, Thank you.

You can work with fixed offsets, since you defined that the string always starts with two, and ends with two characters.
Here is a full working example:
DECLARE @tmp TABLE (
Merged nvarchar(max)
)
INSERT INTO @tmp SELECT N'AB/1234/10'
INSERT INTO @tmp SELECT N'AB/ANYNUMBEROF-CHARACTERS/10'
SELECT
LEFT(Merged,2) AS Column1,
SUBSTRING(Merged,4,LEN(Merged)-6) AS Column2,
RIGHT(Merged,2) AS Column3
FROM @tmp
We take the length of the string minus a constant (6 = two chars left, two chars right, two slashes) as the length of the variable-length part in the middle.
Result:
Column1 Column2 Column3
AB 1234 10
AB ANYNUMBEROF-CHARACTERS 10

One approach is to use PARSENAME:
SELECT PARSENAME(REPLACE('AB/1234/10','/','.'), 3) Col1,
PARSENAME(REPLACE('AB/1234/10','/','.'), 2) Col2,
PARSENAME(REPLACE('AB/1234/10','/','.'), 1) Col3
This will replace the / with ., and pull out each section of the string with PARSENAME.
The benefit is that it works with any number of characters in any position. The limitations are that PARSENAME handles at most 4 parts (you are using 3 here) and that it gives wrong results if the string already contains periods.
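To make those two limitations concrete, here is a small sketch with made-up literals (the five-part and embedded-period strings are illustrations only, not data from the question):

```sql
SELECT
    -- three parts: behaves as described above
    PARSENAME(REPLACE('AB/1234/10', '/', '.'), 2)  AS three_parts,   -- '1234'
    -- five parts: more than PARSENAME can handle, so the result is NULL
    PARSENAME(REPLACE('A/B/C/D/E', '/', '.'), 1)   AS five_parts,
    -- a period inside the data becomes an extra separator: 'AB.12.34.10'
    PARSENAME(REPLACE('AB/12.34/10', '/', '.'), 2) AS embedded_dot;  -- '34', not '12.34'
```

If the data can contain periods, one of the CHARINDEX-based answers is probably the safer choice.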

Try this. This should work with any number of characters.
DECLARE @str VARCHAR(100) = 'AB/1234/10'
SELECT LEFT(@str, CHARINDEX('/', @str) - 1) AS Column1
, SUBSTRING(@str, CHARINDEX('/', @str) + 1, CHARINDEX('/', SUBSTRING(@str, CHARINDEX('/', @str) + 1, LEN(@str))) - 1) AS Column2
, RIGHT(@str, CHARINDEX('/', REVERSE(@str)) - 1) AS Column3
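On the attempt in the question: the third argument of SUBSTRING is a length, not an end position, so passing CHARINDEX('/', MyColumn, 4) (which is 8 for the sample value) asks for up to 8 characters from position 4 and returns 1234/10. A minimal sketch of the same idea with the length computed from both slash positions follows; the table variable @t and the aliases s1 and s2 are hypothetical names used only for illustration, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data standing in for MyTable
DECLARE @t TABLE (MyColumn varchar(100));
INSERT INTO @t VALUES ('AB/1234/10'), ('XY/ANYNUMBEROF-CHARACTERS/99');

SELECT
    LEFT(t.MyColumn, p.s1 - 1)                       AS Column1,
    -- length = distance between the two slashes, minus the slash itself
    SUBSTRING(t.MyColumn, p.s1 + 1, p.s2 - p.s1 - 1) AS Column2,
    SUBSTRING(t.MyColumn, p.s2 + 1, LEN(t.MyColumn)) AS Column3
FROM @t AS t
CROSS APPLY (
    SELECT s1 = CHARINDEX('/', t.MyColumn),
           -- start searching one character after the first slash
           s2 = CHARINDEX('/', t.MyColumn, CHARINDEX('/', t.MyColumn) + 1)
) AS p;
```

For AB/1234/10 the slash positions are 3 and 8, so the three columns come out as AB, 1234 and 10.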

Related

SQL Server select part of string

I have table like this
Column1
-------------------------------------------------------
nn=APPLE IPod 44454,o=0006,o=TOP8,nn=USA,nn=TOP8+
nn=SAMSUNG G5 487894,o=04786,o=T418,nn=JPN,nn=TO478+
And I need update that table and get result like this
Column1 Column2
---------------------------------------------------------------
nn=APPLE IPod 44454,o=0006,o=TOP8,nn=USA,nn=TOP8+ 44454
nn=SAMSUNG G5 487894,o=04786,o=T418,nn=JPN,nn=TO478+ 487894
My idea is the following, but I can't get it to start at the right character:
update tablename
set [column2] = SUBSTRING(column1, 1, CHARINDEX(' ', column1) - 1 (column1, CHARINDEX(' ', column1) + 1, LEN(column1))
This query can help you.
UPDATE tablename SET [column2] =
SUBSTRING((LEFT(column1,CHARINDEX(',',column1)-1)), CHARINDEX(' ', column1, CHARINDEX(' ',column1) +1 ) +1 ,LEN(column1))
Okay, given the second sample record, it looks like what you need is the last element of the space-delimited string in the first position of the comma-delimited string. So write yourself (or find) a string-splitter function that accepts a string and a delimiter, and then your parsing logic is:
split the field at the commas
take the first element
split that element at the spaces
take the last element
Does that make sense?
The following answer is based only on the two records you actually showed us. If we were to derive a rule from this, it might be that we want to extract the last word (a product number) occurring immediately before the first comma in the string. If so, then we can try the following logic:
isolate the substring before the comma (e.g. nn=APPLE IPod 44454)
reverse that string (45444 dopI ELPPA=nn)
then take the substring of that before the first space (45444)
finally reverse this substring to yield the product number we want (44454)
Consider the following query, with data imported via a CTE:
WITH yourTable AS (
-- column2 is a NULL placeholder here; in the real table it is the column to be filled
SELECT 'nn=APPLE IPod 44454,o=0006,o=TOP8,nn=USA,nn=TOP8+' AS column1, CAST(NULL AS varchar(20)) AS column2 UNION ALL
SELECT 'nn=SAMSUNG G5 487894,o=04786,o=T418,nn=JPN,nn=TO478+', NULL
),
update_cte AS (
SELECT
column1,
column2,
REVERSE(
SUBSTRING(REVERSE(SUBSTRING(column1, 1, CHARINDEX(',', column1)-1)),
1,
CHARINDEX(' ', REVERSE(SUBSTRING(column1, 1, CHARINDEX(',', column1)-1))) - 1)
) AS col2
FROM yourTable
)
SELECT column1, col2 FROM update_cte;
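Output:
column1 col2
nn=APPLE IPod 44454,o=0006,o=TOP8,nn=USA,nn=TOP8+ 44454
nn=SAMSUNG G5 487894,o=04786,o=T418,nn=JPN,nn=TO478+ 487894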
If you want to update your table with these column2 product IDs, define update_cte over your actual table (rather than over inline sample rows, which cannot be updated) and follow it with:
UPDATE update_cte
SET column2 = col2;

SQL Server query to remove the last word from a string

There's already an answer for this question in SO with a MySQL tag. So I just decided to make your lives easier and put the answer below for SQL Server users. Always happy to see different answers perhaps with a better performance.
Happy coding!
SELECT SUBSTRING(@YourString, 1, LEN(@YourString) - CHARINDEX(' ', REVERSE(@YourString)))
Edit: Make sure @YourString is trimmed first, as Alex M has pointed out:
SET @YourString = LTRIM(RTRIM(@YourString))
Just an addition to answers.
The doc for LEN function in MSSQL:
LEN excludes trailing blanks. If that is a problem, consider using the DATALENGTH (Transact-SQL) function which does not trim the string. If processing a unicode string, DATALENGTH will return twice the number of characters.
The problem with the answers here is that trailing spaces are not accounted for.
SELECT SUBSTRING(@YourString, 1, LEN(@YourString) - CHARINDEX(' ', REVERSE(@YourString)))
As an example, here are a few inputs for which the accepted answer (repeated above for reference) gives wrong results:
INPUT -> RESULT
'abcd ' -> 'abc' --last character removed
'abcd 123 ' -> 'abcd 12' --only the last character removed
To account for these cases, trim the string first (this then removes the last word from a phrase of two or more words):
SELECT SUBSTRING(RTRIM(@YourString), 1, LEN(@YourString) - CHARINDEX(' ', REVERSE(RTRIM(LTRIM(@YourString)))))
The reverse is trimmed on both sides, that is to account for the leading as well as trailing spaces.
Or alternatively, just trim the input itself.
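A small sketch of the trailing-blank behaviour described above, using one of the problem inputs. The variable name is reused from the answers; the column aliases are made up for illustration.

```sql
DECLARE @YourString varchar(100) = 'abcd 123 ';

-- LEN ignores the trailing blank, DATALENGTH counts it (varchar: 1 byte per character)
SELECT LEN(@YourString)        AS len_result,         -- 8
       DATALENGTH(@YourString) AS datalength_result;  -- 9

-- Trim the input itself, then the original expression removes the last word
SET @YourString = LTRIM(RTRIM(@YourString));
SELECT SUBSTRING(@YourString, 1,
                 LEN(@YourString) - CHARINDEX(' ', REVERSE(@YourString))) AS without_last_word;  -- 'abcd'
```

Note that for a single-word input CHARINDEX returns 0, so the expression returns the whole string unchanged.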
DECLARE @Sentence VARCHAR(MAX) = 'Hi This is Pavan Kumar'
SELECT SUBSTRING(@Sentence, 1, CHARINDEX(' ', @Sentence) - 1) AS [First Word],
REVERSE(SUBSTRING(REVERSE(@Sentence), 1,
CHARINDEX(' ', REVERSE(@Sentence)) - 1)) AS [Last Word]
DECLARE @String VARCHAR(MAX) = 'One two three four'
SELECT LEFT(@String,LEN(@String)-CHARINDEX(' ', REVERSE(@String),0)+1)
All the answers so far are actually about removing a character, not a word as the OP wanted.
In my case I was building a dynamic SQL statement with UNION'd SELECT statements and wanted to remove the last UNION:
DECLARE @sql NVARCHAR(MAX) = ''
/* populate @sql with something like this:
SELECT 1 FROM dbo.T1 WHERE condition
UNION
SELECT 1 FROM dbo.T2 WHERE condition
UNION
SELECT 1 FROM dbo.T3 WHERE condition
UNION
SELECT 1 FROM dbo.T4 WHERE condition
UNION
*/
-- remove the last UNION
SET @sql = SUBSTRING(@sql, 1, LEN(@sql) - PATINDEX(REVERSE('%UNION%'), REVERSE(@sql)) - LEN('UNION'))
SELECT LEFT(username , LEN(username) - CHARINDEX('/', REVERSE(username ))+1)
FROM Login_tbl
UPDATE Login_tbl
SET username = LEFT(username , LEN(username) - CHARINDEX('/', REVERSE(username ))+1)

TSQL How to get the 2nd number from a string

We have the row below in MS SQL:
Got event with: 123.123.123.123, event 34, brown fox
How can we reliably extract the 2nd number, i.e. the 34, in one line of SQL?
Here's one way to do it using SUBSTRING and PATINDEX -- I used a CTE just so it wouldn't look so awful :)
WITH CTE AS (
SELECT
SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data)) data
FROM Test
)
SELECT LEFT(SUBSTRING(Data, PATINDEX('%[0-9]%', Data), 8000),
PATINDEX('%[^0-9]%',
SUBSTRING(Data, PATINDEX('%[0-9]%', Data), 8000) + 'X')-1)
FROM CTE
As commented, CTEs only work with SQL Server 2005 and higher. If by chance you're using SQL Server 2000, then this will work without the CTE:
SELECT LEFT(SUBSTRING(SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data)),
PATINDEX('%[0-9]%', SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data))), 8000),
PATINDEX('%[^0-9]%',
SUBSTRING(SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data)),
PATINDEX('%[0-9]%', SUBSTRING(Data,CHARINDEX(',',Data)+1,LEN(Data))), 8000) + 'X')-1)
FROM Test
Simply replace @s with your column name to apply this to a table. This assumes the number sits between the last comma and the space immediately before it.
declare @s varchar(100) = '123.123.123.123, event 34, brown fox'
select right(first, charindex(' ', reverse(first),1) ) final
from (
select left(@s,len(@s) - charindex(',',reverse(@s),1)) first
--from tableName
) X
Or, if it is between the first and second commas, try:
select substring(first, charindex(' ',first,1),
charindex(',', first,1)-charindex(' ',first,1)) final
from (
select right(@s,len(@s) - charindex(',',@s,1)-1) first
) X
I've thought of another way that's not been mentioned yet. Presuming the following are true:
Always one comma before the second "part"
It's always the word "event" with the number in the second part
You are using SQL Server 2005+
Then you could use the built in ParseName function meant for parsing the SysName datatype.
--Variable to hold your example
DECLARE @test NVARCHAR(100) -- wide enough to hold the 52-character example without truncation
SET @test = 'Got event with: 123.123.123.123, event 34, brown fox'
SELECT Ltrim(Rtrim(Replace(Parsename(Replace(Replace(@test, '.', ''), ',', '.'), 2), 'event', '')))
Results:
34
ParseName parses around dots, but we want it to parse around commas. Here's the logic of what I've done:
Remove all existing dots in the string, in this case swap them with empty string.
Swap all commas for dots for ParseName to use
Use ParseName and ask for the second "piece". In your example this gives us the value
" event 34".
Remove the word "event" from the string.
Trim both ends and return the value.
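The steps above can be made visible one at a time. This is only a sketch: the aliases s1 to s4 and the column names are invented for illustration, and the DECLARE with an initializer assumes SQL Server 2008 or later.

```sql
DECLARE @test nvarchar(100) = N'Got event with: 123.123.123.123, event 34, brown fox';

SELECT
    s1.v               AS step1_dots_removed,
    s2.v               AS step2_commas_to_dots,
    s3.v               AS step3_second_piece,   -- ' event 34'
    s4.v               AS step4_event_removed,
    LTRIM(RTRIM(s4.v)) AS step5_trimmed         -- '34'
FROM (SELECT v = REPLACE(@test, '.', ''))          AS s1
CROSS APPLY (SELECT v = REPLACE(s1.v, ',', '.'))   AS s2
CROSS APPLY (SELECT v = PARSENAME(s2.v, 2))        AS s3
CROSS APPLY (SELECT v = REPLACE(s3.v, 'event', '')) AS s4;
```

Each CROSS APPLY feeds the previous step's value forward, so the nesting of the one-line version is unrolled without changing what it computes.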
I've no comments on performance vs. the other solutions, and it looks just as messy. Thought I'd throw the idea out there anyway!

String parsing in SQL Server

I have string column email_id; the data will look like this:
email_id
"1"
"6"
"3 4"
"8"
"0 3"
"0 5 7"
I want to get the list of ids as integers. If I have two or more numbers in my string, I want the last one. My result should look like this:
SELECT some_function (email_id ) FROM table
1
6
4
8
3
7
Is it possible to do this in SQL Server?
SELECT
CAST(RIGHT(email_id, LEN(email_id) - CHARINDEX(' ', email_id)) AS INT)
FROM
yourTable
This works if and only if all your values can reliably be cast to an INT and there is never more than one space.
EDIT: To deal with a list of n values
This isn't pretty, but it avoids recursion and/or loops. If someone gives an answer without REVERSE(), test to see whether it's faster than this.
SELECT
CAST(
REVERSE(
LEFT(
REVERSE(email_id),
CHARINDEX(' ', REVERSE(email_id) + ' ') - 1
)
)
AS INT
)
FROM
yourTable
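As a quick check, the REVERSE-based expression can be run against the sample values from the question. This sketch assumes the double quotes shown in the question are display only and not stored in the column; the table variable @emails and the alias last_id are hypothetical.

```sql
-- Hypothetical stand-in for the real table
DECLARE @emails TABLE (email_id varchar(20));
INSERT INTO @emails VALUES ('1'), ('6'), ('3 4'), ('8'), ('0 3'), ('0 5 7');

SELECT
    email_id,
    CAST(
        REVERSE(
            LEFT(
                REVERSE(email_id),
                -- appending ' ' guarantees a match even for single values
                CHARINDEX(' ', REVERSE(email_id) + ' ') - 1
            )
        ) AS int
    ) AS last_id
FROM @emails;
```

For '0 5 7' the reversed string is '7 5 0', the first space is at position 2, and the result is 7; for a single value such as '8' the appended space makes CHARINDEX return 2, so the whole value is kept.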
SELECT CAST(replace(your_column ,' ','') as int) FROM table

Uppercase first two characters in a column in a db table

I've got a column in a database table (SQL Server 2005) that contains data like this:
TQ7394
SZ910284
T r1534
su8472
I would like to update this column so that the first two characters are uppercase. I would also like to remove any spaces between the first two characters. So T q1234 would become TQ1234.
The solution should be able to cope with multiple spaces between the first two characters.
Is this possible in T-SQL? How about in ANSI-92? I'm always interested in seeing how this is done in other db's too, so feel free to post answers for PostgreSQL, MySQL, et al.
Here is a solution:
EDIT: Updated to support replacement of multiple spaces between the first and the second non-space characters
/* TEST TABLE */
DECLARE @T AS TABLE(code Varchar(20))
INSERT INTO @T SELECT 'ab1234x1' UNION SELECT ' ab1234x2'
UNION SELECT ' ab1234x3' UNION SELECT 'a b1234x4'
UNION SELECT 'a b1234x5' UNION SELECT 'a b1234x6'
UNION SELECT 'ab 1234x7' UNION SELECT 'ab 1234x8'
SELECT * FROM @T
/* INPUT
code
--------------------
ab1234x3
ab1234x2
a b1234x6
a b1234x5
a b1234x4
ab 1234x8
ab 1234x7
ab1234x1
*/
/* START PROCESSING SECTION */
DECLARE @s Varchar(20)
DECLARE @firstChar INT
DECLARE @secondChar INT
UPDATE @T SET
@firstChar = PATINDEX('%[^ ]%',code)
,@secondChar = @firstChar + PATINDEX('%[^ ]%', STUFF(code,1, @firstChar,'' ) )
,@s = STUFF(
code,
1,
@secondChar,
REPLACE(LEFT(code,
@secondChar
),' ','')
)
,@s = STUFF(
@s,
1,
2,
UPPER(LEFT(@s,2))
)
,code = @s
/* END PROCESSING SECTION */
SELECT * FROM @T
/* OUTPUT
code
--------------------
AB1234x3
AB1234x2
AB1234x6
AB1234x5
AB1234x4
AB 1234x8
AB 1234x7
AB1234x1
*/
UPDATE YourTable
SET YourColumn = UPPER(
SUBSTRING(
REPLACE(YourColumn, ' ', ''), 1, 2
)
)
+
SUBSTRING(YourColumn, 3, LEN(YourColumn))
UPPER isn't going to hurt any numbers, so if the examples you gave are completely representative, there's not really any harm in doing:
UPDATE tbl
SET col = REPLACE(UPPER(col), ' ', '')
The sample data only has spaces and lowercase letters at the start. If this holds true for the real data then simply:
UPPER(REPLACE(YourColumn, ' ', ''))
For a more specific answer I'd politely ask you to expand on your spec, otherwise I'd have to code around all the other possibilities (e.g. values of less than three characters) without knowing if I was overengineering my solution to handle data that wouldn't actually arise in reality :)
As ever, once you've fixed the data, put in a database constraint to ensure the bad data does not reoccur e.g.
ALTER TABLE YourTable ADD
CONSTRAINT YourColumn__char_pos_1_uppercase_letter
CHECK (ASCII(SUBSTRING(YourColumn, 1, 1)) BETWEEN ASCII('A') AND ASCII('Z'));
ALTER TABLE YourTable ADD
CONSTRAINT YourColumn__char_pos_2_uppercase_letter
CHECK (ASCII(SUBSTRING(YourColumn, 2, 1)) BETWEEN ASCII('A') AND ASCII('Z'));
@huo73: yours doesn't work for me on SQL Server 2008: I get 'TRr1534' instead of 'TR1534'.
update Table set Column = case when len(rtrim(substring (Column , 1 , 2))) < 2
then UPPER(substring (Column , 1 , 1) + substring (Column , 3 , 1)) + substring(Column , 4, len(Column))
else UPPER(substring (Column , 1 , 2)) + substring(Column , 3, len(Column)) end
This works on the fact that if there is a space then the trim of that part of string would yield length less than 2 so we split the string in three and use upper on the 1st and 3rd char. In all other cases we can split the string in 2 parts and use upper to make the first two chars to upper case.
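The CASE approach above covers a single space. Since the question also asks for multiple spaces between the first two characters, here is one possible set-based sketch using PATINDEX. It assumes the first character is never a space and that every value has at least two non-space characters; the table variable @codes and the aliases rest and p are made up for illustration.

```sql
-- Hypothetical sample values modelled on the question
DECLARE @codes TABLE (code varchar(20));
INSERT INTO @codes VALUES ('TQ7394'), ('T r1534'), ('su8472'), ('T   q1234');

SELECT
    c.code,
    -- first character + first non-space character after it, uppercased,
    -- followed by whatever comes after that second character
    UPPER(LEFT(c.code, 1) + SUBSTRING(r.rest, q.p, 1))
        + SUBSTRING(r.rest, q.p + 1, LEN(r.rest)) AS fixed_code
FROM @codes AS c
CROSS APPLY (SELECT rest = SUBSTRING(c.code, 2, LEN(c.code))) AS r
CROSS APPLY (SELECT p = PATINDEX('%[^ ]%', r.rest)) AS q;
```

For 'T r1534' the remainder is ' r1534', its first non-space character is at position 2, and the result is TR1534; the same expression could be used in the SET clause of an UPDATE.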
If you are doing an UPDATE, I would do it in 2 steps; first get rid of the space (RTRIM on a SUBSTRING), and second do the UPPER on the first 2 chars:
-- uses a fixed column length - 20-odd in this case
UPDATE FOO
SET bar = RTRIM(SUBSTRING(bar, 1, 2)) + SUBSTRING(bar, 3, 20)
UPDATE FOO
SET bar = UPPER(SUBSTRING(bar, 1, 2)) + SUBSTRING(bar, 3, 20)
If you need it in a SELECT (i.e. inline), then I'd be tempted to write a scalar UDF