SQL Server - Extract text from a given string field

What is the best method to extract the text from the string field below?
I am trying to extract the ProjectID numbers (91, 108, 250) below but am struggling because the ProjectIDs are either 2 or 3 digits long and appear in different parts of the string.
Row Parameter
1 ProjectID=91&GroupID=250&ParentID=1
2 ProjectID=108&GroupID=250&ParentID=35
3 GroupID=1080&ProjectID=250&ParentID=43
4 ProjectID=250
Any help would be much appreciated.

SQL Server is kind of lousy on string functionality. Here is one method:
select left(v1.p, patindex('%[^0-9]%', v1.p + ' ') - 1)
from (values ('ProjectID=91&GroupID=250&ParentID=1'),
('ProjectID=108&GroupID=250&ParentID=35'),
('GroupID=1080&ProjectID=250&ParentID=43'),
('ProjectID=250')
) v(parameter) cross apply
(values (stuff(v.parameter, 1, charindex('ProjectID=', v.parameter) + 9, ''))
) v1(p);
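To see why this works, it can help to look at the intermediate values for a single string. The following is only an illustrative sketch; the variable @p and the column aliases are made up for this example:

```sql
-- Hypothetical test value; ProjectID is deliberately not the first key.
DECLARE @p varchar(100) = 'GroupID=1080&ProjectID=250&ParentID=43';

SELECT
    -- position where 'ProjectID=' starts
    CHARINDEX('ProjectID=', @p) AS key_pos,
    -- everything after 'ProjectID=' (the key is 10 characters long, hence + 9)
    STUFF(@p, 1, CHARINDEX('ProjectID=', @p) + 9, '') AS after_key,
    -- position of the first non-digit in the remainder (the appended space is a sentinel)
    PATINDEX('%[^0-9]%', STUFF(@p, 1, CHARINDEX('ProjectID=', @p) + 9, '') + ' ') AS first_non_digit;
```

For this input, key_pos is 14, after_key is '250&ParentID=43' and first_non_digit is 4, so LEFT(after_key, 3) gives 250. Note that if 'ProjectID=' is missing, CHARINDEX returns 0 and the expression silently strips the first 9 characters instead, so a filter such as WHERE parameter LIKE '%ProjectID=%' may be worth adding.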
Or split the string and look for a match (STRING_SPLIT requires SQL Server 2016 or later):
select stuff(s.value, 1, 10, '')
from (values ('ProjectID=91&GroupID=250&ParentID=1'),
('ProjectID=108&GroupID=250&ParentID=35'),
('GroupID=1080&ProjectID=250&ParentID=43'),
('ProjectID=250')
) t(parameter) cross apply
string_split(t.parameter, '&') s
where s.value like 'ProjectID=%';

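SELECT
SUBSTRING(Parameter,
CHARINDEX('ProjectID', Parameter) + 10, -- first character after 'ProjectID='
CHARINDEX('&', Parameter + '&', CHARINDEX('ProjectID', Parameter)) - -- next '&' (appended so the last key also works)
(CHARINDEX('ProjectID', Parameter) + 10))
FROM YourTable; -- YourTable is a placeholder; Parameter is the column from the question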

Related

SQL Get string between second and third underscore

I need to extract a certain string from a column in a table as part of an SSIS package.
The contents of the column are formatted like this: "TST_AB1_ABC123456_TEST".
I need to get the string between the second and 3rd "_", e.g. "ABC123456" without changing too much of the package so would rather do it in 1 SQL command if possible.
I've tried a few different methods using SUBSTRING, REVERSE and CHARINDEX but can't figure out how to get just that string.
Using the base string functions:
SELECT
SUBSTRING(col,
CHARINDEX('_', col, CHARINDEX('_', col) + 1) + 1,
CHARINDEX('_', col, CHARINDEX('_', col, CHARINDEX('_', col) + 1) + 1) -
CHARINDEX('_', col, CHARINDEX('_', col) + 1) - 1)
FROM yourTable;
In notes format, the above call to SUBSTRING is saying:
SELECT
SUBSTRING(<your column>,
<starting at one past the second underscore>,
<for a length of the number of characters in between the 2nd and 3rd
underscore>)
FROM yourTable;
On other databases, such as Postgres and Oracle, there are substring index and regex functions which can handle the above more gracefully. More recent versions of SQL Server (2016 and later) also have a STRING_SPLIT function, which could be used here, but by default it does not guarantee the order of the resulting parts.
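As a hedged aside on the STRING_SPLIT ordering issue: SQL Server 2022 and Azure SQL Database accept an optional third argument, enable_ordinal, which adds an ordinal column to the result. Assuming one of those versions is available (this will not run on SQL Server 2012 through 2019), a sketch would be:

```sql
-- Requires SQL Server 2022 or Azure SQL Database (enable_ordinal argument).
SELECT s.value
FROM STRING_SPLIT('TST_AB1_ABC123456_TEST', '_', 1) AS s
WHERE s.ordinal = 3;  -- third part, i.e. the text between the 2nd and 3rd underscore
```

On older versions, the CHARINDEX-based query above remains the safer choice.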
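If your column values always have 4 parts (and contain no periods of their own), you can use the PARSENAME() function like this. PARSENAME numbers the parts from the right, so part 2 is the third part from the left.
DECLARE @MyString VARCHAR(100)
SET @MyString = 'TST_AB1_ABC123456_TEST';
SELECT PARSENAME(REPLACE(@MyString, '_', '.'), 2)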
You could also do this using CROSS APPLY. I added a WHERE clause that is meant to exclude strings with fewer than 3 underscores, which would otherwise lead to a negative SUBSTRING length. Because _ is a single-character wildcard in LIKE, each literal underscore is written as [_], and the WHERE clause has to come after the CROSS APPLY chain:
with your_table as (select 'TST_AB1_ABC123456_TEST' as txt1)
select txt1, txt2
from your_table t1
cross apply (select charindex('_', txt1) as i1) t2 -- locate the 1st underscore
cross apply (select charindex('_', txt1, i1 + 1) as i2) t3 -- then the 2nd
cross apply (select charindex('_', txt1, i2 + 1) as i3) t4 -- then the 3rd
cross apply (select substring(txt1, i2 + 1, i3 - i2 - 1) as txt2) t5 -- between 2nd & 3rd
where txt1 like '%[_]%[_]%[_]%'
Outputs
+------------------------+-----------+
| txt1 | txt2 |
+------------------------+-----------+
| TST_AB1_ABC123456_TEST | ABC123456 |
+------------------------+-----------+

sql string split for defined number of char

I have a string like 'aabbcczx' and I need to split it into chunks of 2 characters.
The result expected is something like:
aabbcczx aa
aabbcczx bb
aabbcczx cc
aabbcczx zx
How can I do this?
Note that the length of the string varies from row to row.
Thanks
If it's always 2 chars:
SELECT A.Val,
CA1.N,
SUBSTRING(A.Val,n,2)
FROM (
VALUES ('aabbcczx')
) AS A(Val)
CROSS
APPLY dbo.GetNums(1,LEN(A.Val)) AS CA1
WHERE CA1.n % 2 = 1;
GetNums is a number-table (tally-table) generator; you can find several versions of it online.
It provides the position of each character, which we use as the SUBSTRING start position. The WHERE clause uses the modulo operator (%) so that only every other starting position is kept.
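If you do not have a GetNums function installed, a tally can be generated inline. This is a minimal sketch, assuming that sys.all_objects cross-joined with itself yields at least as many rows as the longest string; the names Tally, T and Chunk, and the second sample value, are placeholders that are not part of the original answer:

```sql
-- Inline tally: numbers 1..8000 built from a system catalog view.
WITH Tally AS (
    SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
SELECT A.Val, T.n, SUBSTRING(A.Val, T.n, 2) AS Chunk
FROM (VALUES ('aabbcczx'), ('qqww')) AS A(Val)
JOIN Tally AS T
    ON T.n <= LEN(A.Val)      -- only positions that exist in this row's string
WHERE T.n % 2 = 1             -- keep odd positions: 1, 3, 5, ...
ORDER BY A.Val, T.n;
```

Because the join is limited by LEN(A.Val) per row, strings of different lengths are handled without any extra logic.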
You can use a recursive query:
with cte as (
select convert(varchar(max), left(str, 2)) as val2, convert(varchar(max), stuff(str, 1, 2, '')) as rest, str
from (values ( 'aabbcczx' )) v(str)
union all
select left(rest, 2) as val2, stuff(rest, 1, 2, '') as rest, str
from cte
where rest <> ''
)
select str, val2
from cte;
You can use a recursive query to extract pairs of characters:
with instring as
( select 'aabbcczx' as s )
, splitter as
(
select s, substring(s, 1, 2) as rslt, 3 as next -- first two chars
from instring
union all
select s, substring(s, next, 2), next + 2 -- next two chars
from splitter
where len(s) >= next
)
select *
from splitter
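One caveat applies to both recursive versions: SQL Server stops a recursive CTE after 100 levels by default, so strings longer than roughly 200 characters raise an error. A hedged sketch of the workaround, using a made-up 300-character test string and the query hint OPTION (MAXRECURSION 0) to lift the limit:

```sql
-- 'ab' repeated 150 times = 300 characters, which needs more than 100 recursion levels.
WITH splitter AS (
    SELECT CAST(REPLICATE('ab', 150) AS varchar(max)) AS s, 1 AS pos
    UNION ALL
    SELECT s, pos + 2           -- advance to the next pair
    FROM splitter
    WHERE pos + 2 <= LEN(s)     -- stop once no characters are left
)
SELECT pos, SUBSTRING(s, pos, 2) AS chunk
FROM splitter
OPTION (MAXRECURSION 0);        -- 0 removes the default limit of 100
```

For long strings or large tables, the tally-based approach is usually the better fit, since it avoids recursion altogether.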

SQL Query to parse numbers from name

The DBMS in this case is SQL Server 2012.
I need a SQL query that will grab just the numbers from a device name. I've got devices that follow a naming scheme that SHOULD look like this:
XXXnnnnn
or
XXXnnnnn-XX
Where X is a letter and n is a number which should be left padded with 0's where appropriate. However, not all of the names are properly padded in this way.
So, imagine you have a column that looks something like this:
Name
----
XXX01234
XXX222
XXX0390-A2
XXX00965-A1
I need an SQL query that will return results from this example column as follows.
Number
------
01234
00222
00390
00965
Anyone have any thoughts? I've tried things like casting the name first as a float and then as an int, but to be honest, I'm just not skilled enough with SQL yet to find the solution.
Any help is greatly appreciated!
SQL Server does not have great string parsing functions. For your particular example, I think a case statement might be the simplest approach:
select (case when number like '___[0-9][0-9][0-9][0-9][0-9]%'
             then substring(number, 4, 5)
             when number like '___[0-9][0-9][0-9][0-9]%'
             then '0' + substring(number, 4, 4)
             when number like '___[0-9][0-9][0-9]%'
             then '00' + substring(number, 4, 3)
             when number like '___[0-9][0-9]%'
             then '000' + substring(number, 4, 2)
             when number like '___[0-9]%'
             then '0000' + substring(number, 4, 1)
             else '00000'
        end) as EmbeddedNumber
from YourTable; -- placeholder table; number is the column holding the device name
This might work :
SELECT RIGHT('00000'
+ SUBSTRING(Col, 1, ISNULL(NULLIF((PATINDEX('%-%', Col)), 0) - 1, LEN(Col))), 5)
FROM (SELECT REPLACE(YourColumn, 'XXX', '') Col
FROM YourTable)t
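The padding part of this query, RIGHT('00000' + value, 5), can be tried in isolation. A small sketch with made-up input values:

```sql
-- Prepend five zeros, then keep only the rightmost five characters.
SELECT v.digits,
       RIGHT('00000' + v.digits, 5) AS padded
FROM (VALUES ('222'), ('0390'), ('01234')) AS v(digits);
-- 222 -> 00222, 0390 -> 00390, 01234 -> 01234
```

Keep in mind that a value with more than five digits is truncated from the left by this idiom rather than rejected.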
This will work even when the XXX prefix varies in length:
DECLARE @t TABLE ( n NVARCHAR(50) )
INSERT INTO @t
VALUES ( 'XXXXXXX01234' ),
( 'XX222' ),
( 'X0390-A2' ),
( 'XXXXXXX00965-A1' )
SELECT REPLICATE('0', 5 - LEN(n)) + n AS n
FROM ( SELECT SUBSTRING(n, PATINDEX('%[0-9]%', n),
CHARINDEX('-', n + '-') - PATINDEX('%[0-9]%', n)) AS n
FROM @t
) t
Output:
n
01234
00222
00390
00965
If the first 3 characters always need to be removed, you can do something like this (it works as long as any trailing letters appear only after a '-' sign):
DECLARE @a AS TABLE ( a VARCHAR(100) );
INSERT INTO @a
VALUES
( 'XXX01234' ),
( 'XXX222' ),
( 'XXX0390-A2' ),
( 'XXX00965-A1' );
SELECT RIGHT('00000' + SUBSTRING(a, 4, CHARINDEX('-',a+'-')-4),5)
FROM @a
-- OUTPUT
01234
00222
00390
00965
Another option (will extract numbers after first 3 characters):
SELECT
RIGHT('00000' + LEFT(REPLACE(a, LEFT(a, 3), ''),
COALESCE(NULLIF(PATINDEX('%[^0-9]%',
REPLACE(a, LEFT(a, 3), '')),
0) - 1,
LEN(REPLACE(a, LEFT(a, 3), '')))), 5)
FROM
@a;
-- OUTPUT
01234
00222
00390
00965

Selecting between quotes (") in SQL Server 2012

I have a table holding IDs in one column and a string in the second column like below.
COLUMN01 COLUMN02
----------------------------------------------------------------------------------
1 abc"11444,12,13"efg"14,15"hij"16,17,18,19"opqr
2 ahsdhg"21,22,23"ghshds"24,25"fgh"26,27,28,28"shgshsg
3 xvd"3142,32,33"hty"34,35"okli"36,37,38,39"adfd
Now I want to have the following result
COLUMN01 COLUMN02
-----------------------------------------------------------
1 11444,12,13,14,15,16,17,18,19
2 21,22,23,24,25,26,27,28,28
3 3142,32,33,34,35,36,37,38,39
How can I do that?
Thanks so much
Here is one way (maybe not the best, but it seems to work). I am NOT a SQL guru...
First, create this SQL Function. It came from: Extract numbers from a text in SQL Server
create function [dbo].[GetNumbersFromText](@String varchar(2000))
returns table as return
(
with C as
(
select cast(substring(S.Value, S1.Pos, S2.L) as int) as Number,
stuff(s.Value, 1, S1.Pos + S2.L, '') as Value
from (select @String+' ') as S(Value)
cross apply (select patindex('%[0-9]%', S.Value)) as S1(Pos)
cross apply (select patindex('%[^0-9]%', stuff(S.Value, 1, S1.Pos, ''))) as S2(L)
union all
select cast(substring(S.Value, S1.Pos, S2.L) as int),
stuff(S.Value, 1, S1.Pos + S2.L, '')
from C as S
cross apply (select patindex('%[0-9]%', S.Value)) as S1(Pos)
cross apply (select patindex('%[^0-9]%', stuff(S.Value, 1, S1.Pos, ''))) as S2(L)
where patindex('%[0-9]%', S.Value) > 0
)
select Number
from C
)
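Once the function exists, it can be checked on its own before combining it with FOR XML PATH. A quick sketch, assuming the function was created as dbo.GetNumbersFromText with a variable parameter named @String (the literal is simply the first sample row from the question):

```sql
-- Returns one row per run of digits found in the input string.
SELECT Number
FROM dbo.GetNumbersFromText('abc"11444,12,13"efg"14,15"hij"16,17,18,19"opqr');
```

This should return the nine numbers 11444, 12, 13, 14, 15, 16, 17, 18 and 19 as separate rows. Without an ORDER BY the row order is not formally guaranteed, which also applies to the concatenated result.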
Then, you can do something like this to get the results you were asking for. Note that I broke the query up into 3 parts for clarity. And, obviously, you don't need to declare the table variable and insert data into it.
DECLARE @tbl
TABLE (
COLUMN01 int,
COLUMN02 varchar(max)
)
INSERT INTO @tbl VALUES (1, 'abc"11444,12,13"efg"14,15"hij"16,17,18,19"opqr')
INSERT INTO @tbl VALUES (2, 'ahsdhg"21,22,23"ghshds"24,25"fgh"26,27,28,28"shgshsg')
INSERT INTO @tbl VALUES (3, 'xvd"3142,32,33"hty"34,35"okli"36,37,38,39"adfd')
SELECT COLUMN01, SUBSTRING(COLUMN02, 2, LEN(COLUMN02) - 1) as COLUMN02 FROM
(
SELECT COLUMN01, REPLACE(COLUMN02, ' ', '') as COLUMN02 FROM
(
-- Number is an int, so cast it to varchar before concatenating the comma
SELECT COLUMN01, (select ',' + cast(Number as varchar(12)) as 'data()' from dbo.GetNumbersFromText(COLUMN02) for xml path('')) as COLUMN02 FROM @tbl
) t
) tt
GO
output:
COLUMN01 COLUMN02
1 11444,12,13,14,15,16,17,18,19
2 21,22,23,24,25,26,27,28,28
3 3142,32,33,34,35,36,37,38,39
I know you want to do it in SQL, but I once had nearly the same problem, and pulling the data into a string in PHP (or another language) and parsing it there is one way to do it. For example, you can use this kind of code after reading the data into a string.
function clean($string) {
$string = str_replace(' ', '-', $string); // Replaces all spaces with hyphens.
$string = preg_replace('/[^A-Za-z0-9\-]/', '', $string); // Removes special chars.
return preg_replace('/-+/', '-', $string); // Replaces multiple hyphens with single one.
}
For more information, you might want to look at the post I took this function from: Remove all special characters from a string
As I said, this is an easy way to do it; I hope it helps.

Extract multiple decimal numbers from string in T-SQL

I have a table in SQL Server Management Studio with columns containing ranges of numbers as strings. I am trying to find a way to extract the numeric values from the string and insert them into a new table.
For example, in the table I have the value 12.45% - 42.32% as a string. I'd like to be able to get 12.45 and 42.32 and insert them into a new table with columns min_percent and max_percent.
I found several ways to extract a single numeric value from a string using SQL, and also tried modifying the function from Extract numbers from a text in SQL Server (which returns multiple integers, but not decimals), but so far I haven't been able to get it working. Thanks in advance for any suggestions
Assuming your data is consistent, this should work fine, and has the added advantage of being easier on the eyes. Also consider decimal if you're going for precision.
select
cast(left(r, charindex('%', r) - 1) AS float) as minVal,
cast(replace(substring(r, charindex('-', r) + 1, len(r)), '%', '') as float) AS maxVal -- everything after the hyphen, so the two numbers may differ in length
from ( select '22.45% - 42.32%' as r ) as tableStub
The function is quite close. You just need to cast to a decimal type and add the decimal point to the patterns:
with C as
(
select cast(substring(S.Value, S1.Pos, S2.L) as decimal(16,2)) as Number,
stuff(s.Value, 1, S1.Pos + S2.L, '') as Value
from (select @String+' ') as S(Value)
cross apply (select patindex('%[0-9,.]%', S.Value)) as S1(Pos)
cross apply (select patindex('%[^0-9,.]%', stuff(S.Value, 1, S1.Pos, ''))) as S2(L)
union all
select cast(substring(S.Value, S1.Pos, S2.L) as decimal(16,2)),
stuff(S.Value, 1, S1.Pos + S2.L, '')
from C as S
cross apply (select patindex('%[0-9,.]%', S.Value)) as S1(Pos)
cross apply (select patindex('%[^0-9,.]%', stuff(S.Value, 1, S1.Pos, ''))) as S2(L)
where patindex('%[0-9,.]%', S.Value) > 0
)
select Number
from C
Here is a brute force approach using the string operations available in SQL Server:
with t as (
select '12.45% - 42.32%' as val
)
select cast(SUBSTRING(val, 1, charindex('%', val) - 1) as float) as minval,
cast(replace(substring(val, len(val) - charindex(' ', reverse(val))+2, 100), '%', '') as float) as maxval
from t
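To tie this back to the original goal of filling min_percent and max_percent in a new table, here is a hedged sketch that splits at the hyphen and inserts the two halves. The table variables @src and @dst, the column range_text and the decimal(5,2) type are assumptions for illustration, and the approach only holds while every value has the form 'x% - y%' with non-negative numbers:

```sql
-- Hypothetical source and destination tables.
DECLARE @src TABLE (range_text varchar(50));
DECLARE @dst TABLE (min_percent decimal(5,2), max_percent decimal(5,2));

INSERT INTO @src VALUES ('12.45% - 42.32%'), ('2.5% - 10%');

INSERT INTO @dst (min_percent, max_percent)
SELECT
    -- text before the hyphen, with '%' and surrounding spaces removed
    CAST(LTRIM(RTRIM(REPLACE(LEFT(range_text, CHARINDEX('-', range_text) - 1), '%', '')))
         AS decimal(5,2)),
    -- text after the hyphen, cleaned the same way
    CAST(LTRIM(RTRIM(REPLACE(SUBSTRING(range_text, CHARINDEX('-', range_text) + 1, LEN(range_text)), '%', '')))
         AS decimal(5,2))
FROM @src;

SELECT min_percent, max_percent FROM @dst;
```

Using decimal rather than float here follows the earlier suggestion to prefer decimal when exact precision matters.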