Trimming a string until a value - sql

I ve got a column which has different sized strings, looking like:
abcd, efgh, ijkl2, 2345, xyzw
I need to trim it into 2 different columns, and get the string before the comma from right and then the other one, so I will have 2 other columns with:
2345 xyz
I ve tried to get only the first part of the string before the first comma:
RTRIM(LTRIM(RIGHT(A.[column],charindex(',',A.[column]+',')-1))) as 'aa'
RIGHT(A.[column], len(A.[column]) - charindex(',',A.[column])) as 'ab'
But I get it mixed, sometimes I get some of the values after the comma, but incomplete.
Any thoughts?
Thank you, I appreciate it.

Such string functions in SQL Server are really painful. I like to use apply for such operations, because this makes it easier to keep intermediate results.
For this particular problem:
select v.str, v2.lastval, v3.secondlastval
from (values ('abcd, efgh, ijkl2, 2345, xyzw')) v(str) cross apply
(select stuff(str, 1, len(str) - charindex(',', reverse(str)) + 2, ''),
stuff(str, len(str) - charindex(',', reverse(str)) + 1, len(str), '')
) v2(lastval, rest) cross apply
(select stuff(v2.rest, 1, len(v2.rest) - charindex(',', reverse(v2.rest)) + 2, ''),
stuff(v2.rest, len(v2.rest) - charindex(',', reverse(v2.rest)) + 1, len(v2.rest), '')
) v3(secondlastval, rest);
Having said that, you should probably reconsider your data structure. Storing lists of values in delimited strings is a poor data structure in SQL because SQL does not have very good string processing capabilities. Instead, you should be using a junction/association table.

you can use string_split if you are one sql server 2016...
try Something like this :
drop TABLE REC.TestTable;
CREATE TABLE REC.TestTable (id int identity, --Avoid using keywords for column names
str varchar(500));
GO
INSERT INTO REC.TestTable (str)
VALUES ('abcd, efgh, ijkl2, 123, ijk'),
('abcd, efgh, ijkl2, 4567, lmn'),
('abcd, efgh, ijkl2, 89, opqr')
;
GO
with tmp as (
select f1.*, f2.value, row_number() over(partition by id order by (select null)) position
from REC.TestTable f1 cross apply STRING_SPLIT(str, ',') f2
)
select f1.*, f2.value ValPos1, f3.value ValPos2, f4.value ValPos3, f5.value ValPos4, f6.value ValPos5
from REC.TestTable f1
left outer join tmp f2 on f1.id=f2.id and f2.position =1
left outer join tmp f3 on f1.id=f3.id and f3.position =2
left outer join tmp f4 on f1.id=f4.id and f4.position =3
left outer join tmp f5 on f1.id=f5.id and f5.position =4
left outer join tmp f6 on f1.id=f6.id and f6.position =5

Related

Split one column with weird string into multiple columns by specific delimiter in single select sql

I am seeking/hoping for a simpler solution, although I got a working solution already.
But it is hard for me to accept, that this is the only way. Therefore my hope is, that someone who is a good sql poweruser may have a better idea.
Background:
A simple table looking like that:
weirdstring
ID
A;GHL+BH;BC,NA-NB,[AB]
1
B;GHL+BH;BC,NA-NB,[AB]
2
C;GHL+BH;BC,NA-NB,[AB]
3
CREATE TABLE TESTTABLE (weirdstring varchar(MAX),
ID int);
INSERT INTO TESTTABLE
VALUES ('A;GHL+BH;BC,NA-NB,[AB]', 1);
INSERT INTO TESTTABLE
VALUES ('B;GHL+BH;BC,NA-NB,[AB]', 2);
INSERT INTO TESTTABLE
VALUES ('C;GHL+BH;BC,NA-NB,[AB]', 3);
All I need in the end is the first 3 "letter-groups" (1-3 letterst) from weirdstring (eg.ID 1 = A,GHL and BH, the rest of the string is not important now) in seperate columns:
ID
weirdstring
group1
group2
group3
1
A;GHL+BH;BC,NA-NB,[AB]
A
GHL
BH
2
B;GHL+BH;BC,NA-NB,[AB]
B
GHL
BH
3
C;GHL+BH;BC,NA-NB,[AB]
C
GHL
BH
What have been done so far is:
change all weird delimiters(;+- and potential more) in the string to comma, eliminate the brackets around "letter-groups". REPLACE daisy-chained is being used. So from A;GHL+BH;BC,NA-NB,[AB] to
A,GHL,BH,BC,NA,NB,AB first.
split the new string to columns by comma as delimiter.
The query used is:
SELECT t1.ID,
t1.weirdstring,
t2.group1,
t2.group2,
t2.group3
FROM TESTTABLE t1
LEFT JOIN (SELECT grp1.ID,
grp1.weirdstring AS group1,
grp2.weirdstring AS group2,
grp3.weirdstring AS group3
FROM (SELECT ID,
weirdstring
FROM (SELECT ID,
weirdstring,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS ROWNUM
FROM (SELECT ID,
value AS weirdstring
FROM TESTTABLE
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(weirdstring, '[', ''), ']', ''), ';', ','), '+', ','), '-', ','), '.', ','), ',')
WHERE weirdstring IS NOT NULL) splitted ) s1
WHERE ROWNUM = 1) grp1
LEFT JOIN (SELECT ID,
weirdstring
FROM (SELECT ID,
weirdstring,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS ROWNUM
FROM (SELECT ID,
value AS weirdstring
FROM TESTTABLE
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(weirdstring, '[', ''), ']', ''), ';', ','), '+', ','), '-', ','), '.', ','), ',')
WHERE weirdstring IS NOT NULL) splitted ) s2
WHERE ROWNUM = 2) grp2 ON grp1.ID = grp2.ID
LEFT JOIN (SELECT ID,
weirdstring
FROM (SELECT ID,
weirdstring,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) AS ROWNUM
FROM (SELECT ID,
value AS weirdstring
FROM TESTTABLE
CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(weirdstring, '[', ''), ']', ''), ';', ','), '+', ','), '-', ','), '.', ','), ',')
WHERE weirdstring IS NOT NULL) splitted ) s3
WHERE ROWNUM = 3) grp3 ON grp3.ID = grp2.ID) t2 ON t1.ID = t2.ID;
But I could not believe how much of a query have been created in the end for my small task. At least I believe its small. I am on an older version (14) of sql-server and therefore I cannot use string_split with its third parameter (enable-ordinal) Syntax:
STRING_SPLIT ( string , separator [ , enable_ordinal ] )
Note: https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver16 : The enable_ordinal argument and ordinal output column are currently supported in Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics (serverless SQL pool only). Beginning with SQL Server 2022 (16.x) Preview, the argument and output column are available in SQL Server.
Is there some other, shorter ways to achieve the same results? I know that topic has been discussed many many times, but I could not find a solution to my specific problem here. Thanks in advance for any kind of help!
It seems that you are using SQL Server 2017 (v.14), so a possible option is the following JSON-based approach. The idea is to transform the stored text into a valid JSON array (A;GHL+BH;BC,NA-NB,[AB] into ["A","GHL","BH","BC","NA","NB","AB"]) using TRANSLATE() for character replacement and get the expected parts of the string using JSON_VALUE():
SELECT
weirdstring,
JSON_VALUE(jsonweirdstring, '$[0]') AS group1,
JSON_VALUE(jsonweirdstring, '$[1]') AS group2,
JSON_VALUE(jsonweirdstring, '$[2]') AS group3
FROM (
SELECT
weirdstring,
CONCAT('["', REPLACE(TRANSLATE(weirdstring, ';+-,[]', '######'), '#', '","'), '"]') AS jsonweirdstring
FROM TESTTABLE
) t

SQL Server 2014 : Convert two comma separated string into two columns

I have two comma-separated string which needs to be converted into a temptable with two columns synchronized based on the index.
If the input string as below
a = 'abc,def,ghi'
b = 'aaa,bbb,ccc'
then output should be
column1 | column2
------------------
abc | aaa
def | bbb
ghi | ccc
Let us say I have function fnConvertCommaSeparatedStringToColumn which takes in comma-separated string and delimiter as a parameter and returns a column with values. I use this on both strings and get two columns to verify if the count is the same on both sides. But it would be nice two have them in a single temp table. How can i do that?
Let us say I have function which ... returns a column with values.
At that point, the basic idea is to select the column and use the row_number() function with both of your strings. Then you can JOIN the two together using the row_number() result as the matching field for the join.
One method is a recursive CTE:
with cte as (
select convert(varchar(max), null) as a_part, convert(varchar(max), null) as b_part,
convert(varchar(max), 'abc,def,ghi') + ',' as a,
convert(varchar(max), 'aaa,bbb,ccc') + ',' as b,
0 as lev
union all
select convert(varchar(max), left(a, charindex(',', a) - 1)),
convert(varchar(max), left(b, charindex(',', b) - 1)),
stuff(a, 1, charindex(',', a), ''),
stuff(b, 1, charindex(',', b), ''),
lev + 1
from cte
where a <> '' and lev < 10
)
select a_part, b_part
from cte
where lev > 0;
Here is a db<>fiddle.
Here's something a bit sneaky you can try.
I don't have your bespoke function so have used the built-in string_split function (SQL2016+) - for quickly testing, but assuming the parameters are the same. Ideally, your bespoke function should return its own row number in which case you'd use that instead of a rownumber function.
declare #a varchar(20)='abc,def,ghi', #b varchar(20)='aaa,bbb,ccc';
with v as (
select a.value A,b.value B,
row_number() over(partition by a.value order by (select 1/0))Arn,
row_number() over(partition by b.value order by (select 1/0))Brn
from fnConvertCommaSeparatedStringToColumn (#a,',')a
cross apply fnConvertCommaSeparatedStringToColumn (#b,',')b
)
select A,B from v
where Arn=Brn
I would suggest getting a (set based) function that can split a string, based on a delimiter, that returns the ordinal position as well. For example DelimitedSplit8k_LEAD. Then you can trivially split the value, and JOIN on the ordinal position:
DECLARE #a varchar(100) = 'abc,def,ghi';
DECLARE #b varchar(100) = 'aaa,bbb,ccc';
SELECT A.Item AS A,
B.Item AS B
FROM dbo.delimitedsplit8k_lead(#a,',') A
FULL OUTER JOIN dbo.delimitedsplit8k_lead(#a,',') B ON A.ItemNumber = B.ItemNumber;
db<>fiddle
I use a FULL OUTER JOIN and then if either column has a NULL value you know that the 2 delimited lists don't have the same number of delimited values.

Segregate column values in SQL Server

I have a table with 2 columns (Col1 & Col2) and values are stores like below:
Col1 Col2
A/B/C Red/Orange/Green
D/E Red/Orange
I want the output like below.
Col1 Col2
A Red
B Orange
C Green
D Red
E Orange
Did you try CROSS APPLY? Please replace 'your_table_name' with the name of your table. It should work, just copy and paste.
SELECT Col1, value AS Col2 INTO Table_2
FROM your_table_name
CROSS APPLY STRING_SPLIT(Col2, '/');
SELECT Col2, value AS Col1 INTO Table_3
FROM Table_2
CROSS APPLY STRING_SPLIT(Col1, '/');
SELECT * FROM Table_3;
Not easy, but doable.
I would do it by "flattening" the table:
SELECT (left bit of column 1), (left bit of column2)
UNION ALL
SELECT (middle bit of column 1), (middle bit of column 2)
where [column 1] like '%/%'
UNION ALL
SELECT (last bit of column 1), (last bit of column 2)
where [column 1] like '%/%/%'
If you have the possibility of more slashes and data, you need to add further UNIONs.
Use CHARINDEX to find the slash and SUBSTRING to extract the bits.
Maybe String split can help?
https://learn.microsoft.com/it-it/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver15
look at the example D and E
Unfortunately, the built-in string split function in SQL Server does NOT return the position in the string. In my opinion, this is a significant oversight.
Assuming your strings have no duplicate values, you can use row_number() and charindex() to add an enumeration:
select t.*, ss.*
from t cross apply
(select s1.value as value1, s2.value as value2
from (select s1.value,
row_number() over (order by charindex('/' + s1.value + '/', '/' + t.col1 + '/')) as pos
from string_split(t.col1, '/') s1
) s1 join
(select s2.value,
row_number() over (order by charindex('/' + s2.value + '/', '/' + t.col2 + '/')) as pos
from string_split(t.col2, '/') s2
) s2
on s1.pos = s2.pos
) ss;
Here is a db<>fiddle.

ORDER BY specific numerical value in string [SQL]

Have a column ID that I would like to ORDER in a specific format. Column has a varchar data type and always has an alphabetic value, typically P in front followed by three to four numeric values. Possibly even followed by an underscore or another alphabetic value. I have tried few options and none are returning what I desire.
SELECT [ID] FROM MYTABLE
ORDER BY
(1) LEN(ID), ID ASC
/ (2) LEFT(ID,2)
OPTIONS TRIED (3) SUBSTRING(ID,2,4) ASC
\ (4) ROW_NUMBER() OVER (ORDER BY SUBSTRING(ID,2,4))
(5) SUBSTRING(ID,PATINDEX('%[0-9]%',ID),LEN(ID))
(6) LEFT(ID, PATINDEX('%[0-9]%', ID)-1)
Option 1 seems to be closest to what I am looking for except when an _ or Alphabetic values follow the numeric value. See results from Option 1 below
P100
P208
P218
P301
P305
P306
P4200
P4510
P4511
P4512
P5011
P1400A
P4125H
P4202A
P4507L
P4706A
P1001_2
P2103_B
P4368_RL
Would like to see..
P100
P208
P218
P301
P305
P306
P1001_2
P1400A
P2103_B
P4125H
P4200
P4202A
P4368_RL
P4507L
P4510
P4511
P4512
P4706A
P5011
ORDER BY
CAST(SUBSTRING(id, 2, 4) AS INT),
SUBSTRING(id, 6, 3)
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9464
And one that's still less complex than a getOnlyNumbers() UDF, but copes with varying length of numeric part.
CROSS APPLY
(
SELECT
tail_start = PATINDEX('%[0-9][^0-9]%', id + '_')
)
stats
CROSS APPLY
(
SELECT
numeric = CAST(SUBSTRING(id, 2, stats.tail_start-1) AS INT),
alpha = RIGHT(id, LEN(id) - stats.tail_start)
)
id_tuple
ORDER BY
id_tuple.numeric,
id_tuple.alpha
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9499
Finally, one that can cope with there being no number at all (but still assumes the first character exists and should be ignored).
CROSS APPLY
(
SELECT
tail_start = NULLIF(PATINDEX('%[0-9][^0-9]%', id + '_'), 0)
)
stats
CROSS APPLY
(
SELECT
numeric = CAST(SUBSTRING(id, 2, stats.tail_start-1) AS INT),
alpha = RIGHT(id, LEN(id) - ISNULL(stats.tail_start, 1))
)
id_tuple
ORDER BY
id_tuple.numeric,
id_tuple.alpha
http://sqlfiddle.com/#!6/9eecb7db59d16c80417c72d1e1f4fbf1/9507
This is a rather strange way to sort but now that I understand it I figured out a solution. I am using a table valued function here to strip out only the numbers from a string. Since the function returns all numeric characters I also need to check for the _ and only pass in the part of the string before that.
Here is the function.
create function GetOnlyNumbers
(
#SearchVal varchar(8000)
) returns table as return
with MyValues as
(
select substring(#SearchVal, N, 1) as number
, t.N
from cteTally t
where N <= len(#SearchVal)
and substring(#SearchVal, N, 1) like '[0-9]'
)
select distinct NumValue = STUFF((select number + ''
from MyValues mv2
order by mv2.N
for xml path('')), 1, 0, '')
from MyValues mv
This function is using a tally table. If you have one you can tweak that code slightly to fit. Here is my tally table. I keep it as a view.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
GO
Next of course we need to have some data to work. In this case I just created a table variable to represent your actual table.
declare #Something table
(
SomeVal varchar(10)
)
insert #Something values
('P100')
, ('P208')
, ('P218')
, ('P301')
, ('P305')
, ('P306')
, ('P4200')
, ('P4510')
, ('P4511')
, ('P4512')
, ('P5011')
, ('P1400A')
, ('P4125H')
, ('P4202A')
, ('P4507L')
, ('P4706A')
, ('P1001_2')
, ('P2103_B')
, ('P4368_RL')
With all the legwork and setup behind us we can get to the actual query needed to accomplish this.
select s.SomeVal
from #Something s
cross apply dbo.GetOnlyNumbers(case when charindex('_', s.SomeVal) = 0 then s.SomeVal else left(s.SomeVal, charindex('_', s.SomeVal) - 1) end) x
order by convert(int, x.NumValue)
This returns the rows in the order you listed them in your question.
You can break down ID in steps to extract the number. Then, order by the number and ID. I like to break down long string manipulation into steps using CROSS APPLY. You can do it inline (it'd be long) or bundle it into an inline TVF.
SELECT t.*
FROM MYTABLE t
CROSS APPLY (SELECT NoP = STUFF(ID, 1, 1, '')) nop
CROSS APPLY (SELECT FindNonNumeric = LEFT(NoP, ISNULL(NULLIF(PATINDEX('%[^0-9]%', NoP)-1, -1), LEN(NoP)))) fnn
CROSS APPLY (SELECT Number = CONVERT(INT, FindNonNumeric)) num
ORDER BY Number
, ID;
I think your best bet is to create a function that strips the numbers out of the string, like this one, and then sort by that. Even better, as #SeanLange suggested, would be to use that function to store the number value in a new column and sort by that.

SQL - Point differences between two lists

Given two comma separated (un-ordered) lists of numbers, I want to extract only the differences between them (using regexp probably).
e.g.:
select
'1010484,1025781,1051394,1069679' as list_1,
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
What I wish for is a result such as:
l1_additional_data: 1069679
l2_additional_data: 1005923,1034010,1044261,1048311
How can this be done?
I'm using Vertica, BTW - That means that no hierarchic ("connect by") queries could be used here.
Thanks in advance!
There's a relevant post that will be helpful - Splitting string into multiple rows in Oracle
I don't know vertica but based on oracle You could go with:
with list1 as
(
select
regexp_substr(list_1 ,'[^,]+', 1, level) as list_1_rows
from (
select
'1010484,1025781,1051394,1069679' as list_1
from dual)
connect by
regexp_substr(list_1 ,'[^,]+', 1, level) is not null),
list2 as (select
regexp_substr(list_2 ,'[^,]+', 1, level) as list_2_rows
from (
select
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
from dual)
connect by regexp_substr(list_2 ,'[^,]+', 1, level) is not null)
select * from list1
full outer join list2
on list1.list_1_rows = list2.list_2_rows
where list_1_rows is null or list_2_rows is null
OK, Here's my solution - But it's not very efficient, and it probably won't scale (in terms of performance):
WITH lists AS
(SELECT'1010484,1025781,1051394,1069679' AS list_1, '1005923,1010484,1025781,1034010,1044261,1048311,1051394' AS list_2 )
, numbers AS
(SELECT row_number() over() i
FROM system_columns limit 100)
SELECT group_concat(parsed_code_1) list_1_additions,
group_concat(parsed_code_2) list_2_additions
FROM(SELECT parsed_code_1
FROM(SELECT split_part(list_1, ',', i) parsed_code_1
FROM lists
CROSS JOIN numbers
WHERE i <= regexp_count(list_1, ',')+1) l
WHERE parsed_code_1 IS NOT NULL) a
FULL OUTER JOIN
(SELECT parsed_code_2
FROM(SELECT split_part(list_2, ',', i) parsed_code_2
FROM lists
CROSS JOIN numbers
WHERE i <= regexp_count(list_2, ',')+1) l
WHERE parsed_code_2 IS NOT NULL) b ON(parsed_code_1 = parsed_code_2)
WHERE parsed_code_1 IS NULL OR parsed_code_2 IS NULL