Split a column into multiple columns in a SQL query using pipe delimiter - sql

I have the data below in a column of a table, and I want to split it into separate columns.
The pipe (|) is the separator. In each segment, the text before the colon is the column header and the text after it is the value.
Column
-----------------------------------------------------------------------------
ID: 30000300 | Name: India | Use: New Use
ID: 30000400 | Name: Aus | New ID: 15625616 | Address 1: NEW Rd
ID: 30000400 | Name: USA | City: VIA ARAMAC | New ID: 123
ID: 30000500 | Name: Russia | New ID: 15624951 | Address 2: 2131 BEAUDESERT
Output should be:
ID        Name    Use      New ID    City        Address 1  Address 2        New City
-------------------------------------------------------------------------------------
30000300  India   New Use
30000400  Aus              15625616              NEW Rd
30000400  USA              15625616  VIA ARAMAC                              GALILEE
30000500  Russia           15624951                         2131 BEAUDESERT

You have several rows that contain key value pairs inside an nvarchar column, but you want a table that has a header based on the keys and then rows containing just the values, sans keys. There is first the issue of an input like Key1: Value1 | Key2: Value2. Should this be returned as
Key1 Key2
Value1 NULL
NULL Value2
or is this not a possible scenario? Either way, there is the issue of generating a table with dynamic column names.
The problem with your question is that this is not a scenario that would normally be solved via SQL. You should get the data in your programming language of choice, then use regular expressions or split methods to get what you need.
If you insist on doing it via SQL, then the solution is to turn the original input lines into another string containing a query, which you then run with sp_executesql (https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-executesql-transact-sql), but I do NOT recommend it.
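As a rough illustration of the "do it in your programming language" route, here is a minimal Python sketch. It assumes the rows have already been fetched as plain strings; the names ROWS, parse_row and to_table are made up for this example. Keys become the header in order of first appearance, and a row that lacks a key gets None in that column.

```python
# Sample rows copied from the question; in practice they would be fetched
# from the table with an ordinary SELECT (that part is not shown here).
ROWS = [
    "ID: 30000300 | Name: India | Use: New Use",
    "ID: 30000400 | Name: Aus | New ID: 15625616 | Address 1: NEW Rd",
    "ID: 30000400 | Name: USA | City: VIA ARAMAC | New ID: 123",
    "ID: 30000500 | Name: Russia | New ID: 15624951 | Address 2: 2131 BEAUDESERT",
]


def parse_row(line):
    """Split one 'Key: Value | Key: Value' string into a dict."""
    pairs = {}
    for segment in line.split("|"):
        key, sep, value = segment.partition(":")
        if sep:  # ignore segments that contain no colon
            pairs[key.strip()] = value.strip()
    return pairs


def to_table(lines):
    """Return (header, rows); header lists keys in order of first appearance."""
    parsed = [parse_row(line) for line in lines]
    header = []
    for pairs in parsed:
        for key in pairs:
            if key not in header:
                header.append(key)
    # Missing keys become None, the equivalent of NULL in the result set.
    rows = [[pairs.get(key) for key in header] for pairs in parsed]
    return header, rows


header, rows = to_table(ROWS)
print(header)
for row in rows:
    print(row)
```

Because the header is built from whatever keys show up, the dynamic column problem disappears on the client side; no dynamic SQL is needed.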

Here is a partial answer that you can use to return the n-th entry in a delimited string:
-- Example inputs (sample values only; replace with your own)
DECLARE @DelimitedString VARCHAR(8000) = 'ID: 30000300 | Name: India | Use: New Use';
DECLARE @Delimiter VARCHAR(100) = '|';
DECLARE @indexToReturn INT = 2;
DECLARE @tblArray TABLE
(
    ElementID INT IDENTITY(1, 1), -- Array index
    Element VARCHAR(1000) -- Array element contents
);
-- Local Variable Declarations
-- ---------------------------
DECLARE @Index SMALLINT,
        @Start SMALLINT,
        @DelSize SMALLINT;
-- LEN ignores trailing spaces, so append 'x' to measure the delimiter correctly
SET @DelSize = LEN(@Delimiter + 'x') - 1;
-- Loop through source string and add elements to destination table array
-- ----------------------------------------------------------------------
WHILE LEN(@DelimitedString) > 0
BEGIN
    SET @Index = CHARINDEX(@Delimiter, @DelimitedString);
    IF @Index = 0
    BEGIN
        -- No delimiter left: the remainder is the last element
        INSERT INTO @tblArray
        (
            Element
        )
        VALUES
        (LTRIM(RTRIM(@DelimitedString)));
        BREAK;
    END;
    ELSE
    BEGIN
        INSERT INTO @tblArray
        (
            Element
        )
        VALUES
        (LTRIM(RTRIM(SUBSTRING(@DelimitedString, 1, @Index - 1))));
        SET @Start = @Index + @DelSize;
        SET @DelimitedString = SUBSTRING(@DelimitedString, @Start, LEN(@DelimitedString) - @Start + 1);
    END;
END;
DECLARE @val VARCHAR(1000);
SELECT @val = Element
FROM @tblArray AS ta
WHERE ta.ElementID = @indexToReturn;
SELECT @val;

Related

Handling variables in SQL Server

I have a procedure that needs to handle up to 60 different variables.
The variables have a standardized naming convention.
@TextParameter1 varchar(443) = NULL,
@TextParameter2 varchar(443) = NULL,
@TextParameter3 varchar(443) = NULL
I need to be able to check which variables are NULL and which aren't, and then handle the values of the non-null variables.
I tried using dynamic SQL to iterate over the variables by making the first portion of the variable name a string and iterating through the numbers on the end.
declare @rownum int = 1
while @rownum <= 60
declare @var_sql nvarchar(max) = 'INSERT INTO #slicer
SELECT IDENTITY(Int, 1, 1) AS rowkey, value
FROM STRING_SPLIT(CAST(@DetailQueryTextParameter' + CAST(@rownum AS nvarchar(3)) AS varchar(4000)), '^')'
execute @var_sql
Set @rownum = @rownum + 1
This returns an error claiming that @DetailQueryTextParameter needs to be declared first. What is the best way to handle all of these variables? I could do it by writing a line of code for every single variable, but it seems like there should be a better way. Can I insert the variable names into a table and iterate from there?
(The below was updated to include STRING_SPLIT().)
You can use a VALUES subselect (not sure if that is the proper term) to combine all of your parameters into a single collection. You can then filter out the null values and pass the remaining values into STRING_SPLIT() using a CROSS APPLY.
DECLARE
    @TextParameter1 varchar(443) = NULL,
    @TextParameter2 varchar(443) = 'aaa^bbb',
    @TextParameter3 varchar(443) = NULL,
    @TextParameter4 varchar(443) = 'xxx^yyy^zzz',
    @TextParameter5 varchar(443) = NULL
SELECT A.*, B.Value AS SplitValue
FROM (
    VALUES
        (1, @TextParameter1),
        (2, @TextParameter2),
        (3, @TextParameter3),
        (4, @TextParameter4),
        (5, @TextParameter5)
) A(Parameter, Value)
CROSS APPLY STRING_SPLIT(A.Value, '^') B
WHERE A.Value IS NOT NULL
Which would yield results like:
Parameter  Value        SplitValue
---------  -----------  ----------
2          aaa^bbb      aaa
2          aaa^bbb      bbb
4          xxx^yyy^zzz  xxx
4          xxx^yyy^zzz  yyy
4          xxx^yyy^zzz  zzz
See this db<>fiddle.
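For readers who find the VALUES plus CROSS APPLY combination hard to picture, the same logic can be sketched in Python. This is only an illustration of the idea, not part of the T-SQL solution; split_parameters and the params list are hypothetical names standing in for the procedure's parameters.

```python
def split_parameters(params, delimiter="^"):
    """Return (parameter_number, value, piece) for every non-null parameter."""
    result = []
    for number, value in enumerate(params, start=1):
        if value is None:  # same role as WHERE A.Value IS NOT NULL
            continue
        for piece in value.split(delimiter):  # same role as STRING_SPLIT
            result.append((number, value, piece))
    return result


# Stand-ins for @TextParameter1 .. @TextParameter5 from the answer above.
params = [None, "aaa^bbb", None, "xxx^yyy^zzz", None]
for row in split_parameters(params):
    print(row)
```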

How can I find in SQL for a match of 15 characters out of 17?

I have a table where a field is varchar(17) and contains 17-character alphanumeric phrases. I need to add another row to this table, but before I do, I need to check for possible duplicates. A new phrase counts as a duplicate if any 15 of its characters match any 15 characters of a record in the table.
Samples:
5 Table rows:
1N6BF0KM2HN802620
1N6BF0KMXHN801974
1N6BF0LYXHN811101
1N9BF0KM6HN800482
1N12F0LY4HN809375
New phrase to add: 1N6BF0KAXHN802974
Found in table row 2: 1N6BF0KMXHN801974
Duplicate: only 2 characters do not match
New phrase to add: 109BF0KM6HN800492
Found in table row 4: 1N9BF0KM6HN800482
Duplicate: only 2 characters do not match
New phrase to add: 1N12F0LY4HN709375
Found in table row 5: 1N12F0LY4HN809375
Not Duplicate: 1 character does not match
New phrase to add: 1N6AF0BYXHN911101
Found in table row 3: 1N6BF0LYXHN811101
Not Duplicate: more than 2 characters do not match
I tried searching for a regular expression or an algorithm.
Here is a solution using:
- a function which compares a pair of strings character by character and returns the number of matches
- a procedure which uses the function to compare a string with all the values already in the table before inserting it.
I don't think that regex is appropriate, because we would need to count the number of matches with a different regex pattern for each existing entry. Regex is good for matching against a pattern which doesn't change.
create table sample (Samples char(17));
insert into sample values
('1N6BF0KM2HN802620'),
('1N6BF0KMXHN801974'),
('1N6BF0LYXHN811101'),
('1N9BF0KM6HN800482'),
('1N12F0LY4HN809375');
GO
CREATE FUNCTION dbo.countMatches
(
    @string1 CHAR(17),
    @string2 CHAR(17)
)
RETURNS int
AS
BEGIN
    -- Counts the positions at which both strings hold the same character
    DECLARE @IsMatching int = 0;
    DECLARE @loopCount INT = 0;
    SET @IsMatching = 0;
    WHILE @loopCount < 17
    BEGIN
        IF (
            SUBSTRING(@string1, @loopCount + 1, 1)
            = SUBSTRING(@string2, @loopCount + 1, 1)
        )
        BEGIN
            SET @IsMatching = @IsMatching + 1;
        END;
        SET @loopCount = @loopCount + 1;
    END
    RETURN @IsMatching
END
GO
create procedure insertSample (@newSample char(17))
as
begin
    DECLARE @sample char(17);
    declare @IsMatching int;
    declare @thisMatching int = 0; -- best match count seen so far
    DECLARE sample_Cursor CURSOR
    FOR SELECT samples FROM sample
    OPEN sample_Cursor
    FETCH NEXT FROM sample_Cursor INTO @sample
    WHILE @@FETCH_STATUS = 0
    BEGIN
        set @IsMatching = dbo.countMatches(
            @sample, @newSample);
        FETCH NEXT FROM sample_Cursor INTO @sample
        if (@IsMatching > @thisMatching)
        begin
            set @thisMatching = @IsMatching;
        end
    END
    CLOSE sample_Cursor
    DEALLOCATE sample_Cursor
    -- 15 or more matching positions means the new value is a duplicate
    if (@thisMatching > 14)
        RAISERROR (
            'too many matching characters', 16, 1)
    else
    begin
        insert into sample values(@newSample);
        select 'ok accepted' result,
            @thisMatching numMatches;
    end
    RETURN;
end
GO
EXEC insertSample '1N6BF0KAXHN802974'
GO
Msg 50000 Level 16 State 1 Line 26
too many matching characters
EXEC insertSample '109BF0KM6HN800492'
GO
Msg 50000 Level 16 State 1 Line 26
too many matching characters
EXEC insertSample '1N12F0LY4HN709375'
GO
Msg 50000 Level 16 State 1 Line 26
too many matching characters
EXEC insertSample '1N6AF0BYXHN911101'
GO
result | numMatches
:---------- | ---------:
ok accepted | 14
select * from sample;
GO
| Samples |
| :---------------- |
| 1N6BF0KM2HN802620 |
| 1N6BF0KMXHN801974 |
| 1N6BF0LYXHN811101 |
| 1N9BF0KM6HN800482 |
| 1N12F0LY4HN809375 |
| 1N6AF0BYXHN911101 |
SELECT dbo.countMatches(
'1N12F0LY4HN709375',
'1N12F0LY4HN709366');
| (No column name) |
| ---------------: |
| 15 |
db<>fiddle here
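The comparison rule used by countMatches and insertSample can be sketched compactly in Python. This is an illustration under the same assumption as the answer, namely that characters are compared position by position; count_matches, is_duplicate and the threshold parameter are names invented for this sketch.

```python
EXISTING = [
    "1N6BF0KM2HN802620",
    "1N6BF0KMXHN801974",
    "1N6BF0LYXHN811101",
    "1N9BF0KM6HN800482",
    "1N12F0LY4HN809375",
]


def count_matches(a, b):
    """Number of positions at which both strings hold the same character."""
    return sum(1 for x, y in zip(a, b) if x == y)


def is_duplicate(new_phrase, existing, threshold=15):
    """True if any existing phrase matches in at least `threshold` positions."""
    return any(count_matches(new_phrase, old) >= threshold for old in existing)


for phrase in ("1N6BF0KAXHN802974", "1N6AF0BYXHN911101"):
    print(phrase, is_duplicate(phrase, EXISTING))
```

A purely positional comparison like this does not handle inserted or deleted characters; if those matter, an edit-distance measure would be a different technique.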

Removing all but one of a certain character in a string

I have an issue where I'm trying to remove all of the '.' from the string/filename below in SSMS apart from the last one which dictates file type.
EPC 14.10.14.pdf
Ideally I would like this string to appear as below:
EPC 141014.pdf
Any help would be appreciated
As a variable :
declare @doc varchar(30) = 'EPC 14.10.14.pdf'
declare @ext varchar(8) = right(@doc, charindex('.', reverse(@doc)));
set @doc = concat(replace(left(@doc,len(@doc)-len(@ext)),'.',''), @ext);
select @doc as doc;
doc
EPC 141014.pdf
As a table column :
create table test (
doc varchar(30) not null
);
insert into test (doc) values
('EPC 14.10.14.pdf'),
('FQD 15.11.15.jpeg');
select doc
, undotted_doc = concat(replace(left(doc, len(doc)-charindex('.', reverse(doc))),'.',''), right(doc, charindex('.', reverse(doc))))
from test;
doc                undotted_doc
-----------------  ---------------
EPC 14.10.14.pdf   EPC 141014.pdf
FQD 15.11.15.jpeg  FQD 151115.jpeg
Test on db<>fiddle here
Use replace,substring and len function
select replace(substring(@x,0,len(@x) - 3),'.','') + substring(@x,len(@x) - 3,len(@x))
EDIT:
If the name extension has a variable length, you can use the following query
select
CONCAT(
replace(substring(@x,0,len(@x) - CHARINDEX('.',TRIM(REVERSE(@x)))),'.','')
,
substring(@x,len(@x) - CHARINDEX('.',TRIM(REVERSE(@x))),len(@x))
)
If you have extensions with different length (e.g. docx, xls), you need to find the index of the last occurrence of the . character using REVERSE() and CHARINDEX():
SELECT CONCAT(
REPLACE(SUBSTRING(SomeText, 1, LEN(SomeText) - CHARINDEX('.', REVERSE(SomeText))), '.', ''),
STUFF(SomeText, 1, LEN(SomeText) - CHARINDEX('.', REVERSE(SomeText)), '')
) AS FileName
FROM (VALUES
('EPC 14.10.14.pdf'),
('EPC 14.10.14.docx'),
('14.10.14.xlsx')
) t (SomeText)
Result:
FileName
----------------
EPC 141014.pdf
EPC 141014.docx
141014.xlsx
One more way.
SQL
SELECT fileName AS [Before]
, CONCAT(CONCAT(PARSENAME(fileName,4), PARSENAME(fileName,3), PARSENAME(fileName,2))
, '.', PARSENAME(fileName,1)) AS [After]
FROM (VALUES
('EPC 14.10.14.pdf'),
('EPC 14.10.14.docx'),
('14.10.14.xlsx'),
('csharp.10.14.cs')
) AS t(fileName);
Output
+-------------------+-----------------+
| Before | After |
+-------------------+-----------------+
| EPC 14.10.14.pdf | EPC 141014.pdf |
| EPC 14.10.14.docx | EPC 141014.docx |
| 14.10.14.xlsx | 141014.xlsx |
| csharp.10.14.cs | csharp1014.cs |
+-------------------+-----------------+
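Most of the answers above share one idea: find the last dot, strip every dot from the part before it, and glue the extension back on. A minimal Python sketch of that idea follows; strip_inner_dots is a hypothetical helper name, and leaving dot-free names unchanged is a choice made for this sketch.

```python
def strip_inner_dots(filename):
    """Remove every '.' except the last one, which separates the extension."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no dot at all: nothing to strip
        return filename
    return stem.replace(".", "") + dot + ext


for name in ("EPC 14.10.14.pdf", "EPC 14.10.14.docx", "csharp.10.14.cs"):
    print(name, "->", strip_inner_dots(name))
```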

Find and replace by pattern

I have a table that has a column like the one below
url
----------------
dir=mp3\cat152AB&fileName=file-01.mp3
dir=mp3\cat2500DfDD00&fileName=file-02.mp3
dir=mp3\cat4500f0655&fileName=file-03.mp3
...
How can I delete extra strings and arrange the fields as follows in SQL Server.
url
----------------
file-01
file-02
file-03
...
You can use CHARINDEX and SUBSTRING:
SELECT SUBSTRING('dir=mp3\cat152AB&fileName=file-01.mp3',
    CHARINDEX('fileName=', 'dir=mp3\cat152AB&fileName=file-01.mp3') + 9,
    -- 12 = LEN('fileName=') + LEN('.mp3') - 1, so the '.mp3' extension is left out
    LEN('dir=mp3\cat152AB&fileName=file-01.mp3') - CHARINDEX('fileName=', 'dir=mp3\cat152AB&fileName=file-01.mp3') - 12
) AS FileName;
CHARINDEX and SUBSTRING can help you, please check the example:
select substring (field, charindex ('&fileName=', field) + len ('&fileName='), len (field) - len ('.mp3') + 1 - charindex ('&fileName=', field) - len ('&fileName='))
from (
select 'dir=mp3\cat152AB&fileName=file-01.mp3' field union all
select 'dir=mp3\cat2500DfDD00&fileName=file-02.mp3' union all
select 'dir=mp3\cat4500f0655&fileName=file-03.mp3'
) a
The information you want always seems to be the 11th to 5th characters before the end of the string. I would suggest a simple solution:
select left(right(url, 11), 7)
Here is a db<>fiddle.
Please try the following method.
It is using tokenization via XML/XQuery.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, url VARCHAR(255));
INSERT INTO @tbl (url) VALUES
('dir=mp3\cat152AB&fileName=file-01.mp3'),
('dir=mp3\cat2500DfDD00&fileName=file-02.mp3'),
('dir=mp3\cat4500f0655&fileName=file-03.mp3');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = '=';
SELECT id, url
, LEFT(x, CHARINDEX('.', x) - 1) AS Result
FROM @tbl
CROSS APPLY (SELECT CAST('<root><r><![CDATA[' +
REPLACE(url, @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML)) AS t1(c)
CROSS APPLY (VALUES (c.value('(/root/r[last()]/text())[1]', 'VARCHAR(100)'))) AS t2(x);
Output
+----+------------------------------------------------+---------+
| id | url | Result |
+----+------------------------------------------------+---------+
| 1 | dir=mp3\cat152AB&fileName=file-01.mp3 | file-01 |
| 2 | dir=mp3\cat2500DfDD00&fileName=file-02.mp3 | file-02 |
| 3 | dir=mp3\cat4500f0655&fileName=file-03.mp3 | file-03 |
+----+------------------------------------------------+---------+
I know we have an accepted answer but I wanted to chime in with another simple, high-performing solution that addresses file names and file extensions with various lengths. For fun I included a parameter that allows you to include the file extension if you choose.
--==== Easily Consumable Sample Data
DECLARE @link TABLE ([url] VARCHAR(100) UNIQUE);
INSERT @link VALUES ('dir=mp3\cat152AB&fileName=file-01.mp3'),
('dir=mp3\cat2500DfDD00&fileName=file-02.mp3'),
('dir=mp3\cat4500f0655&fileName=file-03.mp3'),
('dir=mp3\cat4500f0655&fileName=file-999.mp3'),
('dir=mp3\cat4500d9997&fileName=file-0021.prodigi');
--==== Allows you to determine if you want the file extension
DECLARE @exclude BIT=1;
SELECT l.[url], TheFile = SUBSTRING(l.[url], s.Pos, s.Ln-s.Pos- ((@exclude*(fl.Ln)-1)))
FROM @link AS l
CROSS APPLY (VALUES(CHARINDEX('.',REVERSE(l.[url])))) AS fl(Ln)
CROSS APPLY (VALUES(CHARINDEX('fileName=',l.[url])+9, LEN(l.[url]))) AS s(Pos,Ln);
@exclude=1 returns:
url TheFile
----------------------------------------------------- --------------
dir=mp3\cat152AB&fileName=file-01.mp3 file-01
dir=mp3\cat2500DfDD00&fileName=file-02.mp3 file-02
dir=mp3\cat4500d9997&fileName=file-0021.prodigi file-0021
dir=mp3\cat4500f0655&fileName=file-03.mp3 file-03
dir=mp3\cat4500f0655&fileName=file-999.mp3 file-999
@exclude=0 returns:
url TheFile
----------------------------------------------------- --------------
dir=mp3\cat152AB&fileName=file-01.mp3 file-01.mp3
dir=mp3\cat2500DfDD00&fileName=file-02.mp3 file-02.mp3
dir=mp3\cat4500d9997&fileName=file-0021.prodigi file-0021.prodigi
dir=mp3\cat4500f0655&fileName=file-03.mp3 file-03.mp3
dir=mp3\cat4500f0655&fileName=file-999.mp3 file-999.mp3
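For comparison, the same extraction (everything after fileName=, with the extension optionally dropped) can be sketched in Python. The function name file_name and its keep_extension flag are made up for this illustration; the flag plays the opposite role of @exclude in the answer above.

```python
def file_name(url, keep_extension=False):
    """Return the text after 'fileName=', optionally without its extension."""
    _, marker, name = url.partition("fileName=")
    if not marker:  # 'fileName=' not present in this value
        return None
    if keep_extension:
        return name
    stem, dot, _ = name.rpartition(".")
    return stem if dot else name


URLS = [
    r"dir=mp3\cat152AB&fileName=file-01.mp3",
    r"dir=mp3\cat4500f0655&fileName=file-999.mp3",
    r"dir=mp3\cat4500d9997&fileName=file-0021.prodigi",
]
for url in URLS:
    print(file_name(url), file_name(url, keep_extension=True))
```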

Parse SQL Column into separate columns

I'm looking to parse a sql column result into separate columns. Here is an example of the column...
Detail - Column name
'TaxID changed from "111" to "333". Address1 changed from "542 Test St." to "333 Test St". State changed from "FL" to "DF". Zip changed from "11111" to "22222". Country changed from "US" to "MX". CurrencyCode changed from "usd" to "mxn". RFC Number changed from "" to "test". WarehouseID changed from "6" to "1". '
I need to take the old TAXID, new TAXID, old country, and new country and put them in separate columns.
The Detail column will always have TAXID and Country, however the challenging part is that they don't always have the rest of data that I listed above. Sometimes it will contain city and other times it won't. This means the order is always different.
I would create a T-SQL procedure and use a CASE statement.
Count the double quotes. If there are 8 pairs, you know that you have old and new values; with only 4 pairs, you only have new values.
Then, using the double quotes as indexes for your substring, you can put the values into the table.
Good luck!
I was able to come up with something that worked.
In case anyone else gets a situation like this again perhaps posting my code will help.
DECLARE @document varchar(350);
set @document = 'TaxID changed from "111" to "222"'
declare @FirstQuote int
declare @SecondQuote int
declare @OldTaxID nvarchar(40)
declare @firstlength int
declare @ThirdQuote int
declare @FourthQuote int
declare @secondlength int
declare @NewTaxID nvarchar(40)
declare @oneplussecondquote int
declare @oneplusthirdquote int
-- Old value: the text between the first and second double quote
select @FirstQuote = CHARINDEX('"',@document)
set @FirstQuote = @FirstQuote + 1
select @SecondQuote = CHARINDEX('"',@document,@FirstQuote)
set @firstlength = @SecondQuote - @FirstQuote
select @OldTaxID = SUBSTRING(@document,@FirstQuote,@firstlength)
-- New value: the text between the third and fourth double quote
set @oneplussecondquote = @SecondQuote + 1
select @ThirdQuote = CHARINDEX('"',@document,@oneplussecondquote)
set @oneplusthirdquote = @ThirdQuote + 1
select @FourthQuote = CHARINDEX('"',@document,@oneplusthirdquote)
select @secondlength = @FourthQuote - @oneplusthirdquote
select @NewTaxID = SUBSTRING(@document,@oneplusthirdquote,@secondlength)
-- Show the extracted values
select @OldTaxID as OldTaxID, @NewTaxID as NewTaxID
You can switch out the string for this: 'Country changed from "US" to "MX"'
And it would grab the old country and new country
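If the Detail text can be processed outside the database, a regular expression handles the varying order of fields directly. The following Python sketch assumes every change follows the pattern Name changed from "old" to "new" and that values contain no double quotes; CHANGE and parse_changes are names invented for this example.

```python
import re

# One change looks like: <field name> changed from "<old>" to "<new>"
CHANGE = re.compile(r'([\w ]+?) changed from "([^"]*)" to "([^"]*)"')


def parse_changes(detail):
    """Map each field name to an (old_value, new_value) tuple."""
    return {
        match.group(1).strip(): (match.group(2), match.group(3))
        for match in CHANGE.finditer(detail)
    }


detail = (
    'TaxID changed from "111" to "333". '
    'Address1 changed from "542 Test St." to "333 Test St". '
    'Country changed from "US" to "MX". '
    'RFC Number changed from "" to "test". '
)
changes = parse_changes(detail)
print(changes["TaxID"], changes["Country"])
```

Looking fields up by name means it does not matter which optional fields are present or in what order they appear.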