Using STRING_SPLIT in SQL for delimited values - sql

@InStr = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0'
I have the string above in a variable @InStr and I want to use STRING_SPLIT to insert its values into a table.
As you can see, it needs a double split: first on '^' for the rows, then on '|' for the columns.
SELECT Value FROM STRING_SPLIT(@InStr,'^')
Produces:
0|ABC|3033.9|3032.4444|0|0|0
1|DEF|3033.2577|3033.053|3032.0808|0|0
2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0
3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0
Which is good, now I need to take each row and insert into a table.
I'm not sure how to combine the 2 splits to do the insert. The table has 7 columns which it would populate.
Any help appreciated.

First of all: You should avoid STRING_SPLIT() in almost every case. It does not guarantee that the items are returned in their original order. This might work in all your tests and then break in production with silly, hard-to-find errors.
There are various answers already, and the best one is the table type parameter. But if you cannot follow this route, I'd like to suggest two type-safe approaches:
DECLARE @InStr NVARCHAR(MAX) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0';
--xml approach (working for almost any version)
--We do the double split in one single action and return a nested XML with <x> and <y> elements
--We can fetch the values type-safe from their 1-based position:
SELECT x.value('y[1]','int') AS [First]
,x.value('y[2]','varchar(100)') AS [Second]
,x.value('y[3]','decimal(28,8)') AS Third
,x.value('y[4]','decimal(28,8)') AS Fourth
,x.value('y[5]','decimal(28,8)') AS Fifth
,x.value('y[6]','decimal(28,8)') AS Sixth
,x.value('y[7]','decimal(28,8)') AS Seventh
FROM (VALUES(CAST('<x><y>' + REPLACE(REPLACE(@InStr,'|','</y><y>'),'^','</y></x><x><y>') + '</y></x>' AS XML)))v(Casted)
CROSS APPLY Casted.nodes('/x') b(x);
--json approach (needs v2016+)
--faster than XML
--We transform your string to a JSON-array with one item per row and use another OPENJSON to retrieve the array's items.
--The WITH-clause brings in implicit pivoting to retrieve the items type-safe as columns:
SELECT b.*
FROM OPENJSON(CONCAT('[["',REPLACE(@InStr,'^','"],["'),'"]]')) a
CROSS APPLY OPENJSON(CONCAT('[',REPLACE(a.[value],'|','","'),']'))
WITH([First] INT '$[0]'
,[Second] VARCHAR(100) '$[1]'
,[Third] DECIMAL(28,8) '$[2]'
,[Fourth] DECIMAL(28,8) '$[3]'
,[Fifth] DECIMAL(28,8) '$[4]'
,[Sixth] DECIMAL(28,8) '$[5]'
,[Seventh] DECIMAL(28,8) '$[6]') b;
Both approaches return the same result:
+-------+--------+---------------+---------------+---------------+---------------+------------+
| First | Second | Third | Fourth | Fifth | Sixth | Seventh |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 0 | ABC | 3033.90000000 | 3032.44440000 | 0.00000000 | 0.00000000 | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 1 | DEF | 3033.25770000 | 3033.05300000 | 3032.08080000 | 0.00000000 | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 2 | JHI | 3032.83760000 | 3033.25960000 | 3033.22590000 | 3033.32200000 | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 3 | XYZ | 3032.83760000 | 3032.83760000 | 3032.83760000 | 3032.83760000 | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+

Instead of passing a string from .NET like 'a|b|c^d|e|f' and then having to parse it, leave it in its original structure (DataTable?) and create a table type in SQL Server. Then you can pass in your structure instead of this cobbled-together string.
In SQL Server:
CREATE TYPE dbo.MyTableType AS TABLE
(
ColumnA int,
ColumnB nvarchar(32),
...
);
GO
CREATE PROCEDURE dbo.ShowArray
@DataTable dbo.MyTableType READONLY -- table-valued parameters must be declared READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT ColumnA, ColumnB, ...
FROM @DataTable;
END
In C# (untested and incomplete):
DataTable dt = new DataTable();
dt.Columns.Add("ColumnA", typeof(Int32));
dt.Columns.Add("ColumnB", typeof(String));
...
DataRow dr = dt.NewRow();
dr[0] = 1;
dr[1] = "foo";
...
dt.Rows.Add(dr);
...
SqlCommand cmd = new SqlCommand("dbo.ShowArray", connectionObject);
cmd.CommandType = CommandType.StoredProcedure;
SqlParameter tvp1 = cmd.Parameters.AddWithValue("@DataTable", dt);
tvp1.SqlDbType = SqlDbType.Structured;
...
More on this shift away from splitting strings here and, actually, in this answer as well:
https://stackoverflow.com/a/11105413/61305

You can use a recursive CTE:
declare @InStr varchar(max) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0'
;
with cte as (
select row_number() over (order by (select null)) as id, convert(varchar(max), null) as el, Value + '|' as rest, 0 as lev
from string_split(@InStr, '^')
union all
select id, left(rest, charindex('|', rest) - 1),
stuff(rest, 1, charindex('|', rest), ''),
lev + 1
from cte
where rest <> ''
)
select max(case when lev = 1 then el end),
max(case when lev = 2 then el end),
max(case when lev = 3 then el end),
max(case when lev = 4 then el end),
max(case when lev = 5 then el end),
max(case when lev = 6 then el end),
max(case when lev = 7 then el end)
from cte
group by id;
Here is a db<>fiddle.
Unfortunately, you can't safely use string_split() because it does not provide the offset for the values returned.

For a subsequent splitting of pipe-separated substrings you can utilise openjson(), as demonstrated in the example below:
declare @InStr varchar(max) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0';
select p.*
from (
select ss.value as [RowId], oj.[key] as [ColumnId], oj.value as [ColumnValue]
from string_split(@InStr,'^') ss
cross apply openjson('["' + replace(ss.value, '|', '","') + '"]', '$') oj
) q
pivot (
min(q.ColumnValue)
for q.[ColumnId] in ([0], [1], [2], [3], [4], [5], [6])
) p;
There are many caveats with this approach, however. The most prominent are:
You need SQL Server 2016 or later, and the database compatibility level needs to be 130 or above.
If your data is of any size worth mentioning (1 MB or more), this code might run unacceptably slowly. String manipulation is not the strong point of SQL Server.
Personally, I would recommend parsing this string outside of SQL. If it's a flat file you are importing, an SSIS data flow will be much easier to develop and faster to run. If it's an application, then redesign it to pass either a suitable table type or, at the very least, an XML / JSON blob.
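As a rough illustration of parsing the string outside of SQL, here is a minimal Python sketch. The function name parse_rows, the seven-field check and the chosen column types (int, text, five decimals) are assumptions taken from the sample data in the question, not part of any answer above.

```python
from decimal import Decimal


def parse_rows(raw, row_sep="^", col_sep="|"):
    """Split a doubly delimited string into typed 7-tuples (hypothetical helper)."""
    rows = []
    for line in raw.split(row_sep):
        fields = line.split(col_sep)
        if len(fields) != 7:
            raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
        # Assumed layout: integer id, text code, then five decimal values.
        rows.append((int(fields[0]), fields[1], *(Decimal(f) for f in fields[2:])))
    return rows


in_str = "0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0"
rows = parse_rows(in_str)
for row in rows:
    print(row)
```

The resulting tuples keep their original order and could then be sent to the server as a table-valued parameter or through a parameterised bulk insert, so no string splitting is needed on the SQL side.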

I generate an INSERT statement and then execute it: first I split the string on '^', then I build the INSERT statement from the pieces.
Note:
I am assuming that the first column is a single character and the second column is a three-letter code (the quotes are inserted at fixed positions).
I am assuming that the sort order of the rows doesn't matter.
declare @instr varchar(max) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0'
;
declare @insertStmt VARCHAR(max) ='INSERT INTO TABLEName VALUES '+ CHAR(13) + CHAR(10);
SELECT @insertStmt += CONCAT('(',replace(stuff(stuff(value,3,0,''''),7,0,''''),'|',','),'),')
from STRING_SPLIT(@instr,'^')
SELECT @insertStmt = STUFF(@insertStmt,len(@insertStmt),1,'')
select @insertStmt
EXEC(@insertStmt)
The generated statement:
INSERT INTO TABLEName VALUES
(0,'ABC',3033.9,3032.4444,0,0,0),(1,'DEF',3033.2577,3033.053,3032.0808,0,0),(2,'JHI',3032.8376,3033.2596,3033.2259,3033.322,0),(3,'XYZ',3032.8376,3032.8376,3032.8376,3032.8376,0)

Related

Extract string using SQL Server 2012

I have a string in the form of
<div>#FIRST#12345#</div>
How do I extract the number part from this string using T-SQL in SQL Server 2012? Note the number has variable length
Using just t-sql string functions you can try:
create table t(col varchar(50))
insert into t select '<div>#FIRST#12345#</div>'
insert into t select '<div>#THIRD#543#</div>'
insert into t select '<div>#SECOND#3690123#</div>'
select col,
case when p1.v=0 or p2.v <= p1.v then ''
else Substring(col, p1.v, p2.v-p1.v)
end ExtractedNumber
from t
cross apply(values(CharIndex('#',col,7) + 1))p1(v)
cross apply(values(CharIndex('#',col, p1.v + 1)))p2(v)
Output:
Caveat, this doesn't handle any "edge" cases and assumes data is as described.
Shooting from the hip due to a missing minimal reproducible example.
Assuming that it is XML data type column.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xmldata XML);
INSERT INTO @tbl (xmldata) VALUES
('<div>#FIRST#12345#</div>'),
('<div>#FIRST#770770#</div>');
-- DDL and sample data population, end
SELECT t.*
, LEFT(x, CHARINDEX('#', x) - 1) AS Result
FROM @tbl t
CROSS APPLY xmldata.nodes('/div/text()') AS t1(c)
CROSS APPLY (SELECT REPLACE(c.value('.', 'VARCHAR(100)'), '#FIRST#' ,'')) AS t2(x);
Output
+----+---------------------------+--------+
| ID | xmldata | Result |
+----+---------------------------+--------+
| 1 | <div>#FIRST#12345#</div> | 12345 |
| 2 | <div>#FIRST#770770#</div> | 770770 |
+----+---------------------------+--------+

Can I replace substrings in a formula stored in a string in SQL?

I need to replace values within a formula stored as a string in SQL.
Example formulas stored in a column:
'=AA+BB/DC'
'=-(AA+CC)'
'=AA/BB+DD'
I have values for AA, BB etc. stored in another table.
Can I find and replace 'AA', 'BB' and so forth from within the formulas with numeric values to actually calculate the formula?
I assume I also need to replace the arithmetic operators ('+' , '/') from strings to actual signs, and if so is there a way to do it?
Desired Result
Assuming: AA = 10, BB = 20, DC = 5
I would need
'=AA+BB/DC' converted to 10+20/5 and a final output of 14
Please note that formulas can change in the future so I would need something resilient to that.
Thank you!
Okay, so this is a real hack, but I was intrigued by your question. You could turn my example into a function and then refactor it to your specific needs.
Note: using TRANSLATE requires SQL Server 2017. This could be a deal-breaker for you right there. TRANSLATE simplifies the replacement process greatly.
This example is just that--an example. A hack. Performance issues are unknown. You still need to do your diligence with testing.
-- Create a mock-up of the values table/data.
DECLARE @Values TABLE ( [key] VARCHAR(2), [val] INT );
INSERT INTO @Values ( [key], [val] ) VALUES
( 'AA', 10 ), ( 'BB', 20 ), ( 'CC', 6 ), ( 'DC', 5 );
-- Variable passed in to function.
DECLARE @formula VARCHAR(255) = '=(AA+BB)/DC';
-- Remove unnecessary mathematical characters from the formula values.
DECLARE @vals VARCHAR(255) = REPLACE ( TRANSLATE ( @formula, '=()', '___' ), '_', '' );
-- Remove any leading mathematical operators from @vals.
WHILE PATINDEX ( '[A-Z]', LEFT ( @vals, 1 ) ) = 0
SET @vals = SUBSTRING ( @vals, 2, LEN ( @vals ) );
-- Use SQL hack to replace placeholder values with actual values...
SELECT @formula = REPLACE ( @formula, fx.key_val, v.val )
FROM (
SELECT
[value] AS [key_val],
ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) AS [key_id]
FROM STRING_SPLIT ( TRANSLATE ( @vals, '+/*-', ',,,,' ), ',' )
) AS fx
INNER JOIN @Values v
ON fx.[key_val] = v.[key]
ORDER BY
fx.[key_id]
-- Return updated formula.
SELECT @formula AS RevisedFormula;
-- Return the result (remove the equals sign).
SET @formula = FORMATMESSAGE ( 'SELECT %s AS FormulaResult;', REPLACE ( @formula, '=', '' ) );
EXEC ( @formula );
SELECT @formula AS RevisedFormula; returns:
+----------------+
| RevisedFormula |
+----------------+
| =(10+20)/5 |
+----------------+
The last part of my example uses EXEC to do the math. You cannot use EXEC in a function.
-- Return the result (remove the equals sign).
SET @formula = FORMATMESSAGE ( 'SELECT %s AS FormulaResult;', REPLACE ( @formula, '=', '' ) );
EXEC ( @formula );
Returns
+---------------+
| FormulaResult |
+---------------+
| 6 |
+---------------+
Changing the formula value to =-(AA+CC) returns:
+----------------+
| RevisedFormula |
+----------------+
| =-(10+6) |
+----------------+
+---------------+
| FormulaResult |
+---------------+
| -16 |
+---------------+
Pay attention to operator precedence in your formulas. Your original example of =AA+BB/DC returns 14, because the division is evaluated before the addition, whereas =(AA+BB)/DC returns 6. I used =(AA+BB)/DC for my example.
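If the formula can be evaluated in application code instead, the same substitute-then-calculate idea can be sketched in Python without executing arbitrary text. This is only a sketch under assumptions: evaluate_formula and the values dictionary are hypothetical names, only +, -, *, / and parentheses are accepted, and / is true division here, so results can differ from T-SQL integer division.

```python
import ast
import operator
import re

# Only these operators are accepted; anything else in a formula is rejected.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported element in formula")


def evaluate_formula(formula, values):
    """Return (substituted text, numeric result) for a formula such as '=AA+BB/DC'."""
    expr = formula.lstrip("=")
    # Replace each whole identifier with its number; unknown names raise KeyError.
    substituted = re.sub(r"[A-Za-z_]\w*", lambda m: str(values[m.group(0)]), expr)
    return substituted, _eval(ast.parse(substituted, mode="eval"))


values = {"AA": 10, "BB": 20, "CC": 6, "DC": 5}
for f in ("=AA+BB/DC", "=(AA+BB)/DC", "=-(AA+CC)"):
    print(f, evaluate_formula(f, values))
```

Because whole identifiers are matched, a key such as AA is never replaced inside a longer name, and new formulas need no code changes as long as they use the supported operators.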

SQL: How to split column by character count [duplicate]

I have one column with letters. I want to split this column into chunks of three characters. What T-SQL code (Microsoft SQL Server) would I need? I have read about splitting by a special character, but I am not sure how to split by character count when the number of resulting columns is not fixed either.
You can do :
select t.*, substring(col, 1, 3), substring(col, 4, 3), substring(col, 7, 3)
from table t
If you really want to do this dynamically, as stated in the question, and have a query that creates just as many columns as needed, then you do need dynamic SQL.
Here is a solution that uses a recursive CTE to generate the query string.
declare @sql nvarchar(max);
with cte as (
select
1 pos,
cast('substring(code, 1, 3) col1' as nvarchar(max)) q,
max(len(code)) max_pos from mytable
union all
select
pos + 1,
cast(
q
+ ', substring(code, ' + cast(pos * 3 + 1 as nvarchar(3))
+ ', 3) col'
+ cast(pos + 1 as nvarchar(3))
as nvarchar(max)),
max_pos
from cte
where pos < (max_pos + 2) / 3 -- round up so a trailing partial chunk still gets a column
)
select @sql = N'select ' + q + ' from mytable'
from cte
where len(q) = (select max(len(q)) from cte);
select @sql sql;
EXEC sp_executesql @sql;
The anchor of the recursive query computes the length of the longest string in column code. Then, the recursive part generates a series of substring() expressions for each chunk of 3 characters, with dynamic column names like col1, col2 and so on. You can then (debug and) execute that query string.
Demo on DB Fiddle:
-- debug
| sql |
| :---------------------------------------------------------------------------------------------------------------------------------- |
| select substring(code, 1, 3) col1, substring(code, 4, 3) col2, substring(code, 7, 3) col3, substring(code, 10, 3) col4 from mytable |
-- results
col1 | col2 | col3 | col4
:--- | :--- | :--- | :---
ABC | DEF | GHI |
XYZ | ABC | |
JKL | MNO | PQR | STU
ABC | DEF | |
Try it like this, which does not need any dynamic SQL (as long as you can specify a maximum count of columns):
First we need to define a mockup scenario to simulate your issue:
DECLARE @tbl TABLE(ID INT IDENTITY, YourString VARCHAR(100));
INSERT INTO @tbl VALUES ('AB')
,('ABC')
,('ABCDEFGHI')
,('XYZABC')
,('JKLMNOPQRSTU')
,('ABCDEF');
--We can set the chunk length generically. Try it with other values...
DECLARE @ChunkLength INT=3;
--The query
SELECT p.*
FROM
(
SELECT t.ID
,CONCAT('Col',A.Nmbr) AS ColumnName
,SUBSTRING(t.YourString,(A.Nmbr-1)*@ChunkLength + 1,@ChunkLength) AS Chunk
FROM @tbl t
CROSS APPLY
(
SELECT TOP((LEN(t.YourString)+(@ChunkLength-1))/@ChunkLength) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values
) A(Nmbr)
) src
PIVOT
(
MAX(Chunk) FOR ColumnName IN(Col1,Col2,Col3,Col4,Col5,Col6 /*add the maximum column count here*/)
) p;
The idea in short:
By using an APPLY call we can create a row-wise tally. This will return multiple rows per input string. The row count is defined by the computed TOP-clause.
We use the row-wise tally first to create a column Name and second as parameters in SUBSTRING().
Finally we can use PIVOT to return this as horizontal list.
One hint about generic result sets:
This might be a matter of religion, but, at least from my point of view, I would prefer a fixed result set with a lot of empty columns over a generically defined set. The consumer should know the result format in advance...
You can use exactly the same query as a dynamically created SQL statement. The only thing you would need to change is the actual list of column names in the PIVOT's IN-clause.
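The chunk arithmetic used above (chunk n starts at offset (n-1) * chunk length, and the chunk count is the length rounded up) can be sketched outside SQL as well. The Python below is only an illustration; split_into_chunks and to_fixed_columns are hypothetical helper names, and the padding with None mirrors the fixed result set with empty columns.

```python
def split_into_chunks(text, chunk_length=3):
    """Return consecutive pieces of text, each at most chunk_length characters long."""
    if chunk_length < 1:
        raise ValueError("chunk_length must be at least 1")
    return [text[i:i + chunk_length] for i in range(0, len(text), chunk_length)]


def to_fixed_columns(text, chunk_length=3, max_columns=6):
    """Pad the chunk list with None so every row has exactly max_columns entries."""
    chunks = split_into_chunks(text, chunk_length)
    if len(chunks) > max_columns:
        raise ValueError("string needs more columns than max_columns allows")
    return chunks + [None] * (max_columns - len(chunks))


for value in ("AB", "ABC", "ABCDEFGHI", "XYZABC", "JKLMNOPQRSTU", "ABCDEF"):
    print(to_fixed_columns(value))
```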

Convert Data in a Column to a row in SQL Server

Fairly new to SQL, so I do apologise!
Currently I have the following SQL Query:
select [data]
from Database1.dbo.tbl_Data d
join Database1.tbl_outbound o on d.session_id = o.session_id
where o.campaign_id = 1047
and d.session_id = 12
This returns ONE column which looks like this (and it can return different number of rows, depending on campaign_id and session_id!):
[data]
[1] Entry 1
[2] Entry 2
[3] Entry 3
[4] Entry 4
[5] Entry 5
.....
[98] Entry 98
[99] Entry 99
I would like to convert the data so they are displayed in 1 row and not 1 column, for example:
[data1] [data2] [data3] [data4] [data5] .... [data98] [data99]
[1] Entry 1 Entry 2 Entry 3 Entry 4 Entry 5 .... Entry 98 Entry 99
I hope I have explained that well enough! Thanks! :)
I have seen some information floating around about Pivot and Unpivot, but couldn't get it to play ball!
Try this dynamic SQL, which should meet your requirement:
IF OBJECT_ID('tempdb..#Temp')IS NOT NULL
DROP TABLE #Temp
CREATE TABLE #Temp (data VARCHAR(100))
GO
IF OBJECT_ID('tempdb..#FormatedTable')IS NOT NULL
DROP TABLE #FormatedTable
Go
INSERT INTO #Temp(data)
SELECT 'Entry1' UNION ALL
SELECT 'Entry2' UNION ALL
SELECT 'Entry3' UNION ALL
SELECT 'Entry4' UNION ALL
SELECT 'Entry5'
SELECT ROW_NUMBER()OVER(ORDER BY Data) AS SeqId,
Data,
'Data'+CAST(ROW_NUMBER()OVER(ORDER BY Data) AS VARCHAR(100)) AS ReqColumn
INTO #FormatedTable
FROM #Temp
DECLARE @Sql nvarchar(max),
@DynamicColumn nvarchar(max),
@MaxDynamicColumn nvarchar(max)
SELECT @DynamicColumn = STUFF((SELECT ', '+QUOTENAME(ReqColumn)
FROM #FormatedTable FOR XML PATH ('')),1,1,'')
SELECT @MaxDynamicColumn = STUFF((SELECT ', '+'MAX('+(ReqColumn)+') AS '+QUOTENAME(CAST(ReqColumn AS VARCHAR(100)))
FROM #FormatedTable FOR XML PATH ('')),1,1,'')
SET @Sql=' SELECT ROW_NUMBER()OVER(ORDER BY (SELECT 1)) AS SeqId, '+ @MaxDynamicColumn+'
FROM
(
SELECT * FROM #FormatedTable
) AS src
PIVOT
(
MAX(Data) FOR [ReqColumn] IN ('+@DynamicColumn+')
) AS Pvt
'
EXEC (@Sql)
PRINT @Sql
Result
SeqId Data1 Data2 Data3 Data4 Data5
----------------------------------------------
1 Entry1 Entry2 Entry3 Entry4 Entry5
There is no really simple way. You can use pivot or conditional aggregation. I prefer the latter:
select max(case when left(data, 3) = '[1]' then data end) as data_001,
max(case when left(data, 3) = '[2]' then data end) as data_002,
max(case when left(data, 5) = '[100]' then data end) as data_100
from Database1.dbo.tbl_Data d join
Database1.tbl_outbound o
on d.session_id = o.session_id
where o.campaign_id = 1047 and d.session_id = 12;
Note that the columns are fixed, so you will always have 100 columns, regardless of the number of actual values in the data.
If you need a flexible number of columns, then you need dynamic pivoting, which requires constructing the query as a string and then executing the string.
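To make "constructing the query as a string and then executing the string" concrete, here is a small Python sketch that builds one conditional-aggregation column per row and runs the result. It uses an in-memory SQLite database purely as a stand-in for SQL Server; the table tbl_data, its seq column and the function build_pivot_query are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for the real table; seq is an assumed ordering column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_data (seq INTEGER, data TEXT)")
conn.executemany("INSERT INTO tbl_data VALUES (?, ?)",
                 [(i, f"Entry {i}") for i in range(1, 6)])


def build_pivot_query(connection):
    """Build one MAX(CASE ...) column per row currently present in tbl_data."""
    seqs = [row[0] for row in
            connection.execute("SELECT seq FROM tbl_data ORDER BY seq")]
    if not seqs:
        raise ValueError("no rows to pivot")
    # int() guards the generated SQL text: only integers are ever interpolated.
    columns = ", ".join(
        f"MAX(CASE WHEN seq = {int(s)} THEN data END) AS data{int(s)}" for s in seqs)
    return f"SELECT {columns} FROM tbl_data"


query = build_pivot_query(conn)
cursor = conn.execute(query)
headers = [d[0] for d in cursor.description]
row = cursor.fetchone()
print(headers)
print(row)
```

The number of output columns follows the data, which is exactly why the statement has to be generated at run time rather than written by hand.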
The easiest way to do that is to utilize SQLCLR.
Check out the solution and explanation on An Easier Way of Transposing Query Result in SQL Server

SQL Server Management Studio - UNPIVOT Query for n columns, n rows

I wanted to unpivot a dataset which looks like this:
To this:
+-------------+-----------------+---------+
| Scenario ID | Distribution ID | Value |
+-------------+-----------------+---------+
| 0 | Number1 | 10 |
| 0 | Number2 | 19 |
| 0 | Number3 | 34.3 |
| 0 | Number4 | 60.31 |
| 0 | Number5 | 104.527 |
+-------------+-----------------+---------+
Using SQL Server Management Studio.
I think I should use a code which is based on something like this:
SELECT *
FROM
(
SELECT 1
FROM table_name
) AS cp
UNPIVOT
(
Scenario FOR Scenarios IN (*
) AS up;
Can anyone help me with this? I do not know how to code, just starting.
Thanks in advance!
In case you need a dynamic unpivot solution (that can handle any number of columns) try this:
create table [dbo].[Test] ([ScenarioID] int, [Number1] decimal(10,3),
[Number2] decimal(10,3), [Number3] decimal(10,3),
[Number4] decimal(10,3), [Number5] decimal(10,3))
insert into [dbo].[Test] select 0, 10, 19, 34.3, 60.31, 104.527
declare @sql nvarchar(max) = ''
declare @cols nvarchar(max) = ''
select @cols = @cols +','+ QUOTENAME(COLUMN_NAME)
from INFORMATION_SCHEMA.COLUMNS
where TABLE_SCHEMA='dbo' and TABLE_NAME='test' and COLUMN_NAME like 'Number%'
order by ORDINAL_POSITION
set @cols = substring(@cols, 2, LEN(@cols))
set @sql = @sql + ' select u.[ScenarioID], u.[DistributionID], u.[Value]
from [dbo].[Test] s
unpivot
(
[Value]
for [DistributionID] in ('+ @cols + ')
) u;'
execute(@sql)
Result:
I would use apply :
select t.scenarioid, tt.distributionId, tt.value
from table t cross apply
( values (Number1, 'Number1'), (Number2, 'Number2'), . . .
) tt (value, distributionId);
Yes, you need to list all the Number columns explicitly, but only once, when you write the query.
You could use VALUES:
SELECT T.scenarioId, s.*
FROM tab t
CROSS APPLY (VALUES ('Number1', t.Number1),
('Number2', t.Number2)) AS s(DistId, Val)
I use cross apply for this:
select t.scenarioid, v.*
from t cross apply
(values ('Number1', number1), ('Number2', number2), . . .
) v(distributionId, number);
You need to list out all the numbers.
Why do I prefer cross apply over unpivot? I find the unpivot syntax to be very specific. It pretty much does exactly one thing. On the other hand, apply introduces lateral joins. These are very powerful, and apply can be used in many different situations.
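What the VALUES list inside the apply does, namely emitting one (name, value) pair per listed column for every input row, can be sketched in a few lines of Python. The function unpivot and the dictionary-based row layout are assumptions for illustration; the sample numbers are the ones from the question.

```python
def unpivot(rows, id_column, value_columns):
    """Turn each wide row into one (id, column name, value) tuple per value column."""
    long_rows = []
    for row in rows:
        for name in value_columns:
            long_rows.append((row[id_column], name, row[name]))
    return long_rows


wide = [{"ScenarioID": 0, "Number1": 10, "Number2": 19,
         "Number3": 34.3, "Number4": 60.31, "Number5": 104.527}]
# Pick the columns by prefix, similar to COLUMN_NAME like 'Number%'.
value_columns = [c for c in wide[0] if c.startswith("Number")]
long_rows = unpivot(wide, "ScenarioID", value_columns)
for r in long_rows:
    print(r)
```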