Split a column in T-SQL - sql

I'm new to using T-SQL,
how do you separate an NVARCHAR column with the following information
[3293,"Maria","CA","Auto"]
[67093,"Joana","WA","Manual"]
I would like to get 4 columns like this
col1 col2 col3 col4
3293 Maria CA Auto
67093 Joana WA Manual
Thanks

Without the need for an aggregation.
Example
-- Pull the four array elements out of the JSON-formatted NVARCHAR column,
-- one JSON_VALUE call per array position (no aggregation or pivot needed).
SELECT JSON_VALUE([SomeCol], '$[0]') AS Col1
      ,JSON_VALUE([SomeCol], '$[1]') AS Col2
      ,JSON_VALUE([SomeCol], '$[2]') AS Col3
      ,JSON_VALUE([SomeCol], '$[3]') AS Col4
FROM YourTable A
Results
Col1 Col2 Col3 Col4
3293 Maria CA Auto
67093 Joana WA Manual

You can use openjson and aggregate:
-- Shred the JSON array with OPENJSON (one row per element, [key] = 0..3),
-- then fold the four rows back into columns via conditional aggregation.
select
    max(case [key] when 0 then value end) as col1,
    max(case [key] when 1 then value end) as col2,
    max(case [key] when 2 then value end) as col3,
    max(case [key] when 3 then value end) as col4
from OpenJson('[3293,"Maria","CA","Auto"]')

One more suggestion uses a trick to stuff a json array into another array. This allows for a type-safe(!) and pivot/aggregate-free WITH-clause:
Declare a dummy table to provide a showcase (please do this yourself in your next question).
-- Fixed: table variables take the @ prefix; DECLARE ... TABLE with a # name
-- is invalid syntax (# denotes a temp table, which is created with CREATE TABLE).
DECLARE @dummyTable TABLE(ID INT IDENTITY, YourJson NVARCHAR(MAX));
INSERT INTO @dummyTable(YourJson) VALUES
 ('[3293,"Maria","CA","Auto"]')
,('[67093,"Joana","WA","Manual"]');
--the query: wrap each row's array in an outer [...] so OPENJSON returns the
--whole inner array as one row; the WITH clause then extracts typed columns
--from that row by JSON path.
SELECT t.ID
      ,JsonValues.*
FROM @dummyTable t
CROSS APPLY OPENJSON(CONCAT('[',t.YourJson,']'))
WITH
(
    TheNumber int '$[0]'
    ,Firstname nvarchar(100) '$[1]'
    ,[State] nvarchar(100) '$[2]'
    ,[Type] nvarchar(100) '$[3]'
) JsonValues;
The idea in short:
Using CONCAT() we add a [ and a ] around your array.
Now we can use WITH specifying the resulting column with name, type and the json path to grab it.
The result:
+----+-----------+-----------+-------+--------+
| ID | TheNumber | Firstname | State | Type |
+----+-----------+-----------+-------+--------+
| 1 | 3293 | Maria | CA | Auto |
+----+-----------+-----------+-------+--------+
| 2 | 67093 | Joana | WA | Manual |
+----+-----------+-----------+-------+--------+

Use 'OpenJson' and pass the column values in a loop.
--Step1: Create a temporary table and add ROW_NUMBER
-- (#Temp_table / #TEMP_FINAL are genuine temp tables, so # is correct there;
--  the scalar variables below must use the @ prefix — fixed.)
select ROW_NUMBER() over( order by COL) as r,*
INTO #Temp_table
from YourTable;
--Step2: Declare and set variables
DECLARE @count INT;
DECLARE @row INT;
DECLARE @JSON NVARCHAR(250);
set @count= (select COUNT(1) FROM #Temp_table);
SET @row = 1;
--Step3: Create final table (here, a temp final table)
CREATE TABLE #TEMP_FINAL
(COL1 INT,COL2 VARCHAR(100),COL3 VARCHAR(100),COL4 VARCHAR(100));
--Step4: Iterate row by row. (A set-based CROSS APPLY OPENJSON would avoid the
-- loop entirely, as the other answers show; kept here to match the approach.)
WHILE (@row <= @count) BEGIN
    SELECT @JSON=COL FROM #Temp_table WHERE @row=r;
    INSERT INTO #TEMP_FINAL
    select
        max(case when [key] = 0 then value end) col1,
        max(case when [key] = 1 then value end) col2,
        max(case when [key] = 2 then value end) col3,
        max(case when [key] = 3 then value end) col4
    from OpenJson(@JSON);
    SET @row += 1;
END
--Step5: Select the values
SELECT * FROM #TEMP_FINAL
To have understanding, you can review following links
Row_Number: https://www.sqlservertutorial.net/sql-server-window-functions/sql-server-row_number-function/
While Loop: https://www.sqlshack.com/sql-while-loop-with-simple-examples/

Related

String Split Ignore Last delimiter if no data

I am string splitting some values that are comma delimited into rows.
However some values have an extra comma on the end.
Example
Userid | Value
1 | A,B,C,D,
2 | F,H
Code
select value
from string_split('A,B,C,D,',',')
Current Output
UserId | Value
1 | A
1 | B
1 | C
1 | D
1 |
Is there any way to make the string split function ignore the final comma if no data follows it?
Desired Output
UserId | Value
1 | A
1 | B
1 | C
1 | D
Using MSSQL
Just add a WHERE clause like this:
select value
from string_split('A,B,C,D,',',')
WHERE value <> ''
The STRING_SPLIT function is not supported on lower versions of SQL Server, so first create a user-defined function to split the given string, then join that function with your select query. Below is a sample that produces your expected result.
Created User defined Function
-- Splits a comma-separated @Value into rows via the XML-nodes trick, tags each
-- row with @Userid, and drops empty tokens (trailing-comma artifacts).
-- Fixed: parameters, local table variables and the return table all use the
-- @ prefix (# is reserved for temp tables and is invalid here).
-- NOTE(review): the XML cast breaks if values contain markup characters such
-- as & or < — fine for simple letter codes like the sample data; confirm for
-- real inputs.
CREATE FUNCTION [dbo].[Udf_StringSplit]
(
    @Userid INT,
    @Value VARCHAR(1000)
)
RETURNS @Result TABLE(
    Userid INT,
    Value VARCHAR(100)  -- widened from 10 to match @Data and avoid truncation
)
AS BEGIN
    DECLARE @Data AS TABLE
    (
        Userid INT,
        Value VARCHAR(100)
    )
    INSERT INTO @Data(Userid,Value)
    SELECT @Userid, @Value
    INSERT INTO @Result(Userid,Value)
    SELECT Userid,
           Split.a.value('.','nvarchar(1000)') AS Value
    FROM
    (
        SELECT Userid,
               -- '<S>a</S><S>b</S>...': each CSV token becomes an XML element
               CAST('<S>'+REPLACE(@Value,',','</S><S>')+'</S>' AS XML) Value
        FROM @Data
    ) AS A
    CROSS APPLY Value.nodes('S') AS Split(a)
    WHERE Userid=@Userid AND Split.a.value('.','nvarchar(1000)') <>''
    RETURN
END
GO
Sample data table
-- Sample rows in a table variable (@ prefix — DECLARE with a # name is invalid).
DECLARE @Data AS TABLE(Userid INT , Value VARCHAR(100))
INSERT INTO @Data
SELECT 1,'A,B,C,D,' UNION ALL
SELECT 2,'F,H'
Sql script to get the expected result
-- Split each user's CSV by CROSS APPLYing the UDF.
-- Assumes a table variable @Data(Userid, Value) exists in the same batch
-- (table variables, unlike temp tables, do not survive a GO).
SELECT d.Userid,
       f.Value
FROM @Data d
CROSS APPLY [dbo].[Udf_StringSplit] (d.Userid,d.Value) AS f
WHERE d.Userid=1
GO
Result
Userid Value
------------
1 A
1 B
1 C
1 D

T-SQL: Efficient way to add up column values

Now I'm sure this has been asked and superbly been answered on here. However, I am unable to find the answer since it touches many keywords.
I basically want to replace a table of the form:
Type amount param note
7 2 str1 NULL
42 12 str2 NULL
128 7 str3 samplenote
42 12 NULL NULL
101 4 str4 NULL
42 12 NULL NULL
7 1 str1 samplenote
128 2 str5 NULL
with a table like:
Type amount param note
7 3 str1 combined
42 36 NULL combined
128 9 NULL combined
101 4 str4 combined
In words, I seek to sum up the amount parameter based on its type while declaring param = NULL for all "unclear" fields. (param should be NULL when the param values of combined Types have more than one different content; else, param should have the original content.)
With my python background, I tackled this task with a for loop approach, iterating through the types, adding a new row for every type with summed up amount and note = 'combined', to then delete the remaining rows (see below). There has to be a more efficient way with some JOIN statement I'm sure. But how would that look like?
FYI, this is the solution I am working on (not functioning yet!):
/*** dbo.sourcetable holds all possible Type values ***/
-- Asker's iterative approach (the set-based GROUP BY answers below are simpler
-- and faster). Syntax fixed here: scalar variables use @ (not #), and the
-- INSERT column list was missing its closing parenthesis.
CREATE PROCEDURE [sumup]
AS
BEGIN
    DECLARE @i int = (SELECT TOP (1) Type FROM [dbo].[sourcetable] ORDER BY Type)
    DECLARE @MaxType int = (SELECT TOP (1) Type FROM [dbo].[sourcetable] ORDER BY Type DESC)
    DECLARE @sum int
    BEGIN TRY
        WHILE @i <= @MaxType
        BEGIN
            IF EXISTS (SELECT * FROM [dbo].[worktable] WHERE Type = @i)
            BEGIN
                SET @sum = (SELECT SUM(amount) FROM [dbo].[worktable] WHERE Type = @i)
                BEGIN
                    ;WITH cte AS (SELECT * FROM [dbo].[worktable] WHERE Type = @i)
                    INSERT INTO [dbo].[worktable]
                        ([Type]
                        ,[amount]
                        ,[param]
                        ,[note])  -- closing parenthesis was missing
                    SELECT
                        cte.Type
                        ,@sum
                        ,cte.param
                        ,'combined'
                    FROM cte
                    -- NOTE(review): this inserts one 'combined' row per source
                    -- row of the Type; a TOP (1)/DISTINCT (or the set-based
                    -- rewrite) is needed to get a single row — confirm intent.
                END
                DELETE FROM [dbo].[worktable] WHERE Type = @i AND ISNULL([note],'') <> 'combined'
            END
            SET @i = @i + 1
        END
    END TRY
    BEGIN CATCH
        -- some errorlogging code
    END CATCH
END
GO
This can be achieved with a single select statement.
If you require your combined flag to only apply to where more than one row has been combined, add another case expression checking the result of either a count(1) for rows combined or count(distinct param) for unique param values combined:
-- Fixed: table variables are declared with the @ prefix (a leading # would be
-- a temp-table name, which DECLARE ... TABLE does not accept).
declare @t as table(type int, amount int, param varchar(15), note varchar(15));
insert into @t values (7,2,'str1',NULL),(42,12,'str2',NULL),(128,7,'str3','samplenote'),(42,12,NULL,NULL),(101,4,'str4',NULL),(42,12,NULL,NULL),(7,1,'str1','samplenote'),(128,2,'str5',NULL);
-- One set-based pass: sum per type; param survives only when every row in the
-- group shares a single value (ISNULL folds NULL into a sentinel so a
-- NULL/non-NULL mix counts as more than one distinct value).
select type
      ,sum(amount) as amount
      ,case when count(distinct isnull(param,'')) = 1
            then max(param)
            else null
       end as param
      ,'combined' as note
from @t
group by type
order by type;
Output:
+------+--------+-------+----------+
| type | amount | param | note |
+------+--------+-------+----------+
| 7 | 3 | str1 | combined |
| 42 | 36 | NULL | combined |
| 101 | 4 | str4 | combined |
| 128 | 9 | NULL | combined |
+------+--------+-------+----------+
I am doing this way from keyboard, but this may work or be close to what you want
-- Derived-table version: aggregate once per type, then decide in the outer
-- SELECT whether the single distinct param value may be kept.
-- NOTE(review): COUNT(DISTINCT Param) ignores NULLs, so a group mixing one
-- non-NULL value with NULLs still reports dc = 1 and keeps that value —
-- confirm this matches the desired "unclear => NULL" rule.
Select type , amount , iif( dc=1,p,null) param, 'combined' note
from
(
-- 'From ....' is the author's placeholder for the source table.
Select type, sum(amount) amount,
count(distinct Param) dc,max(Param) p
From ....
Group by type
) x
Here is a possible solution:
-- Fixed: @ (not #) declares a table variable.
declare @tbl as table (
    type int
    ,amount int
    ,param varchar(15)
    ,note varchar(15)
)
insert into @tbl values (7,2,'str1',NULL)
insert into @tbl values (42,12,'str2',NULL)
insert into @tbl values (128,7,'str3','samplenote')
insert into @tbl values (42,12,NULL,NULL)
insert into @tbl values (101,4,'str4',NULL)
insert into @tbl values (42,12,NULL,NULL)
insert into @tbl values (7,1,'str1','samplenote')
insert into @tbl values (128,2,'str5',NULL)
-- Aggregate per type; ISNULL maps NULL to a sentinel so that groups mixing
-- NULL and non-NULL params count as having more than one distinct value.
;WITH CTE AS (
    SELECT
        type
        ,SUM(AMOUNT) AS amount
        ,COUNT(DISTINCT ISNULL(param, 'dummy value')) AS ParamNo
        ,MAX(Param) AS Param
    FROM @tbl
    GROUP BY type
) SELECT
    type
    ,amount
    ,CASE WHEN ParamNo = 1 THEN Param ELSE NULL END AS Param
    ,'combined' AS note
FROM CTE
This should work:
-- Fixed: COUNT(DISTINCT param) ignores NULLs, so a type whose params are
-- ('str2', NULL, NULL) would report 1 distinct value and wrongly keep 'str2'.
-- Normalising NULL with ISNULL makes mixed NULL/non-NULL groups collapse to
-- NULL, matching the requested output. The bare count column is also aliased.
Select Type, sum(amount) as amount, count(distinct isnull(param,'')) as param_count
, case when count(distinct isnull(param,'')) = 1 then max(param) end as param,
'Combined' as note
From
mytable
Group By Type

Subtract two columns(Col1-Col2) and add remaining to next row in Col1

I'm trying to work on a logic which I cannot seem to figure out a way to do it.
Problem:
I have three columns PrimaryKey, COl1 and COl2 as shown in below screenshot
Let's take a new column Col3 = Col1-Col2,
I am adding the remaining in Col3 to Col1 of next row and again subtract it to get Col3.
Let us consider the table above and for
PrimaryKey=1 --> Col3 = 10.2 - 5 = 5.2.
This 5.2 must be added to Col1 of PrimaryKey=2 which is
15 + 5.2 = 20.2.
Now again Col3 = 20.2 - 3 = 17.2, like this it has to iterate for next records.
I hope that I am clear enough in explaining my issue. Please let me know if you need any further explanation.
*The table provided is just a sample table, The actual table that I am working is very large.
Thank you.
As far as I can tell, you want the cumulative value of col2 subtracted from the cumulative of col1. In SQL Server 2012+, you would do:
select t.*,
       -- running remainder: cumulative col1 minus cumulative col2
       -- (fixed: operands were reversed relative to the stated intent, and
       -- "primary key" with a space is not a valid column identifier)
       sum(col1 - col2) over (order by PrimaryKey) as col3
from t;
You can use a self-join on the original table, but there's a danger if the primary key isn't complete, such as from deleting records at some point. Here I use a self-join with ROW_NUMBER to create a definitely sequential key - this allows me to join [this row number] to [this row number + 1] (sample data included in case anyone else has a better method):
-- Fixed: table variables need the @ prefix (# would denote a temp table,
-- which cannot be declared with DECLARE ... TABLE).
DECLARE @YourTable TABLE (PrimaryKey INT IDENTITY(1,1), COL1 FLOAT, COL2 FLOAT)
INSERT INTO @YourTable
( COL1, COL2 )
VALUES
(10.2, 5.0)
, (15.0, 3.0)
, (5.7, 6)
, (9.0, 5.5)
-- Rank rows by a guaranteed-sequential number so row N can be joined to row
-- N+1 even if PrimaryKey has gaps (e.g. after deletes).
; WITH TableRanked
AS (
    SELECT PrimaryKey ,
           COL1 ,
           COL2
         , COL3 = COL1 - COL2
         , RowNum = ROW_NUMBER() OVER(ORDER BY PrimaryKey)
    FROM @YourTable
)
SELECT tr.PrimaryKey ,
       tr.COL1 ,
       tr.COL2 ,
       tr.COL3
     , COL1Next = COALESCE(trNext.COL1, 0)
     , COL4 = tr.COL3 + COALESCE(trNext.COL1, 0)
FROM TableRanked tr
LEFT JOIN TableRanked trNext
    -- Here's where the magic happens:
    ON trNext.RowNum = (tr.RowNum + 1)
If you are using SQL Server 2012 you can use LEAD. This will get you the same results as what Russel posted but with better performance:
-- Fixed: table variables need the @ prefix.
DECLARE @YourTable TABLE (PrimaryKey INT IDENTITY(1,1), COL1 FLOAT, COL2 FLOAT)
INSERT INTO @YourTable (COL1, COL2 )
VALUES (10.2, 5.0), (15.0, 3.0), (5.7, 6), (9.0, 5.5);
-- LEAD(COL1,1,0) pulls the next row's COL1 (0 on the last row), so each row
-- shows its remainder added to the following row's COL1.
SELECT *, (COL1 - COL2) + LEAD(COL1,1,0) OVER (ORDER BY PrimaryKey)
FROM @YourTable;
Let me know if this query helps:
-- Syntax fixed: variables use the @ prefix, DECLARE needs a data type, the
-- WHILE body needs a BEGIN to match its END, and assignment outside a query
-- needs SET.
DECLARE @n int
SELECT @n = count(*) from #yourTable
DECLARE @counter int
SELECT @counter = MIN(PrimaryKeyCol) from #yourTable
DECLARE @col3 float = 0  -- float: Col1/Col2 hold decimal values
WHILE (@counter <= @n)
BEGIN
    SELECT @col3 = @col3 + Col1 - Col2 from #yourTable where PrimaryKeyCol = @counter
    Print @col3
    SET @counter = @counter + 1
END
-- NOTE(review): the loop assumes PrimaryKeyCol runs contiguously from 1 to
-- COUNT(*) — confirm that holds for the real table.

Query improvement in where clause

I have the following query:
-- Fixed: table variables and scalar variables take the @ prefix.
DECLARE @MyTable TABLE
(
    [ID] INT ,
    [Col1] INT ,
    [Col2] INT
)
-- UNION (not UNION ALL) collapses the duplicate (2,2,3) row to one.
INSERT INTO @MyTable
SELECT 1 , 2 , 1
UNION
SELECT 1 , 2 , 3
UNION
SELECT 2 , 2 , 3
UNION
SELECT 2 , 2 , 3
UNION
SELECT 3 , 2 , 3
UNION
SELECT 3 , 2 , 1
DECLARE @ID INT
SET @ID = 1
-- When [ID] = @ID the first CASE yields 2, so Col1 must equal 2; otherwise it
-- yields NULL and that comparison can never be true (and symmetrically for the
-- second CASE on Col2).
SELECT *
FROM @MyTable
WHERE ( Col1 = ( CASE WHEN [ID] = @ID THEN 2
                 END )
        OR [Col2] = ( CASE WHEN [ID] != @ID THEN 1
                      END )
      )
When [ID] = @ID I want to match Col1 against the constant value 2, and when [ID] != @ID I want to match Col2 against the constant value 1. Can the above query be improved so that the [ID] equality check is done only once, something like this:
SELECT *
FROM #MyTable
WHERE
if([ID] = #ID)
Col1=2
ELSE
[Col2]=1
Is this the logic that you want?
-- Fixed: variables are referenced with @. Parentheses make the AND/OR grouping
-- explicit (AND already binds tighter than OR, so behavior is unchanged).
where (id = @id and col1 = 2) or
      (id <> @id and col2 = 1)
I don't know why you are concerned about the performance of such a clause. You can do what you want with a case statement:
-- Fixed: a CASE branch must return a value, so "else col2 = 1" is invalid
-- T-SQL — the comparison has to be wrapped in its own CASE. The variable also
-- takes the @ prefix.
where 1 = (case when id = @id
                then (case when col1 = 2 then 1 end)
                else (case when col2 = 1 then 1 end)
           end)
But this is a needless "optimization". It is not even clear that the nested case statements would be any faster than the first version. Such simple operations are really, really fast on modern computers. And, what slows databases down is the processing of large volumes of data (in general), not such simple operations.
Perhaps just:
-- Fixed: variable and table-variable references take the @ prefix.
Select *
From @MyTable
Where ((id = @id and col1 = 2) or (id <> @id and col2 = 1))

Sequential numbers randomly selected and added to table

The SO Question has lead me to the following question.
If a table has 16 rows I'd like to add a field to the table with the numbers 1,2,3,4,5,...,16 arranged randomly i.e in the 'RndVal' field for row 1 this could be 2, then for row 2 it could be 5 i.e each of the 16 integers needs to appear once without repetition.
Why doesn't the following work? Ideally I'd like to see this working then to see alternative solutions.
This creates the table ok:
-- Drop any leftovers from earlier runs, then build the #B x #C cross join into
-- #A with RndVal initialised to 0 ("not yet assigned").
IF OBJECT_ID('tempdb..#A') IS NOT NULL BEGIN DROP TABLE #A END
IF OBJECT_ID('tempdb..#B') IS NOT NULL BEGIN DROP TABLE #B END
IF OBJECT_ID('tempdb..#C') IS NOT NULL BEGIN DROP TABLE #C END
IF OBJECT_ID('tempdb..#myTable') IS NOT NULL BEGIN DROP TABLE #myTable END

CREATE TABLE #B (B_ID INT)
CREATE TABLE #C (C_ID INT)

INSERT INTO #B (B_ID) VALUES (10), (20), (30), (40)
INSERT INTO #C (C_ID) VALUES (1), (2), (3), (4)

CREATE TABLE #A
(
    B_ID INT
  , C_ID INT
  , RndVal INT
)

-- 4 x 4 = 16 rows
INSERT INTO #A (B_ID, C_ID, RndVal)
SELECT b.B_ID
     , c.C_ID
     , 0
FROM #B b CROSS JOIN #C c;
Then I'm attempting to add the random column using the following. The logic is to add random numbers between 1 and 16 > then to effectively overwrite any that are duplicated with other numbers > in a loop ...
-- Copy #A into #myTable with a sequential Row number.
SELECT
    ROW_NUMBER() OVER(ORDER BY B_ID) AS Row
    , B_ID
    , C_ID
    , RndVal
INTO #myTable
FROM #A

-- Fixed: scalar variables use @ (not #) and @@ROWCOUNT is the system function.
DECLARE @rowsRequired INT = (SELECT COUNT(*) CNT FROM #myTable)
DECLARE @i INT = (SELECT @rowsRequired - SUM(CASE WHEN RndVal > 0 THEN 1 ELSE 0 END) FROM #myTable)--0
DECLARE @end INT = 1

WHILE @end > 0
BEGIN
    SELECT @i = @rowsRequired - SUM(CASE WHEN RndVal > 0 THEN 1 ELSE 0 END) FROM #myTable
    WHILE @i>0
    BEGIN
        -- Fixed: TOP (1) fills one empty slot per statement (RAND() is
        -- evaluated once per statement, so an unrestricted UPDATE would give
        -- every empty row the same value), and +1 maps FLOOR(RAND()*n) from
        -- 0..n-1 to 1..n so that 0 can remain the "unassigned" marker —
        -- without it the full range can never be produced and the loop spins.
        UPDATE TOP (1) x
        SET x.RndVal = FLOOR(RAND()*@rowsRequired) + 1
        FROM #myTable x
        WHERE x.RndVal = 0
        SET @i = @i-1
    END
    --this is to remove possible duplicates
    UPDATE c
    SET c.RndVal = 0
    FROM
        #myTable c
        INNER JOIN
        (
            SELECT RndVal
            FROM #myTable
            GROUP BY RndVal
            HAVING COUNT(RndVal)>1
        ) t
        ON c.RndVal = t.RndVal
    SET @end = @@ROWCOUNT
END

TRUNCATE TABLE #A
INSERT INTO #A
SELECT
    B_ID
    , C_ID
    , RndVal
FROM #myTable
If the original table has 6 rows then the result should end up something like this
B_ID|C_ID|RndVal
----------------
| | 5
| | 4
| | 1
| | 6
| | 3
| | 2
I don't understand your code, frankly
This will update each row with a random number, non-repeated number between 1 and the number of rows in the table
-- Assign each row a unique random number 1..COUNT(*): shuffle the key order
-- with NEWID(), number the shuffled rows, and join the numbers back on the key.
UPDATE tgt
SET SomeCol = shuffled.X
FROM
MyTable tgt
JOIN
(
    SELECT
        KeyCol, ROW_NUMBER() OVER (ORDER BY NEWID()) AS X
    FROM
        MyTable
) shuffled ON tgt.KeyCol = shuffled.KeyCol
This is more concise but can't test to see if it works as expected
-- Same idea without the self-join: T-SQL allows updating through a derived
-- table, so the shuffled ROW_NUMBER is written straight back onto its source rows.
UPDATE d
SET SomeCol = X
FROM
(
    SELECT SomeCol,
           ROW_NUMBER() OVER (ORDER BY NEWID()) AS X
    FROM MyTable
) d
When you add TOP (1) (because you need to update first RndVal=0 record) and +1 (because otherwise your zero mark means nothing) to your update, things will start to move. But extremely slowly (around 40 seconds on my rather outdated laptop). This is because, as #myTable gets filled with generated random numbers, it becomes less and less probable to get missing numbers - you usually get duplicate, and have to start again.
-- Fixed: the variable takes the @ prefix. TOP (1) fills one empty slot per
-- execution, and +1 shifts FLOOR(RAND()*n) from 0..n-1 to 1..n so that 0 can
-- stay the "empty" marker.
UPDATE top (1) x
SET x.RndVal = FLOOR(RAND()*@rowsRequired) + 1
FROM #myTable x
WHERE x.RndVal = 0
Of course, @gbn has a perfectly valid solution.
This is basically the same as the previous answer, but specific to your code:
-- Shuffle by writing a NEWID()-ordered row number back through an updatable CTE.
;WITH Shuffled As
(
    SELECT B_ID, C_ID, RndVal,
           ROW_NUMBER() OVER(ORDER BY NewID()) As SeqNo
    FROM #A
)
UPDATE Shuffled
SET RndVal = SeqNo

SELECT * FROM #A ORDER BY RndVal