'Merge fields'-like SQL Server function

I'm trying to find a way to let the database engine (DBMS) populate merge fields within a long text.
Create the structure:
CREATE TABLE [dbo].[store]
(
[id] [int] NOT NULL,
[text] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[statement]
(
[id] [int] NOT NULL,
[store_id] [int] NOT NULL
)
CREATE TABLE [dbo].[statement_merges]
(
[statement_id] [int] NOT NULL,
[merge_field] [nvarchar](30) NOT NULL,
[user_data] [nvarchar](MAX) NOT NULL
)
Now, create test values
INSERT INTO [store] (id, text)
VALUES (1, 'Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that $$PERC_SAT$$ of the users found a solution, personally I asked $$ASKED$$ questions.')
INSERT INTO [statement] (id, store_id)
VALUES (1, 1)
INSERT INTO [statement_merges] (statement_id, merge_field, user_data)
VALUES (1, '$$PERC_SAT$$', '85%')
INSERT INTO [statement_merges] (statement_id, merge_field, user_data)
VALUES (1, '$$ASKED$$', '12')
At the moment my app delivers the final statement by looping through the merges, replacing each one in the stored text, and outputting:
Waw, stackoverflow is an amazing library of lost people in the IT
hell, and i have the feeling that 85% of the users found a solution,
personally I asked 12 questions.
I'm trying to find a way to be code-independent and serve the output in a single query: as you understood, select a statement in which the stored text has been populated with the user data. I hope I'm clear.
I looked at the TRANSLATE function, but it does character-by-character replacement (see the demo after this list), so I have two choices:
I try a recursive function, replacing the fields one by one until no merge field is left in the calculated text, but I have doubts about the performance of this approach;
or there is some magic way to do it, but I need your knowledge...
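A quick demo of why TRANSLATE is a dead end here (SQL Server 2017+): it maps single characters positionally, so it cannot substitute a whole token:
SELECT TRANSLATE('$$ASKED$$', 'AKD', 'XYZ')
-- Returns '$$XSYEZ$$': every occurrence of each character is swapped
-- independently, so the token $$ASKED$$ can never be replaced as a unit.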
Consider that I want this because the real texts are very long, and I don't want to store them more than once in my database. You can imagine a 3-page contract with only 12 parameters, like start date, invoiced amount, etc. Everything else can't be changed for compliance.
Thank you for your time!
EDIT:
Thanks to Randy's help, this looks to do the trick:
WITH cte_replace_tokens AS (
SELECT replace(r.text, m.merge_field, m.user_data) as [final], m.merge_field, s.id, 1 AS i
FROM store r
INNER JOIN statement s ON s.store_id = r.id
INNER JOIN statement_merges m ON m.statement_id = s.id
WHERE m.statement_id = 1
UNION ALL
SELECT replace(r.final, m.merge_field, m.user_data) as [final], m.merge_field, r.id, r.i + 1 AS i
FROM cte_replace_tokens r
INNER JOIN statement_merges m ON m.statement_id = r.id
WHERE m.merge_field > r.merge_field
)
select TOP 1 final from cte_replace_tokens ORDER BY i DESC
I will check with a bigger database whether the performance is good...
At least I can "populate" one statement; I still need to figure out how to extract a list as well.
Thanks again!
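One caveat with this recursive CTE: recursion stops at 100 levels by default, so a statement with more than roughly 100 merge fields would error out unless the cap is lifted on the outer query:
select TOP 1 final from cte_replace_tokens ORDER BY i DESC
OPTION (MAXRECURSION 0) -- 0 removes the default 100-level limit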

If a record is updated more than once by the same UPDATE statement, the last write wins. None of the updates are affected by the others: there is no cumulative effect. It is possible to trick SQL Server into cumulative effects using a local variable in some cases, but it's tricky and not recommended. (Order becomes important and is not reliable in an UPDATE.)
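A minimal demo of that behavior, using the question's sample tables; the UPDATE below joins to both merge rows, yet only one replacement survives:
SELECT s.id, r.[text] AS final
INTO #work
FROM store r
INNER JOIN statement s ON s.store_id = r.id

UPDATE w
SET final = REPLACE(w.final, m.merge_field, m.user_data)
FROM #work w
INNER JOIN statement_merges m ON m.statement_id = w.id

-- Both merge rows matched, but each REPLACE read the same pre-update
-- value of [final], so only one of the two tokens was replaced.
SELECT final FROM #work
DROP TABLE #work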
One alternative is recursion in a CTE: generate a new record from the prior one as each token is replaced, until no tokens remain. Here is a working example that replaces 1 with A, 2 with B, etc. (I wonder if there is some tricky XML that can do this as well.)
if not object_id('tempdb..#Raw') is null drop table #Raw
CREATE TABLE #Raw(
[test] [varchar](100) NOT NULL PRIMARY KEY CLUSTERED,
)
if not object_id('tempdb..#Token') is null drop table #Token
CREATE TABLE #Token(
[id] [int] NOT NULL PRIMARY KEY CLUSTERED,
[token] [char](1) NOT NULL,
[value] [char](1) NOT NULL,
)
insert into #Raw values('123456'), ('1122334456')
insert into #Token values(1, '1', 'A'), (2, '2', 'B'), (3, '3', 'C'), (4, '4', 'D'), (5, '5', 'E'), (6, '6', 'F');
WITH cte_replace_tokens AS (
SELECT r.test, replace(r.test, l.token, l.value) as [final], l.id
FROM #Raw r
CROSS JOIN #Token l
WHERE l.id = 1
UNION ALL
SELECT r.test, replace(r.final, l.token, l.value) as [final], l.id
FROM cte_replace_tokens r
CROSS JOIN #Token l
WHERE l.id = r.id + 1
)
select * from cte_replace_tokens where id = 6
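For the two sample rows, the last iteration (id = 6) returns:
test        final
----------  ----------
123456      ABCDEF
1122334456  AABBCCDDEF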

It's not recommended to do such tasks inside the SQL engine, but if you want to do that, you need a loop with a cursor in a function or stored procedure, like so:
DECLARE @merge_field nvarchar(30)
, @user_data nvarchar(MAX)
, @statementid INT = 1
, @text nvarchar(MAX) = 'Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that $$PERC_SAT$$ of the users found a solution, personally I asked $$ASKED$$ questions.'
DECLARE merge_statements CURSOR FAST_FORWARD
FOR SELECT
sm.merge_field
, sm.user_data
FROM dbo.statement_merges AS sm
WHERE sm.statement_id = @statementid
OPEN merge_statements
FETCH NEXT FROM merge_statements
INTO @merge_field , @user_data
WHILE @@FETCH_STATUS = 0
BEGIN
set @text = REPLACE(@text , @merge_field, @user_data )
FETCH NEXT FROM merge_statements
INTO @merge_field , @user_data
END
CLOSE merge_statements
DEALLOCATE merge_statements
SELECT @text
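A sketch of the same loop wrapped in a stored procedure, so the stored text is fetched instead of hard-coded (the procedure name is illustrative):
CREATE PROCEDURE dbo.usp_PopulateStatement @statementid int
AS
BEGIN
    DECLARE @merge_field nvarchar(30), @user_data nvarchar(MAX), @text nvarchar(MAX)

    -- Fetch the stored text for the requested statement
    SELECT @text = r.[text]
    FROM dbo.statement s
    INNER JOIN dbo.store r ON r.id = s.store_id
    WHERE s.id = @statementid

    DECLARE merge_statements CURSOR FAST_FORWARD
    FOR SELECT sm.merge_field, sm.user_data
        FROM dbo.statement_merges AS sm
        WHERE sm.statement_id = @statementid

    OPEN merge_statements
    FETCH NEXT FROM merge_statements INTO @merge_field, @user_data
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @text = REPLACE(@text, @merge_field, @user_data)
        FETCH NEXT FROM merge_statements INTO @merge_field, @user_data
    END
    CLOSE merge_statements
    DEALLOCATE merge_statements

    SELECT @text AS final
END
Usage: EXEC dbo.usp_PopulateStatement @statementid = 1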

Here is a recursive solution.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE [dbo].[store]
(
[id] [int] NOT NULL,
[text] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[statement]
(
[id] [int] NOT NULL,
[store_id] [int] NOT NULL
)
CREATE TABLE [dbo].[statement_merges]
(
[statement_id] [int] NOT NULL,
[merge_field] [nvarchar](30) NOT NULL,
[user_data] [nvarchar](MAX) NOT NULL
)
INSERT INTO store (id, text)
VALUES (1, '$$(*)$$, stackoverflow...$$PERC_SAT$$...$$ASKED$$ questions.')
INSERT INTO store (id, text)
VALUES (2, 'Use The #_#')
INSERT INTO statement (id, store_id) VALUES (1, 1)
INSERT INTO statement (id, store_id) VALUES (2, 2)
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (1, '$$PERC_SAT$$', '85%')
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (1, '$$ASKED$$', '12')
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (1, '$$(*)$$', 'Wow')
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (2, ' #_#', 'Flux!')
Query 1:
;WITH Normalized AS
(
SELECT
store_id=store.id,
store.text,
sm.merge_field,
sm.user_data,
RowNumber = ROW_NUMBER() OVER(PARTITION BY store.id,sm.statement_id ORDER BY merge_field),
statement_id = st.id
FROM
store store
INNER JOIN statement st ON st.store_id = store.id
INNER JOIN statement_merges sm ON sm.statement_id = st.id
)
, Recurse AS
(
SELECT
store_id, statement_id, old_text = text, merge_field,user_data, RowNumber,
Iteration=1,
new_text = REPLACE(text, merge_field, user_data)
FROM
Normalized
WHERE
RowNumber=1
UNION ALL
SELECT
n.store_id, n.statement_id, r.old_text, n.merge_field, n.user_data,
RowNumber=r.RowNumber+1,
Iteration=Iteration+1,
new_text = REPLACE(r.new_text, n.merge_field, n.user_data)
FROM
Normalized n
INNER JOIN Recurse r ON r.RowNumber = n.RowNumber AND r.statement_id = n.statement_id
)
,ReverseOnIteration AS
(
SELECT *,
ReverseIteration = ROW_NUMBER() OVER(PARTITION BY statement_id ORDER BY Iteration DESC)
FROM
Recurse
)
SELECT
store_id, statement_id, new_text, old_text
FROM
ReverseOnIteration
WHERE
ReverseIteration=1
Results:
| store_id | statement_id | new_text | old_text |
|----------|--------------|------------------------------------------|--------------------------------------------------------------|
| 1 | 1 | Wow, stackoverflow...85%...12 questions. | $$(*)$$, stackoverflow...$$PERC_SAT$$...$$ASKED$$ questions. |
| 2 | 2 | Use TheFlux! | Use The #_# |

With the help of Randy, I think I've achieved what I wanted to do!
Given that my real case is a contract, in which each statement may be:
free text
stored text without any merges
stored text with one or several merges
this CTE does the job!
WITH cte_replace_tokens AS (
-- The anchor query left-joins store and skips merges, because the statement can be free text
SELECT COALESCE(r.text, s.part_text) AS [final], CAST('' AS NVARCHAR(30)) AS merge_field, s.id, 1 AS i, s.contract_id
FROM statement s
LEFT JOIN store r ON s.store_id = r.id
UNION ALL
-- We loop till the last merge field; the output carries the iteration number so we can keep the last record (all fields updated)
SELECT replace(r.final, m.merge_field, m.user_data) as [final], m.merge_field, r.id, r.i + 1 AS i, r.contract_id
FROM cte_replace_tokens r
INNER JOIN statement_merges m ON m.statement_id = r.id
WHERE m.merge_field > r.merge_field AND r.final LIKE '%' + m.merge_field + '%'
-- avoid lost replacements by forcing only one merge_field per iteration
AND NOT EXISTS( SELECT mm.statement_id FROM statement_merges mm WHERE mm.statement_id = m.statement_id AND mm.merge_field > r.merge_field AND mm.merge_field < m.merge_field)
)
select s.id,
(select top 1 final from cte_replace_tokens t WHERE t.contract_id = s.contract_id AND t.id = s.id ORDER BY i DESC) as res
FROM statement s
where contract_id = 1
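One addition worth considering: the recursive member seeks statement_merges by statement_id and compares merge_field ranges on every iteration, so an index covering both columns should keep each loop cheap (the posted DDL declares no keys; the index name is illustrative):
CREATE CLUSTERED INDEX IX_statement_merges
ON statement_merges (statement_id, merge_field)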

If the CTE solution with a cross join is too slow, an alternative would be to dynamically build a scalar function that contains every REPLACE required by the token table. One scalar function call per record is then O(N). I get the same result as before.
The function is simple and likely not too long, depending upon how big the token table becomes (remember the 256 MB batch limit). I've seen attempts to dynamically create queries to improve performance backfire by moving the problem to compile time, but that should not be a problem here.
if not object_id('tempdb..#Raw') is null drop table #Raw
CREATE TABLE #Raw(
[test] [varchar](100) NOT NULL PRIMARY KEY CLUSTERED,
)
if not object_id('tempdb..#Token') is null drop table #Token
CREATE TABLE #Token(
[id] [int] NOT NULL PRIMARY KEY CLUSTERED,
[token] [char](1) NOT NULL,
[value] [char](1) NOT NULL,
)
insert into #Raw values('123456'), ('1122334456')
insert into #Token values(1, '1', 'A'), (2, '2', 'B'), (3, '3', 'C'), (4, '4', 'D'), (5, '5', 'E'), (6, '6', 'F');
DECLARE @sql varchar(max) = 'CREATE FUNCTION dbo.fn_ReplaceTokens(@raw varchar(8000)) RETURNS varchar(8000) AS BEGIN RETURN ';
WITH cte_replace_statement AS (
SELECT a.id, CAST('replace(@raw,''' + a.token + ''',''' + a.value + ''')' as varchar(max)) as [statement]
FROM #Token a
WHERE a.id = 1
UNION ALL
SELECT n.id, CAST(replace(l.[statement], '@raw', 'replace(@raw,''' + n.token + ''',''' + n.value + ''')') as varchar(max)) as [statement]
FROM #Token n
INNER JOIN cte_replace_statement l
ON n.id = l.id + 1
)
select @sql += [statement] + ' END' from cte_replace_statement where id = 6
print @sql
if not object_id('dbo.fn_ReplaceTokens') is null drop function dbo.fn_ReplaceTokens
execute (@sql)
SELECT r.test, dbo.fn_ReplaceTokens(r.test) as [final] FROM #Raw r
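For the six tokens above, the PRINT shows the generated definition as one nested REPLACE chain (wrapped here for readability):
-- CREATE FUNCTION dbo.fn_ReplaceTokens(@raw varchar(8000)) RETURNS varchar(8000) AS
-- BEGIN RETURN replace(replace(replace(replace(replace(replace(
--     @raw,'6','F'),'5','E'),'4','D'),'3','C'),'2','B'),'1','A') END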

Related

Want to compare 4 different columns with the result of CTE

I have created a CTE (common table expression) as follows:
DECLARE @N VARCHAR(100)
WITH CAT_NAM AS (
SELECT ID, NAME
FROM TABLE1
WHERE YEAR(DATE) = YEAR(GETDATE())
)
SELECT @N = STUFF((
SELECT ','''+ NAME+''''
FROM CAT_NAM
WHERE ID IN (20,23,25,30,37)
FOR XML PATH ('')
),1,1,'')
The result of the above CTE is 'A','B','C','D','F'.
Now I need to check four different columns, CAT_NAM_1, CAT_NAM_2, CAT_NAM_3, CAT_NAM_4, against the result of the CTE and combine them into one column, like follows:
Select
case when CAT_NAM_1 in (@N) then CAT_NAM_1
when CAT_NAM_2 in (@N) then CAT_NAM_2
when CAT_NAM_3 in (@N) then CAT_NAM_3
when CAT_NAM_4 in (@N) then CAT_NAM_4
end as CAT
from table2
When I try to do the above I get an error; please help me make it work.
If my approach is wrong, help me with the right one.
I am not exactly sure what you are trying to do, but if I understand correctly, the following script shows one possible technique. I have created some table variables to mimic the data you presented and then wrote a SELECT statement to do what I think you asked (but I am not sure).
DECLARE @TABLE1 AS TABLE (
ID INT NOT NULL,
[NAME] VARCHAR(10) NOT NULL,
[DATE] DATE NOT NULL
);
INSERT INTO @TABLE1(ID,[NAME],[DATE])
VALUES (20, 'A', '2021-01-01'), (23, 'B', '2021-02-01'),
(25, 'C', '2021-03-01'),(30, 'D', '2021-04-01'),
(37, 'E', '2021-05-01'),(40, 'F', '2021-06-01');
DECLARE @TABLE2 AS TABLE (
ID INT NOT NULL,
CAT_NAM_1 VARCHAR(10) NULL,
CAT_NAM_2 VARCHAR(10) NULL,
CAT_NAM_3 VARCHAR(10) NULL,
CAT_NAM_4 VARCHAR(10) NULL
);
INSERT INTO @TABLE2(ID,CAT_NAM_1,CAT_NAM_2,CAT_NAM_3,CAT_NAM_4)
VALUES (1,'A',NULL,NULL,NULL),(2,NULL,'B',NULL,NULL);
;WITH CAT_NAM AS (
SELECT ID, [NAME]
FROM @TABLE1
WHERE YEAR([DATE]) = YEAR(GETDATE())
AND ID IN (20,23,25,30,37,40)
)
SELECT CASE
WHEN EXISTS(SELECT 1 FROM CAT_NAM WHERE CAT_NAM.[NAME] = CAT_NAM_1) THEN CAT_NAM_1
WHEN EXISTS(SELECT 1 FROM CAT_NAM WHERE CAT_NAM.[NAME] = CAT_NAM_2) THEN CAT_NAM_2
WHEN EXISTS(SELECT 1 FROM CAT_NAM WHERE CAT_NAM.[NAME] = CAT_NAM_3) THEN CAT_NAM_3
WHEN EXISTS(SELECT 1 FROM CAT_NAM WHERE CAT_NAM.[NAME] = CAT_NAM_4) THEN CAT_NAM_4
ELSE '?' -- not sure what you want if there is no match
END AS CAT
FROM @TABLE2;
You can do a bit of set-based logic for this:
SELECT
ct.NAME
FROM table2 t2
CROSS APPLY (
SELECT v.NAME
FROM (VALUES
(t2.CAT_NAM_1),
(t2.CAT_NAM_2),
(t2.CAT_NAM_3),
(t2.CAT_NAM_4)
) v(NAME)
INTERSECT
SELECT cn.NAME
FROM CAT_NAM cn
WHERE cn.ID IN (20,23,25,30,37)
) ct;
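To run this end to end, inline the CTE from the question in the same statement; a sketch against the question's table names:
WITH CAT_NAM AS (
    SELECT ID, NAME
    FROM TABLE1
    WHERE YEAR([DATE]) = YEAR(GETDATE())
)
SELECT ct.NAME
FROM table2 t2
CROSS APPLY (
    SELECT v.NAME
    FROM (VALUES
        (t2.CAT_NAM_1),
        (t2.CAT_NAM_2),
        (t2.CAT_NAM_3),
        (t2.CAT_NAM_4)
    ) v(NAME)
    INTERSECT
    SELECT cn.NAME
    FROM CAT_NAM cn
    WHERE cn.ID IN (20,23,25,30,37)
) ct;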

Get parents based on child id SQL

I have the following scenario in a Microsoft SQL environment:
CREATE TABLE grps
(
[id] varchar(50),
[parentid] varchar(50),
[value] varchar(50)
);
INSERT INTO grps
([id], [parentid], [value])
VALUES
('-5001', '0', null),
('-5002', '-5001', null),
('-5003', '-5002', '50'),
('-5004', '-5003', null),
('-5005', '0', null),
('-5006', '0', null),
('-5007', '0', null),
('-5008', '-5006', null);
I'm trying to get parents based on the id of a child. If the id queried is the last parent then it should only return the last item.
Examples:
If I query: id = '-5004' it should return ('-5004', '-5003', null),
('-5003', '-5002', '50'),
('-5002', '-5001', null),
('-5001', '0', null)
If I query id = '-5007' it should return ('-5007', '0', null)
It would be awesome if it could list the id queried first and the rest in an orderly fashion up the "tree".
I've tried several different approaches with CTE's but with no luck unfortunately. So I'm looking for some help or ideas here.
Thanks in advance.
You were on the right track with CTEs. It can be done by using a recursive CTE! Here is what the recursive CTE looks like:
DECLARE @ID varchar(50) = '-5004';
WITH CTE AS
(
--This anchor part is called once, to get the row for the requested id
SELECT id, parentid, value
FROM grps
WHERE id = @ID
UNION ALL
--This recursive part is called repeatedly, climbing to the parent until none is found
SELECT g.id, g.parentid, g.value
FROM CTE c, grps g
WHERE g.id= c.parentid
--If you don't like commas between tables then you can replace the 2nd select
--statement with this:
--SELECT g.id, g.parentid, g.value
--FROM CTE c
--INNER JOIN grps g ON g.id= c.parentid
--This can also be written with CROSS JOINS!
--Even though it looks more like another way of writing INNER JOINs.
--SELECT g.id, g.parentid, g.value
--FROM CTE c
--CROSS JOIN grps g
--WHERE g.id = c.parentid
)
SELECT * FROM CTE
Beware that the maximum recursion is 100 unless you add option (maxrecursion 0) to the end of the last select statement. The 0 means infinite but you can also set it to any value you want.
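That is, the final SELECT of the query above becomes:
SELECT * FROM CTE
OPTION (MAXRECURSION 0)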
Enjoy!
I'm trying my best to give hierarchyid some love in the world. First, the setup:
CREATE TABLE grps
(
[id] varchar(50),
[parentid] varchar(50),
[value] varchar(50),
h HIERARCHYID NULL
);
INSERT INTO grps
([id], [parentid], [value])
VALUES
('-5001', '0', null),
('-5002', '-5001', null),
('-5003', '-5002', '50'),
('-5004', '-5003', null),
('-5005', '0', null),
('-5006', '0', null),
('-5007', '0', null),
('-5008', '-5006', null);
WITH cte AS (
SELECT id ,
parentid ,
value ,
CAST('/' + id + '/' AS nvarchar(max)) AS h
FROM grps
WHERE parentid = '0'
UNION ALL
SELECT child.id ,
child.parentid ,
child.value ,
CAST(parent.h + child.id + '/' AS NVARCHAR(MAX)) AS h
FROM cte AS [parent]
JOIN grps AS [child]
ON child.parentid = parent.id
)
UPDATE g
SET h = c.h
FROM grps AS g
JOIN cte AS c
ON c.id = g.id
All I'm doing here is adding a hierarchyid column to your table definition and calculating the value for it. To answer your original problem, the query now looks something like this:
SELECT g.id ,
g.parentid ,
g.value ,
g.h.ToString()
FROM dbo.grps AS g
JOIN grps AS c
ON c.h.IsDescendantOf(g.h) = 1
WHERE c.id = '-5004'
To make this more performant, you should index both the id and h columns independently (that is, in separate indexes).
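A sketch of those two indexes (the names are illustrative):
CREATE INDEX IX_grps_id ON dbo.grps (id);
CREATE INDEX IX_grps_h ON dbo.grps (h);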
Also, a couple of notes
Having the id columns be varchar when the data looks numeric is fishy at best, but more importantly it's inefficient. If it were me, I'd use an int. But perhaps your actual data is messier (e.g. you have ids like 'A1234').
I'd also use NULL instead of 0 for the parentid to represent top-level (i.e. those with no parent) members. But that's more of a personal choice rather than one that has any real performance implications.

SQL return only distinct IDs from LEFT JOIN

I've inherited some fun SQL and am trying to figure out how to eliminate rows with duplicate IDs. Our indexes are stored in a somewhat columnar format and then we pivot all the rows into one, with the values as different columns.
The below sample returns three rows of unique data, but the IDs are duplicated. I need just two rows with unique IDs (and the other columns that go along with them). I know I'll be losing some data, but I just need one matching row per ID for the query (first, top, oldest, newest, whatever).
I've tried using DISTINCT, GROUP BY, and ROW_NUMBER, but I keep getting the syntax wrong, or using them in the wrong place.
I'm also open to rewriting the query completely in a way that is reusable as I currently have to generate this on the fly (cardtypes and cardindexes are user defined) and would love to be able to create a stored procedure. Thanks in advance!
declare #cardtypes table ([ID] int, [Name] nvarchar(50))
declare #cards table ([ID] int, [CardTypeID] int, [Name] nvarchar(50))
declare #cardindexes table ([ID] int, [CardID] int, [IndexType] int, [StringVal] nvarchar(255), [DateVal] datetime)
INSERT INTO #cardtypes VALUES (1, 'Funny Cards')
INSERT INTO #cardtypes VALUES (2, 'Sad Cards')
INSERT INTO #cards VALUES (1, 1, 'Bunnies')
INSERT INTO #cards VALUES (2, 1, 'Dogs')
INSERT INTO #cards VALUES (3, 1, 'Cat')
INSERT INTO #cards VALUES (4, 1, 'Cat2')
INSERT INTO #cardindexes VALUES (1, 1, 1, 'Bunnies', null)
INSERT INTO #cardindexes VALUES (2, 1, 1, 'playing', null)
INSERT INTO #cardindexes VALUES (3, 1, 2, null, '2014-09-21')
INSERT INTO #cardindexes VALUES (4, 2, 1, 'Dogs', null)
INSERT INTO #cardindexes VALUES (5, 2, 1, 'playing', null)
INSERT INTO #cardindexes VALUES (6, 2, 1, 'poker', null)
INSERT INTO #cardindexes VALUES (7, 2, 2, null, '2014-09-22')
SELECT TOP(100)
[ID] = c.[ID],
[Name] = c.[Name],
[Keyword] = [colKeyword].[StringVal],
[DateAdded] = [colDateAdded].[DateVal]
FROM #cards AS c
LEFT JOIN #cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN #cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
ORDER BY [DateAdded]
Edit:
While both solutions are valid, I ended up using the MAX() solution from @popovitsj as it was easier to implement. The issue of data coming from multiple rows doesn't really factor in for me as all rows are essentially part of the same record. I will most likely use both solutions depending on my needs.
Here's my updated query (as it didn't quite match the answer):
SELECT TOP(100)
[ID] = c.[ID],
[Name] = MAX(c.[Name]),
[Keyword] = MAX([colKeyword].[StringVal]),
[DateAdded] = MAX([colDateAdded].[DateVal])
FROM #cards AS c
LEFT JOIN #cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN #cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
GROUP BY c.ID
ORDER BY [DateAdded]
You could use MAX or MIN to 'decide' on what to display for the other columns in the rows that are duplicate.
SELECT ID, MAX(Name), MAX(Keyword), MAX(DateAdded)
(...)
GROUP BY ID;
Using the ROW_NUMBER windowed function along with a CTE will do this pretty well. For example:
;With preResult AS (
SELECT TOP(100)
[ID] = c.[ID],
[Name] = c.[Name],
[Keyword] = [colKeyword].[StringVal],
[DateAdded] = [colDateAdded].[DateVal],
ROW_NUMBER()OVER(PARTITION BY c.ID ORDER BY [colDateAdded].[DateVal]) rn
FROM #cards AS c
LEFT JOIN #cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN #cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
ORDER BY [DateAdded]
)
SELECT * from preResult WHERE rn = 1

SQL Server: Insert Multiple Rows to a table based on a column in a different table

I have a table
CREATE TABLE [StudentsByKindergarten]
(
[FK_KindergartenId] [int] IDENTITY(1,1) NOT NULL,
[StudentList] [nvarchar](max)
)
where the entries are
(1, "John, Alex, Sarah")
(2, "")
(3, "Jonny")
(4, "John, Alex")
I want to migrate this information to the following table.
CREATE TABLE [KindergartenStudents]
(
[FK_KindergartenId] [int] NOT NULL,
[StudentName] [nvarchar](max) NOT NULL
)
so that it will have
(1, "John")
(1, "Alex")
(1, "Sarah")
(3, "Jonny")
(4, "John")
(4, "Alex")
I think I can achieve the splitting using something like the answer here: How do I split a string so I can access item x?
Using the function here:
http://www.codeproject.com/Articles/7938/SQL-User-Defined-Function-to-Parse-a-Delimited-Str
I can do something like this:
INSERT INTO [KindergartenStudents] ([FK_KindergartenId], [Studentname])
SELECT
sbk.FK_KindergartenId,
parsed.txt_value
FROM
[StudentsByKindergarten] sbk, dbo.fn_ParseText2Table(sbk.StudentList,',') parsed
GO
but it doesn't seem to work.
Based on this question, I've learned a better approach for this problem. You just need to use CROSS APPLY with your suggested function fn_ParseText2Table.
Sample Fiddle
INSERT INTO KindergartenStudents
(FK_KindergartenId, StudentName)
SELECT
sbk.FK_KindergartenId,
parsed.txt_value
FROM
StudentsByKindergarten sbk
CROSS APPLY
fn_ParseText2Table(sbk.StudentList, ',') parsed
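If you are on SQL Server 2016 or later, the built-in STRING_SPLIT can stand in for the custom function; a sketch (note that STRING_SPLIT returns a single [value] column and guarantees no ordering, and LTRIM handles the space after each comma):
INSERT INTO KindergartenStudents (FK_KindergartenId, StudentName)
SELECT
    sbk.FK_KindergartenId,
    LTRIM(s.value)
FROM StudentsByKindergarten sbk
CROSS APPLY STRING_SPLIT(sbk.StudentList, ',') s
WHERE LTRIM(s.value) <> ''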
I've used the function that you suggested (fn_ParseText2Table) and the following T-SQL is working. You can test it with this fiddle: link.
BEGIN
DECLARE
@ID int,
@iterations int
-- Iterate over the number of non-empty rows
SET @iterations =
(SELECT
COUNT(*)
FROM
StudentsByKindergarten
WHERE
DATALENGTH(StudentList) > 0
)
WHILE ( @iterations > 0 )
BEGIN
-- Select the ID of the row where row_number() = @iterations
SET @ID =
(SELECT
FK_KindergartenId
FROM
(SELECT
*,
ROW_NUMBER() OVER (ORDER BY FK_KindergartenId DESC) as rn
FROM
StudentsByKindergarten
WHERE
DATALENGTH(StudentList) > 0) rows
WHERE
rows.rn = @iterations
)
SET @iterations -= 1
-- Insert the parsed values
INSERT INTO KindergartenStudents
(FK_KindergartenId, StudentName)
SELECT
@ID,
parsed.txt_value
FROM
fn_ParseText2Table
(
(SELECT
StudentList
FROM
StudentsByKindergarten
WHERE
FK_KindergartenId = @ID),
',') parsed
END
END

how to insert multiple rows with check for duplicate rows in a short way

I am trying to insert multiple records (~250) in a table (say MyTable) and would like to insert a new row only if it does not exist already.
I am using SQL Server 2008 R2 and got help from other threads like SQL conditional insert if row doesn't already exist.
While I am able to achieve that with the following stripped-down script, I would like to know if there is a better (shorter) way to do this, as I have to repeat this check for every row inserted. Since we need to execute this script only once, during DB deployment, I am not too worried about performance.
INSERT INTO MyTable([Description], [CreatedDate], [CreatedBy], [ModifiedDate], [ModifiedBy], [IsActive], [IsDeleted])
SELECT N'ababab', GETDATE(), 1, NULL, NULL, 1, 0
WHERE NOT EXISTS(SELECT * FROM MyTable WITH (ROWLOCK, HOLDLOCK, UPDLOCK)
WHERE
([InstanceId] IS NULL OR [InstanceId] = 1)
AND [ChannelPartnerId] IS NULL
AND [CreatedBy] = 1)
UNION ALL
SELECT N'xyz', GETDATE(), 1, NULL, NULL, 1, 0
WHERE NOT EXISTS(SELECT * FROM MyTable WITH (ROWLOCK, HOLDLOCK, UPDLOCK)
WHERE
([InstanceId] IS NULL OR [InstanceId] = 1)
AND [ChannelPartnerId] IS NULL
AND [CreatedBy] = 1)
-- More SELECT statements goes here
You could create a temporary table with your descriptions, then insert them all into MyTable with a SELECT that checks for rows in the temporary table that are not yet present in your destination (this trick is implemented by the LEFT OUTER JOIN in conjunction with the IS NULL check on MyTable.Description in the WHERE clause):
DECLARE @Descriptions TABLE ([Description] VARCHAR(200) NOT NULL )
INSERT INTO @Descriptions ( Description ) VALUES ( 'ababab' )
INSERT INTO @Descriptions ( Description ) VALUES ( 'xyz' )
INSERT INTO dbo.MyTable
( Description ,
CreatedDate ,
CreatedBy ,
ModifiedDate ,
ModifiedBy ,
IsActive ,
IsDeleted
)
SELECT d.Description, GETDATE(), 1, NULL, NULL, 1, 0
FROM @Descriptions d
LEFT OUTER JOIN dbo.MyTable mt ON d.Description = mt.Description
WHERE mt.Description IS NULL
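The same guard can also be written with NOT EXISTS, if you find that clearer than the LEFT JOIN / IS NULL idiom:
INSERT INTO dbo.MyTable
( Description, CreatedDate, CreatedBy, ModifiedDate, ModifiedBy, IsActive, IsDeleted )
SELECT d.Description, GETDATE(), 1, NULL, NULL, 1, 0
FROM @Descriptions d
WHERE NOT EXISTS (SELECT 1 FROM dbo.MyTable mt WHERE mt.Description = d.Description)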