Loop through table data and compare using split - sql

I have two tables and I need to compare their data and update/insert records in one of them. What I am trying to do is take each record from Table1,
use a split function and then, for each value in the split, compare the dataelement field between the two tables. We are syncing the data in Table2 so that it matches Table1.
Please let me know how this can be done. I am OK with using a cursor or MERGE. This is the scenario:
DataTable:
dataId dataelement
1 Check
2 System
3 Balances
4 City
5 State
6 Zip
7 Other
Table1:
Id reqId dataelementValues
1 52 Check
2 52 City;State;System
3 52 Other
Table2:
elId dataId dataelement reqId Active
1 6 Zip 52 1
2 1 Check 52 1
3 4 city 52 1
4 5 State 52 1
After the comparison, Table2 should look like this:
Table2:
elId dataId dataelement reqId Active
1 6 Zip 52 0 (Should be set to inactive as it exists in table2 but not in table1)
2 1 Check 52 1 (NO Updates as it exists in both the tables)
3 4 city 52 1 (NO Updates as it exists in both the tables)
4 5 State 52 1 (NO Updates as it exists in both the tables)
5 2 System 52 1 (Get the dataid for system from datatable and insert in table2 as it exists in table1 but not in table2)
6 7 Other 52 1 (Get the dataid for other from datatable and insert in table2 as it exists in table1 but not in table2)
This is where I am at; I am not sure how to set rows inactive in Table2.
WHILE Exists(Select * from #Table1)
BEGIN
    Select @currentId = Id, @dataValue = dataelementValues FROM #Table1 where rowID=(SELECT top 1 rowID from #Table1 order by rowID asc)
    SET @pos = 0
    SET @len = 0
    WHILE CHARINDEX(';', @dataValue, @pos+1)>0
    BEGIN
        SET @dataValueValue = SUBSTRING(@dataValue, @pos, CHARINDEX(';', @dataValue, @pos+1) - @pos)
        SET @glbaDEId = (Select DataTable.dataId from datatable where dataelement = @dataValue)
        IF NOT Exists (Select * from #Table2 Where DataElement=@dataValue)
        BEGIN
            --Insert into table2
        END
        SET @pos = CHARINDEX(';', @dataValue, @pos+@len) +1
    END
    DELETE from #Table1 where rowID=(SELECT top 1 rowID from #Table1 order by rowID asc )
END

You can try using a MERGE statement with a few other tricks.
-- Create a CTE that will split out the combined column and join to DataTable
-- to get the dataId
;WITH cteTable1Split AS
(
SELECT reqId, dt.* FROM
(
SELECT
[dataelement] = y.i.value('(./text())[1]', 'nvarchar(4000)'),
reqId
FROM
(
-- use xml to split column
-- http://sqlperformance.com/2012/07/t-sql-queries/split-strings
SELECT x = CONVERT(XML, '<i>'
+ REPLACE([dataelementValues], ';', '</i><i>')
+ '</i>').query('.'),
reqId
FROM Table1
) AS a CROSS APPLY x.nodes('i') AS y(i)
) a
JOIN DataTable dt ON dt.[dataelement] = a.[dataelement]
)
-- Merge Table2 with the CTE
MERGE INTO Table2 AS Target
USING cteTable1Split AS Source
ON Target.[dataelement] = Source.[dataelement]
-- If exists in Target (Table2) but not Source (CTE) then UPDATE Active flag
WHEN NOT MATCHED BY Source THEN
UPDATE SET ACTIVE = 0
-- If exists in Source (CTE) but not Target (Table2) then INSERT new record
WHEN NOT MATCHED BY TARGET THEN
INSERT ([dataId], [dataelement], [reqId], [Active])
VALUES (SOURCE.[dataId], SOURCE.[dataelement], SOURCE.[reqId], 1);
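On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function can stand in for the XML trick inside the CTE. This is a minimal sketch, not a tested drop-in: it assumes the same Table1 and DataTable as in the question, and the LTRIM/RTRIM is only a precaution against stray spaces around the delimiter.

```sql
-- Sketch only: requires SQL Server 2016+ (compatibility level 130 or higher).
;WITH cteTable1Split AS
(
    SELECT t1.reqId, dt.dataId, dt.[dataelement]
    FROM Table1 AS t1
    -- STRING_SPLIT returns one row per element in a column named [value]
    CROSS APPLY STRING_SPLIT(t1.[dataelementValues], ';') AS s
    JOIN DataTable AS dt
        ON dt.[dataelement] = LTRIM(RTRIM(s.[value]))
)
-- Inspect the split rowset; swap this SELECT for the MERGE to run the sync
SELECT reqId, dataId, [dataelement]
FROM cteTable1Split;
```

The CTE exposes the same columns (reqId, dataId, dataelement) that the MERGE reads from its source, so the MERGE statement can follow it unchanged.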

You've not mentioned whether you have control over the structure of these tables, so I'm going to go ahead and suggest you redesign Table1 to normalise the dataelementValues column.
That is, instead of this:
Table1:
Id reqId dataelementValues
1 52 Check
2 52 City;State;System
3 52 Other
You should be storing this:
Table1_New:
Id reqId dataelementValues
1 52 Check
2 52 City
2 52 State
2 52 System
3 52 Other
You may also need a new, surrogate, primary key column on the table, using an IDENTITY(1,1) specification.
Storing your data like this is how relational databases are intended to be used/designed. As well as simplifying the problem at hand right now, you might find it removes potential problems in the future as well.
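As a rough illustration of how the normalised layout simplifies the comparison, here is a sketch. The RowId column name and the column sizes are assumptions; the other names come from the question.

```sql
-- Sketch only: RowId and the column sizes are assumptions.
CREATE TABLE Table1_New
(
    RowId             int IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    Id                int          NOT NULL,
    reqId             int          NOT NULL,
    dataelementValues nvarchar(50) NOT NULL                   -- exactly one element per row
);

-- Elements requested for reqId 52 that Table2 does not contain yet
SELECT n.reqId, dt.dataId, dt.dataelement
FROM Table1_New AS n
JOIN DataTable AS dt
    ON dt.dataelement = n.dataelementValues
WHERE n.reqId = 52
  AND NOT EXISTS (SELECT 1
                  FROM Table2 AS t2
                  WHERE t2.reqId  = n.reqId
                    AND t2.dataId = dt.dataId);
```

With one element per row, no split function is needed and the comparison becomes an ordinary join.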

The main challenge here is creating a rowset with the dataelementValues correctly split into separate rows. The accepted answer shows clearly how this can be used as the source for a merge statement to achieve the update and insert operation.
However, there are alternative ways to create the split or normalised rowset.
One way, which I personally prefer for clarity over resorting to xml in this particular case, is a like join. This takes advantage of the fact that you already have the separate rows you need in DataTable, just not with all the columns you need.
Select DT.dataId, DT.dataelement, T1.reqid
From Table1 T1
Inner join DataTable DT
On T1.dataelementValues like '%' + DT.dataelement + '%'
I have not been able to test this yet, but it should give the rowset you need, because the LIKE operator lets one T1 row match every corresponding DT row (three matches for 'City;State;System').
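One caveat with a plain '%...%' pattern is that it matches substrings, so an element such as 'State' would also match a hypothetical stored value like 'Statement'. A common workaround, sketched here under the assumption that ';' is the only delimiter and that the element names contain no LIKE wildcard characters (%, _, [), is to wrap both sides in the delimiter:

```sql
-- Sketch only: wrap both sides in the delimiter so that only whole elements match.
SELECT DT.dataId, DT.dataelement, T1.reqId
FROM Table1 AS T1
INNER JOIN DataTable AS DT
    ON ';' + T1.dataelementValues + ';' LIKE '%;' + DT.dataelement + ';%';
```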

Related

SQL Merge table rows based on IN criteria

I have a table result like following
Code Counts1 Counts2 TotalCounts
1 10 20 30
4 15 18 33
5 5 14 19
... ... ... ...
What I am trying to achieve is merging counts for all rows where Code (the column counts are grouped on) belongs IN (1,4). However, within all my research, all I found was methods to merge rows based on a common value for each row (same id, etc.)
Is there a way to merge rows based on IN criteria, so I know if I should research it further?
How about a union?
select
1 as Code,
sum(Counts1) as Counts1,
sum(Counts2) as Counts2,
sum(TotalCounts) as TotalCounts
from
YourTable
where
code in (1,4)
union
select *
from
YourTable
where
code not in(1,4)
Assuming you will have numerous groupings (see the @Groupings mapping table), you can drive the groupings dynamically with a LEFT JOIN.
Example
Declare @YourTable Table ([Code] varchar(50),[Counts1] int,[Counts2] int,[TotalCounts] int)
Insert Into @YourTable Values
(1,10,20,30)
,(4,15,18,33)
,(5,5,14,19)
Declare @Groupings Table (Code varchar(50),Grp int)
Insert Into @Groupings values
(1,1)
,(4,1)
select code = Isnull(B.NewCode,A.Code)
,Counts1 = sum(Counts1)
,Counts2 = sum(Counts2)
,TotalCounts = sum(TotalCounts)
From @YourTable A
Left Join (
Select *
,NewCode = (Select Stuff((Select ',' + Code From @Groupings Where Grp=B1.Grp For XML Path ('')),1,1,'') )
From @Groupings B1
) B on (A.Code=B.Code)
Group By Isnull(B.NewCode,A.Code)
Returns
code Counts1 Counts2 TotalCounts
1,4 25 38 63
5 5 14 19
If it helps with the Visualization, the subquery generates
Code Grp NewCode
1 1 1,4
4 1 1,4
Sum the counts and remove Code from the select list. Then add a new column that puts codes 1 and 4 into the same group using a CASE expression (call it groupN, say) and GROUP BY groupN.
You are correct that grouping has to be based on a common value; creating the new column is what provides that common value.
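The CASE-based grouping described above could be sketched as follows. YourTable and its columns are taken from the question; the '1,4' label, the GroupN name and the assumption that Code holds numeric values are illustrative only.

```sql
-- Sketch only: assumes Code holds numeric values, as in the sample data.
SELECT  g.GroupN           AS Code,
        SUM(t.Counts1)     AS Counts1,
        SUM(t.Counts2)     AS Counts2,
        SUM(t.TotalCounts) AS TotalCounts
FROM YourTable AS t
-- Compute the grouping label once so it can be reused in SELECT and GROUP BY
CROSS APPLY (SELECT CASE WHEN t.Code IN (1, 4) THEN '1,4'
                         ELSE CAST(t.Code AS varchar(50))
                    END AS GroupN) AS g
GROUP BY g.GroupN;
```

Rows with Code 1 or 4 share the label '1,4' and collapse into one row, while every other code keeps its own label and therefore its own row.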

SQL Server find unique combinations

I have a table
rate_id service_id
1 1
1 2
2 1
2 3
3 1
3 2
4 1
4 2
4 3
I need to find the unique combinations of service_ids by rate_id and insert them into a table, but when a combination is repeated in another rate_id I do not want it to be inserted again.
In the above example there are 3 combinations
1,2 1,3 1,2,3
How can I query the first table to get the unique combinations?
Thanx!
Try doing something like this:
DECLARE @TempTable TABLE ([rate_id] INT, [service_id] INT)
INSERT INTO @TempTable
VALUES (1,1),(1,2),(2,1),(2,3),(3,1),(3,2),(4,1),(4,2),(4,3)
SELECT DISTINCT
--[rate_id], --include if required
(
SELECT
CAST(t2.[service_id] AS VARCHAR(10)) + ' '
FROM
@TempTable t2
WHERE
t1.[rate_id] = t2.[rate_id]
ORDER BY
t2.[service_id] -- order the ids so equal combinations produce identical strings
FOR XML PATH ('')
) AS 'Combinations'
FROM
@TempTable t1
I put the values in a table variable just for ease of testing the SELECT query.

Copy Data and increment PK in destination table

I have a temp table with data that needs to be split into 3 other tables. Each of those tables has a primary key that is not shared with each other or with the temp table. Here is a small sampling:
Table 1
RSN AGENT STATUS STATUS DECRIPTION
0 280151 51 Terminated
1 86 57 C/O Comp Agent Dele
2 94 57 C/O Comp Agent Dele
3 108 51 Terminated
Table 2
RSN AGENT CITY
1 10 Englewood
2 123 Jackson
3 35 Eatontown
4 86 Trenton
Table 3
RSN AGT SIGN NO_EMP START_DATE
0 241008 Y 1 2002-10-31 00:00:00.000
1 86 Y 0 2002-10-24 09:51:10.247
2 94 Y 0 2002-10-24 09:51:10.247
3 108 Y 0 2002-10-24 09:51:10.247
I need to check each table to see if the data in the temp table exists and if it does not I want to insert those rows with a RSN# starting with the max number in that table. So if I have 5000 records in the first table and I am adding 5000 new rows they will be numbered 5001 through 10000.
I then need to check to see if any columns have changed for matching rows and update them.
Thanks in advance for your assistance.
Scott
You have to repeat the code below for Table 1, 2 and 3, inserting the rows that do not match and updating the ones that do.
Insert new values:
Insert Into Table1(col1, col2, ...)
Select t.col1, t.col2, ...
From temp as t
Left Join table1 as t1 On t.matchcol1 = t1.matchcol1 and t.matchcol2 = t1.matchcol2
Where t1.matchcol1 is null
Replace matchcol1, matchcol2 with the list of columns that identify matching rows between temp and Table1.
Update:
Update t1 set col1 = t.col1, col2 = t.col2, ...
From table1 as t1
Inner Join temp as t On t.matchcol1 = t1.matchcol1 and t.matchcol2 = t1.matchcol2 and ...
Where t1.col1 <> t.col1 or t1.col2 <> t.col2 or ...
This may work as well:
I am not sure you really need to update something or just insert and how you link temp and table1 in order to know if it has been changed.
Insert Into Table1(RSN, AGENT, STATUS, [STATUS DECRIPTION])
Select (Select max(RSN) From table1) + Row_number() over(order by agent)
, AGENT, STATUS, [STATUS DECRIPTION]
From (
Select AGENT, STATUS, [STATUS DECRIPTION] From TempTable
Except
Select AGENT, STATUS, [STATUS DECRIPTION] From Table1
) as t1
Or you can upgrade to SQL Server 2008 and use MERGE, which would make this a lot easier.
I ended up adding 4 new columns to my staging table; a temp rsn#, which is an identity column starting with 1, and an rsn# for each of my 3 destination tables. I created a variable getting the max value from each table and then added that to my temp rsn#.

how to remove duplicate records but need to keep their child table rows and tag them to survived row

Can you please help me write the queries for both tables?
Database: SQL Server
master_table
primary name
1 a
2 a
3 a
4 b
5 b
6 c
7 c
name_city (its foreign key references the table above)
fk name_city
1 aa
2 aaa
3 aaaa
4 bb
5 bbb
6 cc
7 ccc
Expected output
I need to remove the duplicates from master_table based on name. After removing the duplicate names:
master_table
primary name
1 a
4 b
6 c
The child table rows of the removed duplicates must be kept and re-pointed to the surviving row:
name_city (its foreign key references the table above)
fk name_city
1 aa
1 aaa
1 aaaa
4 bb
4 bbb
6 cc
6 ccc
Thanks to Gordon Linoff for the reply. Let me add more detail on how I think it can be done.
First, add a row number to master_table, numbering the duplicates within each name:
primary name row_num
1 a 1
2 a 2
3 a 3
4 b 1
5 b 2
6 c 1
7 c 2
Then, in the child table, map each fk to its name in master_table (map_name), and get the new primary key from master_table with the join condition map_name = name and row_num = 1:
fk name_city map_name new_fk
1 aa a 1
2 aaa a 1
3 aaaa a 1
4 bb b 4
5 bbb b 4
6 cc c 6
7 ccc c 6
Please suggest whether this is the right way.
Thanks a lot for your time.
CREATE TABLE #master (ID INT, Name VARCHAR(50)); --DROP TABLE #master
INSERT INTO #master VALUES (1, 'a')
INSERT INTO #master VALUES (2, 'a')
INSERT INTO #master VALUES (3, 'a')
INSERT INTO #master VALUES (4, 'b')
INSERT INTO #master VALUES (5, 'b')
INSERT INTO #master VALUES (6, 'c')
INSERT INTO #master VALUES (7, 'c')
-- create temporary mapping table
;WITH cte AS
(
SELECT ID, MIN(ID) OVER (PARTITION BY Name) AS [MinID]
FROM #master
)
SELECT *
INTO #TempMapping -- DROP TABLE #TempMapping
FROM cte
WHERE cte.ID <> cte.MinID;
-- check to make sure that the IDs mapped as expected
SELECT * FROM #TempMapping;
-- change FKed values to their respective MIN mappings
UPDATE nc
SET nc.fk = tmp.MinID
FROM name_city nc
INNER JOIN #TempMapping tmp
ON tmp.ID = nc.fk;
-- remove non-MIN IDs from master now that nothing references them
DELETE mstr
FROM #master mstr
INNER JOIN #TempMapping tmp
ON tmp.ID = mstr.ID;
If there are a lot of rows in the [name_city] table or concurrency issues (i.e. blocking), then the #TempMapping table should probably be a real table (e.g. "dbo.TempMasterMappings") instead of a temp table. At that point, you can do this one ID at a time in a loop to keep the transactions smaller and quicker. Just replace the UPDATE and DELETE queries above with the following (which can even be run from a stored procedure). This method will work for any number of millions of rows (assuming that there is an index on the [fk] field, which there should be anyway).
DECLARE @BatchSize INT; -- this can be an input param for a proc
SET @BatchSize = 5000;
DECLARE @CurrentIDtoChange INT,
        @CurrentNewID INT;
BEGIN TRY
    WHILE (1 = 1)
    BEGIN
        SELECT TOP (1)
               @CurrentIDtoChange = map.ID,
               @CurrentNewID = map.MinID
        FROM dbo.TempMasterMappings map
        ORDER BY map.ID ASC;
        IF (@@ROWCOUNT = 0)
        BEGIN
            DROP TABLE dbo.TempMasterMappings; -- clean up!
            BREAK; -- exit outer loop
        END;
        WHILE (1 = 1)
        BEGIN
            UPDATE TOP (@BatchSize) nc
            SET nc.fk = @CurrentNewID
            FROM dbo.name_city nc
            WHERE nc.fk = @CurrentIDtoChange
            OPTION (OPTIMIZE FOR UNKNOWN);
            IF (@@ROWCOUNT = 0)
            BEGIN
                DELETE mstr -- clean up PK record
                FROM dbo.[Master] mstr
                WHERE mstr.ID = @CurrentIDtoChange;
                DELETE tmm -- remove ID as it is fully migrated!
                FROM dbo.TempMasterMappings tmm
                WHERE tmm.ID = @CurrentIDtoChange;
                BREAK; -- exit inner loop
            END;
        END; -- inner loop
    END; -- outer loop
END TRY
BEGIN CATCH
    DECLARE @Message NVARCHAR(4000) = ERROR_MESSAGE();
    RAISERROR(@Message, 16, 1);
END CATCH;
You need to replace all the ids in the second table with the minimum matching id in the first, if I understand correctly.
This query should return the result set you want:
select mt.minid, name_city
from (select t.*, min(id) over (partition by name) as minid
from master_table t
) mt join
table2 t2
on t2.id = mt.id;
It is unclear from the question whether you just want to get the right output, or whether you want to modify the tables. Updating the tables would basically be changing the above select to a similar update query and then deleting the extra rows from the master table.

How do you find a missing number in a table field starting from a parameter and incrementing sequentially?

Let's say I have an sql server table:
NumberTaken CompanyName
2 Fred
3 Fred
4 Fred
6 Fred
7 Fred
8 Fred
11 Fred
I need an efficient way to pass in a parameter [StartingNumber] and to count from [StartingNumber] sequentially until I find a number that is missing.
For example notice that 1, 5, 9 and 10 are missing from the table.
If I supplied the parameter [StartingNumber] = 1, it would check whether 1 exists, then whether 2 exists, and so on; since 1 is not in the table, 1 would be returned here.
If [StartingNumber] = 6, the function would return 9.
In c# pseudo code it would basically be:
int ctr = [StartingNumber]
while([SELECT NumberTaken FROM tblNumbers Where NumberTaken = ctr] != null)
ctr++;
return ctr;
The problem with that code is that it seems really inefficient if there are thousands of numbers in the table. I can write it in C# or in a stored procedure, whichever is more efficient.
Thanks for the help
Fine, if this question isn't going to be closed, I may as well Copy and paste my answer from the other one:
I called my table Blank, and used the following:
declare @StartOffset int = 2
; With Missing as (
select @StartOffset as N where not exists(select * from Blank where ID = @StartOffset)
), Sequence as (
select @StartOffset as N from Blank where ID = @StartOffset
union all
select b.ID from Blank b inner join Sequence s on b.ID = s.N + 1
)
select COALESCE((select N from Missing),(select MAX(N)+1 from Sequence))
You basically have two cases - either your starting value is missing (so the Missing CTE will contain one row), or it's present, so you count forwards using a recursive CTE (Sequence), and take the max from that and add 1
Tables:
create table Blank (
ID int not null,
Name varchar(20) not null
)
insert into Blank(ID,Name)
select 2 ,'Fred' union all
select 3 ,'Fred' union all
select 4 ,'Fred' union all
select 6 ,'Fred' union all
select 7 ,'Fred' union all
select 8 ,'Fred' union all
select 11 ,'Fred'
go
I would create a temp table containing all numbers from StartingNumber to EndNumber and LEFT JOIN it to your table to get the list of numbers that have not been taken.
If NumberTaken is indexed you could do it with a join on the same table:
select T.NumberTaken -1 as MISSING_NUMBER
from myTable T
left outer join myTable T1
on T.NumberTaken= T1.NumberTaken+1
where T1.NumberTaken is null and t.NumberTaken >= STARTING_NUMBER
order by T.NumberTaken
EDIT
Edited to get 1 too
1> select 1+ID as ID from #b as b
where not exists (select 1 from #b where ID = 1+b.ID)
2> go
ID
-----------
5
9
12
Take max(1+ID) and/or add your starting value to the where clause, depending on what you actually want.
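Adding the starting value to that idea could look like the sketch below. The table and column names (tblNumbers, NumberTaken) come from the question's pseudocode; the variable name and column aliases are assumptions. The candidates are the starting number itself plus the successor of every taken number at or after it, and the smallest candidate that is not taken is the answer.

```sql
-- Sketch only: tblNumbers / NumberTaken are taken from the question's pseudocode.
DECLARE @StartingNumber int = 6;

SELECT MIN(c.Candidate) AS FirstMissing
FROM (
        SELECT @StartingNumber AS Candidate   -- the start itself may be free
        UNION ALL
        SELECT NumberTaken + 1                -- or the slot right after a taken number
        FROM tblNumbers
        WHERE NumberTaken >= @StartingNumber
     ) AS c
WHERE NOT EXISTS (SELECT 1
                  FROM tblNumbers AS t
                  WHERE t.NumberTaken = c.Candidate);
```

With the sample data this returns 9 for a starting number of 6 and 1 for a starting number of 1. It avoids recursion, so it is not subject to the default recursion limit of 100 that applies to recursive CTEs.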