How to deal with Violation of PRIMARY KEY constraint error

How to deal with Violation of PRIMARY KEY constraint error - sql

I am having a real problem with my stored procedure with this script:
INSERT INTO #tr_TxnDetails
SELECT
b.pid,
b.etc
FROM tbl_SomeTableA as a
JOIN tbl_SomeTableB as b ON a.etc = b.etc
AND a.SomeColumn = b.SomeColumn
-- This is throwing error: Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object 'dbo.tr_TxnDetails'.
INSERT INTO tr_TxnDetails
([id], [etc])
SELECT a.[id],
a.[etc]
FROM #tr_TxnDetails as a
WHERE not exists (select 1 from tr_TxnDetails as b where a.[id] = b.[id]);
How do I make sure the during the INSERT INTO statement to tr_TxnDetails it is not inserting a row with the same primary key: pid ?

INSERT INTO #tr_TxnDetails
SELECT
b.pid,
b.etc
FROM tbl_SomeTableA as a
JOIN tbl_SomeTableB as b ON a.etc = b.etc
AND a.SomeColumn = b.SomeColumn
WHERE b.pid NOT IN (select distinct id from tr_TxnDetails) --<<--
INSERT INTO tr_TxnDetails
([id], [etc])
SELECT a.[id],
a.[etc]
FROM #tr_TxnDetails as a

I think that your first INSERT ... SELECT statement is producing duplicates and then these duplicates are causing primary key errors in your second select. Your WHERE EXISTS clause only guards against inserting a duplicate that is a duplicate of an existing row.
I will come to your query later, but just to show you can cause this error quite simply with the following set of statements:
create table TableA
(
Pid INT PRIMARY KEY,
etc INT
);
INSERT INTO TableA
SELECT 1, 0
UNION
SELECT 1, 2
and here is the error:
Violation of PRIMARY KEY constraint 'PK__TableA__C57059387F60ED59'. Cannot insert duplicate key in object 'dbo.TableA'.: INSERT INTO TableA SELECT 1, 0 UNION SELECT 1, 2
Now back to your query, the simple re-write is to ensure that the query only returns DISTINCT rows:
INSERT INTO #tr_TxnDetails
SELECT DISTINCT
b.pid,
b.etc
FROM tbl_SomeTableA as a
JOIN tbl_SomeTableB as b ON a.etc = b.etc
AND a.SomeColumn = b.SomeColumn
INSERT INTO tr_TxnDetails
([id], [etc])
SELECT a.[id],
a.[etc]
FROM #tr_TxnDetails as a
WHERE not exists (select 1 from tr_TxnDetails as b where a.[id] = b.[id]);
This should do the trick for you.
One further point is that in your example you should do away with the temporary table step unless there is a good reason for it, such as some other processing between those two statements. Here is the rewritten query:
INSERT INTO tr_TxnDetails
SELECT DISTINCT
b.pid,
b.etc
FROM tbl_SomeTableA as a
JOIN tbl_SomeTableB as b ON a.etc = b.etc
AND a.SomeColumn = b.SomeColumn
WHERE not exists (
select 1
from tr_TxnDetails as c
where a.[id] = C.[id]
);

DECLARE #ChoiceID INT
SET #ChoiceID = (SELECT MAX([CHOICE_ID]) FROM BI_QUESTION_CHOICE) -- FOR SOMETABLE.ID
INSERT BI_QUESTION_CHOICE
(
[choice_id],
[choice_descr],
[sequence],
[question_id],
[is_correct],
[created_by],
[created_dt],
[modified_by],
[modified_dt]
)
(SELECT #ChoiceID+ROW_NUMBER() OVER (ORDER BY #ChoiceID),
pref.value('(ChoiceText/text())[1]', 'varchar(50)'),
pref.value('(Sequence/text())[1]', 'varchar(50)') ,
#QuestionID,
pref.value('(IsCorrect/text())[1]', 'bit'),
'mbathini',
GETDATE(),
'mbathini',
GETDATE()
FROM #xmlstring.nodes('/ArrayOfBI_QA_ChoiceEntity/BI_QA_ChoiceEntity') AS Responses(pref))

Related

Looping over the insert data into multiple tables

I have a query where I need to run to do manual inserts
I can do it but there are many records and was looking if I can build something.
I have a structure somewhat like this:
Have 4 id of a table - primary key values as:
var ids = "1,2,3,4";
loop over ids {
insert into table1(col1,col2,col3)
select col1,newid(),getdate() from table1 where id = ids - 1 at a time
var selectedID = get the id of the inserted row and then insert into anotehr table as:
insert into table2(col1,col2,col3,col4)
select selectedID, getdate(),getdate(),4 from table2 where fkID = ids - one at a time
}

You can use both loops and cursors but often they can be avoided.
Is there a specific reason you note you want them inserted one at a time? An alternative would be to have the IDs staged, in a temp table, or CTE, e.g.
;WITH [Ids] AS
(
SELECT '1' AS [ID]
UNION
SELECT '2'
UNION
SELECT '3'
UNION
SELECT '4'
)
INSERT INTO [Table1]
(
[Col1],
[Col2],
[Col3]
)
SELECT [Col1],
NEWID(),
GETDATE()
FROM [Table1] T
INNER JOIN [Ids] I ON I.[ID] = T.[Id];
Which avoids the need for any loops, and should perform much better.
Edit
The way I would structure this, to make the query reusable would be as follows:
IF OBJECT_ID('tempdb..#IDS') IS NOT NULL
BEGIN
DROP TABLE #IDS
END
IF OBJECT_ID('tempdb..#Inserted_IDS') IS NOT NULL
BEGIN
DROP TABLE #Inserted_IDS
END
CREATE TABLE #IDS
(
ID INT
);
CREATE TABLE #Inserted_IDS
(
ID INT,
);
INSERT INTO #IDS
(
ID
)
SELECT 1 UNION
SELECT 2 UNION
SELECT 3 UNION
SELECT 4;
INSERT INTO [Table1]
(
[Col1],
[Col2],
[Col3]
)
OUTPUT Inserted.ID
INTO #Inserted_IDS
SELECT [Col1],
NEWID(),
GETDATE()
FROM [Table1] T
INNER JOIN #IDS I ON I.[ID] = T.[Id];
INSERT INTO [table2]
(
[col1],
[col2],
[col3],
[col4]
)
SELECT I.[ID],
getdate(),
getdate(),
4
FROM [#Inserted_IDS] I
DROP TABLE #IDS;
DROP TABLE #Inserted_IDS;
Therefore you only need to amend the IDs being entered into the temp table each time you need to do the inserts.

Insert to table based on another table if doesn’t exist

From doing a bit of research it seems it’s not possible to do an insert with a where clause?
I have a table I want to import into another table where specific record criteria doesn’t already exist. How would I go about this ?
E.g pseudo code -
insert into table b (select * from table a) where not exists tableb.column1 = tablea.column1 and tableb.column2 = tablea.column2
Could this potentially be done with a Join ?

You can insert using INSERT INTO.. SELECT with NOT EXISTS.
Query
insert into [TableB]
select * from [TableA] as [t1] -- [TableA] and [TableB] structure should be same.
where not exists(
select 1 from [TableB] as [t2]
where [t1].[column1] = [t2].[column1]
and [t1].[column2] = [t2].[column2]
);
Or, if the table structure is not same and need to same few columns, then
Query
insert into [TableB]([column1], [column2], ..., [columnN])
select [column1], [column2], ..., [columnN]
from [TableA] as [t1]
where not exists(
select 1 from [TableB] as [t2]
where [t1].[column1] = [t2].[column1]
and [t1].[column2] = [t2].[column2]
);

You can also use LEFT JOIN with IS NULL as below:
INSERT INTO tableb
SELECT a.*
FROM tablea a
LEFT JOIN tableb b ON b.column1 = a.column1
AND b.column2 = a.column2
WHERE b.column1 IS NULL

Referencing table name from INSERT statement in SELECT part will not work, because INSERT is not a query itself, but nothing prevents to query destination table in SELECT, which produces the data set to be inserted.
INSERT INTO tableb (column1, column2)
SELECT column1, column2 FROM tablea
WHERE NOT EXISTS (SELECT * FROM tableb
WHERE tableb.column1 = tablea.column1
AND tabled.column2 = tablea.column2)

Try this :
Via Insert with Select statement with Not Exists
Declare #table1 table(Id int , EmpName varchar(100) )
Declare #table2 table(Id int , EmpName varchar(100) )
Insert into #table1 values (1,'Ajay'), (2, 'Tarak') , (3,'Nirav')
Insert into #table2 values (1,'Ajay')
--Insert into table b (select * from table a) where not exists tableb.column1 = tablea.column1 and tabled.column2 = tablea.column2
INSERT INTO #table2 (id, empname)
select id, empname from #table1
EXCEPT
SELECT id, empname from #table2
Select * from #table1
Select * from #table2
Via merge
Insert into #table2 values (4,'Prakash')
Select * from #table1
Select * from #table2
Declare #Id int , #EmpName varchar(100)
;with data as (select #id as id, #empname as empname from #table1)
merge #table2 t
using data s
on s.id = t.id
and s.empname = t.empname
when not matched by target
then insert (id, empname) values (s.id, s.empname);
Select * from #table1
Select * from #table2

Constructing single select statement that returns order depends on the value of a column in SQL Server

Table1
Id bigint primary key identity(1,1)
Status nvarchar(20)
Insert dummy data
Insert into Table1 values ('Open') --1
Insert into Table1 values ('Open') --2
Insert into Table1 values ('Grabbed') --3
Insert into Table1 values ('Closed') --4
Insert into Table1 values ('Closed') --5
Insert into Table1 values ('Open') --6
How would I construct a single select statement which orders the data where records with 'Grabbed' status is first, followed by 'Closed', followed by 'Open' in SQL Server
Output:
Id Status
3 Grabbed
4 Closed
5 Closed
1 Open
2 Open
6 Open

I think you need something like this:
select *
from yourTable
order by case when Status = 'Grabbed' then 1
when Status = 'Closed' then 2
when Status = 'Open' then 3
else 4 end
, Id;
[SQL Fiddle Demo]
Another way is to using CTE like this:
;with cte as (
select 'Grabbed' [Status], 1 [order]
union all select 'Closed', 2
union all select 'Open', 3
)
select t.*
from yourTable t
left join cte
on t.[Status] = cte.[Status]
order by cte.[order], Id;
[SQL Fiddle Demo]

This could be done much better with a properly normalized design:
Do not store your Status as a textual content. Just imagine a typo (a row with Grabed)...
Further more a lookup table allows you to add side data, e.g. a sort order.
CREATE TABLE StatusLookUp(StatusID INT IDENTITY PRIMARY KEY /*you should name your constraints!*/
,StatusName VARCHAR(100) NOT NULL
,SortRank INT NOT NULL)
INSERT INTO StatusLookUp VALUES
('Open',99) --ID=1
,('Closed',50)--ID=2
,('Grabbed',10)--ID=3
CREATE TABLE Table1(Id bigint primary key identity(1,1) /*you should name your constraints!*/
,StatusID INT FOREIGN KEY REFERENCES StatusLookUp(StatusID));
Insert into Table1 values (1) --1
Insert into Table1 values (1) --2
Insert into Table1 values (3) --3
Insert into Table1 values (2) --4
Insert into Table1 values (2) --5
Insert into Table1 values (1) --6
SELECT *
FROM Table1 AS t1
INNER JOIN StatusLookUp AS s ON t1.StatusID=s.StatusID
ORDER BY s.SortRank;

I find that the simplest method uses a string:
order by charindex(status, 'Grabbed,Closed,Open')
or:
order by charindex(',' + status + ',', ',Grabbed,Closed,Open,')
If you are going to put values in the query, I think the easiest way uses values():
select t1.*
from t1 left join
(values ('Grabbed', 1), ('Closed', 2), ('Open', 3)) v(status, priority)
on t1.status = v.status
order by coalesce(v.priority, 4);
Finally. This need suggests that you should have a reference table for statuses. Rather than putting the string name in other tables, put an id. The reference table can have the priority as well as other information.

Try this:
select Id,status from tablename where status='Grabbed'
union
select Id,status from tablename where status='Closed'
union
select Id,status from tablename where status='Open'

Merging two parent > child table sets

I need to get the data in two parent > child table sets merged/combined into a third parent > child table.
The tables look like this:
The only difference in the three sets of tables is that TableC has a TableType column to help discern the difference between a TableA record and a TableB record.
My first thought was to use a cursor.. Here's code to create the table structure, insert some records, and then merge the data together. It works very well, sooooo....
--Create the tables
CREATE TABLE TableA
(
ID int not null identity primary key,
Name VARCHAR(30)
);
CREATE TABLE TableAChild
(
ID int not null identity primary key,
Parent int not null,
Name VARCHAR(30),
CONSTRAINT FK_A FOREIGN KEY (Parent) REFERENCES TableA(ID)
);
CREATE TABLE TableB
(
ID int not null identity primary key,
Name VARCHAR(30)
);
CREATE TABLE TableBChild
(
ID int not null identity primary key,
Parent int not null,
Name VARCHAR(30),
CONSTRAINT FK_B FOREIGN KEY (Parent) REFERENCES TableB(ID)
);
CREATE TABLE TableC
(
ID int not null identity primary key,
TableType VARCHAR(1),
Name VARCHAR(30)
);
CREATE TABLE TableCChild
(
ID int not null identity primary key,
Parent int not null,
Name VARCHAR(30),
CONSTRAINT FK_C FOREIGN KEY (Parent) REFERENCES TableC(ID)
);
-- Insert some test records..
INSERT INTO TableA (Name) Values ('A1')
INSERT INTO TableAChild (Name, Parent) VALUES ('A1Child', SCOPE_IDENTITY())
INSERT INTO TableB (Name) Values ('B1')
INSERT INTO TableBChild (Name, Parent) VALUES ('B1Child', SCOPE_IDENTITY())
-- Needed throughout..
DECLARE #ID INT
-- Merge TableA and TableAChild into TableC and TableCChild
DECLARE TableACursor CURSOR
-- Get the primary key from TableA
FOR SELECT ID FROM TableA
OPEN TableACursor
FETCH NEXT FROM TableACursor INTO #ID
WHILE ##FETCH_STATUS = 0
BEGIN
-- INSERT INTO SELECT the parent record into TableC, being sure to specify a TableType
INSERT INTO TableC (Name, TableType) SELECT Name, 'A' FROM TableA WHERE ID = #ID
-- INSERT INTO SELECT the child record into TableCChild using the parent ID of the last row inserted (SCOPE_IDENTITY())
-- and the current record from the cursor (#ID).
INSERT INTO TableCChild(Name, Parent) SELECT Name, SCOPE_IDENTITY() FROM TableAChild WHERE Parent = #ID
FETCH NEXT FROM TableACursor INTO #ID
END;
CLOSE TableACursor
DEALLOCATE TableACursor
-- Repeat for TableB
DECLARE TableBCursor CURSOR
FOR SELECT ID FROM TableB
OPEN TableBCursor
FETCH NEXT FROM TableBCursor INTO #ID
WHILE ##FETCH_STATUS = 0
BEGIN
INSERT INTO TableC (Name, TableType) SELECT Name, 'B' FROM TableB WHERE ID = #ID
INSERT INTO TableCChild(Name, Parent) SELECT Name, SCOPE_IDENTITY() FROM TableBChild WHERE Parent = #ID
FETCH NEXT FROM TableBCursor INTO #ID
END;
CLOSE TableBCursor
DEALLOCATE TableBCursor
Now, my question(s):
I've always been told that cursors are bad. But I couldn't find another way of doing it. I'm wondering if there's some way to do that with a CTE?
If the cursor is appropriate in this situation, how did I do? Is there a better way of doing what I did? It doesn't look very DRY to me, but I'm no SQL expert.
Lastly, if you want to re-run the query above, here's a small script to delete the tables that were created.
DROP TABLE TableAChild
DROP TABLE TableBChild
DROP TABLE TableCChild
DROP TABLE TableA
DROP TABLE TableB
DROP TABLE TableC
The correct result should look like:

You can use merge as described by Adam Machanic in Dr. OUTPUT or: How I Learned to Stop Worrying and Love the MERGE and in this question to get a mapping between the new identity value and the old primary key value in a table variable and the use that when you insert to your child tables.
declare #T table(ID int, IDC int);
merge dbo.TableC as C
using dbo.TableA as A
on 0 = 1
when not matched by target then
insert (TableType, Name) values('A', A.Name)
output A.ID, inserted.ID into #T(ID, IDC);
insert into dbo.TableCChild(Parent, Name)
select T.IDC, AC.Name
from dbo.TableAChild as AC
inner join #T as T
on AC.Parent = T.ID;
delete from #T;
merge dbo.TableC as C
using dbo.TableB as B
on 0 = 1
when not matched by target then
insert (TableType, Name) values('B', B.Name)
output B.ID, inserted.ID into #T(ID, IDC);
insert into dbo.TableCChild(Parent, Name)
select T.IDC, BC.Name
from dbo.TableBChild as BC
inner join #T as T
on BC.Parent = T.ID;
SQL Fiddle

Here is one way to do this without a cursor or other RBAR type stuff.
ALTER TABLE TableC ADD LegacyID INT
GO
INSERT INTO TableC (TableType, Name, LegacyID)
SELECT 'A', Name, ID
FROM TableA
INSERT TableCChild
SELECT C.ID, AC.Name
FROM TableAChild AC
JOIN TableA A ON A.Id = AC.ID
JOIN TableC C ON C.LegacyID = A.ID AND C.TableType = 'A'
INSERT INTO TableC (TableType, Name, LegacyID)
SELECT 'B', Name, ID
FROM TableB
INSERT TableCChild
SELECT C.ID, AC.Name
FROM TableBChild AC
JOIN TableB A ON A.Id = AC.ID
JOIN TableC C ON C.LegacyID = A.ID AND C.TableType = 'B'
ALTER TABLE TableC DROP COLUMN LegacyID
GO

You can use a map table to link the old and new ids together based on some key.
In my example, I am using the order of insertion into TableC.
Create a map table with an identity column.
Add data in TableC table based on order of ID of TableA and get the inserted ids in the map
Use the same order of TableA.id to get a ROWNUMBER() and match it with the identity column of the map table and update the old_id in map to match TableA.id with TableC.id .
Use the map to insert into the TableCChild table
Truncate the map and rinse and repeat for other tables.
Sample Query
CREATE TABLE #map(id int identity,new_id int,old_id int);
INSERT INTO TableC
(
TableType,
Name
)output inserted.id into #map(new_id)
SELECT 'A',Name
FROM TableA
ORDER BY ID
update m
set m.old_id = ta.id
FROM #map m
inner join
(
select row_number()OVER(order by id asc) rn,id
from tableA
)ta on ta.rn = m.id
INSERT INTO TableCChild (Name, Parent)
SELECT Name,M.new_ID
FROM #Map M
INNER JOIN TableAChild TA ON M.old_id = TA.Parent
TRUNCATE TABLE #map
INSERT INTO TableC
(
TableType,
Name
)output inserted.id into #map(new_id)
SELECT 'B',Name
FROM TableB
ORDER BY ID
update m
set m.old_id = tb.id
FROM #map m
inner join
(
select row_number()OVER(order by id asc) rn,id
from tableB
)tb on tb.rn = m.id
INSERT INTO TableCChild (Name, Parent)
SELECT Name,M.new_ID
FROM #Map M
INNER JOIN TableBChild TB ON M.old_id = TB.Parent
DROP TABLE #Map

I just wrote the following SQL to do it if the Name is unique in TableA and unique in TableB
INSERT INTO TableCChild
(
Parent,
NAME
)
SELECT tc.ID,
ta.Name
FROM TableAChild AS ta
JOIN TableA a
ON a.ID = ta.Parent
JOIN TableC AS tc
ON tc.Name = a.Name
AND tc.TableType = 'A'
UNION
SELECT tc.ID,
tb.Name
FROM TableBChild AS tb
JOIN TableB b
ON b.ID = tb.Parent
JOIN TableC AS tc
ON tc.Name = b.Name
AND tc.TableType = 'B'
If Name is not unique and only the ID is the Unique Identifier then I would add the LegacyId as suggested and the code would then be as follows
/* Change Table C to Have LegacyId as well and this is used to find the New Key for Inserts
CREATE TABLE TableC
(
ID INT NOT NULL IDENTITY PRIMARY KEY,
TableType VARCHAR(1),
LegacyId INT,
NAME VARCHAR(30)
);
*/
INSERT INTO TableC (Name, TableType, LegacyId)
SELECT DISTINCT NAME,
'A',
Id
FROM TableA
UNION
SELECT DISTINCT NAME,
'B',
Id
FROM TableB
INSERT INTO TableCChild
(
Parent,
NAME
)
SELECT tc.ID,
ta.Name
FROM TableAChild AS ta
JOIN TableA a
ON a.ID = ta.Parent
JOIN TableC AS tc
ON tc.LegacyId = a.Id
AND tc.TableType = 'A'
UNION
SELECT tc.ID,
tb.Name
FROM TableBChild AS tb
JOIN TableB b
ON b.ID = tb.Parent
JOIN TableC AS tc
ON tc.LegacyId = b.Id
AND tc.TableType = 'B'

We can reach this by turning the Identity column off till we finish the insertion like the following example.
--Create the tables
CREATE TABLE TableA
(
ID int not null identity primary key,
Name VARCHAR(30)
);
CREATE TABLE TableAChild
(
ID int not null identity primary key,
Parent int not null,
Name VARCHAR(30),
CONSTRAINT FK_A FOREIGN KEY (Parent) REFERENCES TableA(ID)
);
CREATE TABLE TableB
(
ID int not null identity primary key,
Name VARCHAR(30)
);
CREATE TABLE TableBChild
(
ID int not null identity primary key,
Parent int not null,
Name VARCHAR(30),
CONSTRAINT FK_B FOREIGN KEY (Parent) REFERENCES TableB(ID)
);
CREATE TABLE TableC
(
ID int not null identity primary key,
TableType VARCHAR(1),
Name VARCHAR(30)
);
CREATE TABLE TableCChild
(
ID int not null identity primary key,
Parent int not null,
Name VARCHAR(30),
CONSTRAINT FK_C FOREIGN KEY (Parent) REFERENCES TableC(ID)
);
-- Insert some test records..
INSERT INTO TableA (Name) Values ('A1')
INSERT INTO TableAChild (Name, Parent) VALUES ('A1Child', SCOPE_IDENTITY())
INSERT INTO TableB (Name) Values ('B1')
INSERT INTO TableBChild (Name, Parent) VALUES ('B1Child', SCOPE_IDENTITY())
SET IDENTITY_INSERT TableC ON
INSERT INTO TableC(ID, TableType, Name)
SELECT ID, 'A', Name FROM TableA
INSERT INTO TableCChild(Parent, Name)
SELECT Parent, Name FROM TableAChild
DECLARE #MAXID INT
SELECT #MAXID = MAX(ID) FROM TableC
PRINT #MAXID
SET IDENTITY_INSERT TableC ON
INSERT INTO TableC(ID, TableType, Name)
SELECT ID + #MAXID, 'B', Name FROM TableB
SET IDENTITY_INSERT TableC OFF
INSERT INTO TableCChild(Parent, Name)
SELECT Parent + #MAXID, Name FROM TableBChild
SET IDENTITY_INSERT TableC OFF
SELECT * FROM TableC
SELECT * FROM TableCChild
DROP TABLE TableAChild
DROP TABLE TableBChild
DROP TABLE TableCChild
DROP TABLE TableA
DROP TABLE TableB
DROP TABLE TableC

If you need to insert records in third table TableC and TableCChild for later use then it's fine to insert data in these tables but if you only need this table data to use it in your stored procedure for the time being then you can also just work with first two tables to get the desired result.
select * from (
select a.ID,'A' as TableType,a.Name from TableA a inner join TableAChild b on a.ID=b.ID
union
select a.ID,'B' as TableType,a.Name from TableB a inner join TableBChild b on a.ID=b.ID) TableC
Similarly get TableCChild
select * from
(
select b.ID,b.Parent,b.Name from TableA a inner join TableAChild b on a.ID=b.ID
union
select b.ID,b.Parent,b.Name from TableB a inner join TableBChild b on a.ID=b.ID) TableCChild
And if you have to insert in TableC and TableCChild then you have to recreate TableC with primary key on ID and TableType, and turn off the identity for ID column.

Delete multiple duplicate rows in table

I have multiple groups of duplicates in one table (3 records for one, 2 for another, etc) - multiple rows where more than 1 exists.
Below is what I came up with to delete them, but I have to run the script for however many duplicates there are:
set rowcount 1
delete from Table
where code in (
select code from Table
group by code
having (count(code) > 1)
)
set rowcount 0
This works well to a degree. I need to run this for every group of duplicates, and then it only deletes 1 (which is all I need right now).

If you have a key column on the table, then you can use this to uniquely identify the "distinct" rows in your table.
Just use a sub query to identify a list of ID's for unique rows and then delete everything outside of this set. Something along the lines of.....
create table #TempTable
(
ID int identity(1,1) not null primary key,
SomeData varchar(100) not null
)
insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData3')
insert into #TempTable(SomeData) values('someData4')
select * from #TempTable
--Records to be deleted
SELECT ID
FROM #TempTable
WHERE ID NOT IN
(
select MAX(ID)
from #TempTable
group by SomeData
)
--Delete them
DELETE
FROM #TempTable
WHERE ID NOT IN
(
select MAX(ID)
from #TempTable
group by SomeData
)
--Final Result Set
select * from #TempTable
drop table #TempTable;
Alternatively you could use a CTE for example:
WITH UniqueRecords AS
(
select MAX(ID) AS ID
from #TempTable
group by SomeData
)
DELETE A
FROM #TempTable A
LEFT outer join UniqueRecords B on
A.ID = B.ID
WHERE B.ID IS NULL

It is frequently more efficient to copy unique rows into temporary table,
drop source table, rename back temporary table.
I reused the definition and data of #TempTable, called here as SrcTable instead, since it is impossible to rename temporary table into a regular one)
create table SrcTable
(
ID int identity(1,1) not null primary key,
SomeData varchar(100) not null
)
insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData3')
insert into SrcTable(SomeData) values('someData4')
by John Sansom in previous answer
-- cloning "unique" part
SELECT * INTO TempTable
FROM SrcTable --original table
WHERE id IN
(SELECT MAX(id) AS ID
FROM SrcTable
GROUP BY SomeData);
GO;
DROP TABLE SrcTable
GO;
sys.sp_rename 'TempTable', 'SrcTable'

You can alternatively use ROW_NUMBER() function to filter out duplicates
;WITH [CTE_DUPLICATES] AS
(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY SomeData)
FROM #TempTable
)
DELETE FROM [CTE_DUPLICATES] WHERE RN > 1

SET ROWCOUNT 1
DELETE Table
FROM Table a
WHERE (SELECT COUNT(*) FROM Table b WHERE b.Code = a.Code ) > 1
WHILE ##rowcount > 0
DELETE Table
FROM Table a
WHERE (SELECT COUNT(*) FROM Table b WHERE b.Code = a.Code ) > 1
SET ROWCOUNT 0
this will delete all duplicate rows, But you can add attributes if you want to compare according to them .

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to deal with Violation of PRIMARY KEY constraint error - sql

INSERT INTO #tr_TxnDetails SELECT b.pid, b.etc FROM tbl_SomeTableA as a JOIN tbl_SomeTableB as b ON a.etc = b.etc AND a.SomeColumn = b.SomeColumn WHERE b.pid NOT IN (select distinct id from tr_TxnDetails) --<<-- INSERT INTO tr_TxnDetails ([id], [etc]) SELECT a.[id], a.[etc] FROM #tr_TxnDetails as a

Related

Looping over the insert data into multiple tables

Insert to table based on another table if doesn’t exist

Constructing single select statement that returns order depends on the value of a column in SQL Server

Merging two parent > child table sets

Delete multiple duplicate rows in table

Categories

Resources