removing duplicates from table without using temporary table


I have a table (TableA) with contents like this:
Col1
-----
A
B
B
B
C
C
D
I want to remove just the duplicate values without using a temporary table in Microsoft SQL Server. Can anyone help me?
The final table should look like this:
Col1
-----
A
B
C
D
thanks :)

WITH TableWithKey AS (
SELECT ROW_NUMBER() OVER (ORDER BY Col1) As id, Col1 As val
FROM TableA
)
DELETE FROM TableWithKey WHERE id NOT IN
(
SELECT MIN(id) FROM TableWithKey
GROUP BY val
)

Can you use the row_number() function (http://msdn.microsoft.com/en-us/library/ms186734.aspx) to partition by the columns you're looking for dupes on, and delete where row number isn't 1?
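A minimal sketch of that suggestion, assuming the data from the question. The table variable @TableA is only a stand-in for the real TableA so the example can be run on its own, and the names Numbered and rn are made up for illustration:

```sql
-- Stand-in for TableA, filled with the values from the question.
DECLARE @TableA TABLE (Col1 varchar(10));
INSERT INTO @TableA (Col1)
VALUES ('A'), ('B'), ('B'), ('B'), ('C'), ('C'), ('D');

-- Number the rows inside each group of equal Col1 values,
-- then delete every row that is not the first of its group.
WITH Numbered AS (
    SELECT Col1,
           ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col1) AS rn
    FROM @TableA
)
DELETE FROM Numbered
WHERE rn > 1;

-- One row per distinct value remains.
SELECT Col1 FROM @TableA ORDER BY Col1;
```

Because the row number restarts at 1 for every distinct value, there is no need for a second pass over the CTE to find the row to keep.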

I completely agree that having a unique identifier will save you a lot of time.
But if you can't use one (or if this is purely hypothetical), here's an alternative: Determine the number of rows to delete (the count of each distinct value -1), then loop through and delete top X for each distinct value.
Note that I'm not responsible for the number of kittens that are killed every time you use dynamic SQL.
declare @name varchar(50)
declare @sql varchar(max)
declare @numberToDelete varchar(10)
declare List cursor for
    select name, COUNT(name)-1 from #names group by name
OPEN List
FETCH NEXT FROM List
    INTO @name, @numberToDelete
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @numberToDelete > 0
    BEGIN
        set @sql = 'delete top(' + @numberToDelete + ') from #names where name=''' + @name + ''''
        print @sql
        exec(@sql)
    END
    FETCH NEXT FROM List INTO @name, @numberToDelete
END
CLOSE List
DEALLOCATE List
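The same loop can probably be written without dynamic SQL, since TOP accepts a variable when it is wrapped in parentheses. A sketch under that assumption, using a throwaway #names table with sample values so that it runs on its own (the multi-row VALUES syntax needs SQL Server 2008 or later):

```sql
-- Sample data; #names mirrors the table used in the answer above.
CREATE TABLE #names (name varchar(50));
INSERT INTO #names (name)
VALUES ('A'), ('B'), ('B'), ('B'), ('C'), ('C'), ('D');

DECLARE @name varchar(50), @numberToDelete int;

-- STATIC takes a snapshot, so the deletes below do not disturb the cursor.
DECLARE List CURSOR LOCAL STATIC FOR
    SELECT name, COUNT(*) - 1
    FROM #names
    GROUP BY name
    HAVING COUNT(*) > 1;

OPEN List;
FETCH NEXT FROM List INTO @name, @numberToDelete;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Remove all but one copy of the current value.
    DELETE TOP (@numberToDelete) FROM #names WHERE name = @name;
    FETCH NEXT FROM List INTO @name, @numberToDelete;
END
CLOSE List;
DEALLOCATE List;

SELECT name FROM #names ORDER BY name;
DROP TABLE #names;
```

Passing the value as a variable also avoids building the name into a string, so quotes inside a name cannot break the statement.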
Another alternative would be to create a view with a generated identity. That way you could map the values to a unique identifier (allowing for a conventional delete) without making a permanent addition to your table.

Select the grouped data into a temp table, truncate the original table, and then move the data back into it.
A second solution, which I am not sure will work but you can try: open the table directly in SQL Server Management Studio and use CTRL + DEL on the selected rows to delete them. That is going to be extremely slow, because you have to delete every single row by hand.

You can remove duplicate rows using a cursor and DELETE .. WHERE CURRENT OF.
CREATE TABLE Client ([name] varchar(100))
INSERT Client VALUES('Bob')
INSERT Client VALUES('Alice')
INSERT Client VALUES('Bob')
GO
DECLARE @history TABLE (name varchar(100) not null)
DECLARE @cursor CURSOR, @name varchar(100)
SET @cursor = CURSOR FOR SELECT name FROM Client
OPEN @cursor
FETCH NEXT FROM @cursor INTO @name
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Delete the row if this name was already seen, otherwise remember it
    IF @name IN (SELECT name FROM @history)
        DELETE Client WHERE CURRENT OF @cursor
    ELSE
        INSERT @history VALUES (@name)
    FETCH NEXT FROM @cursor INTO @name
END
CLOSE @cursor
DEALLOCATE @cursor

Related

T-SQL script to update numerous prod tables from temp tables

I am trying to simplify the updating of numerous production tables from temp tables. I currently have a long script that uses the MERGE command, but as I increase the number of tables to be processed, the MERGE code is getting unwieldy. To simplify, my idea is to create a script that will get the table names into a cursor, get the field names for that table into a different cursor, then compare temp and prod values. If different, I will update the prod table.
The table 'TableManifest' has only 1 column called TableName.
So 2 questions
1 - Is this an efficient approach to the problem? I'd love to get other suggestions.
2 - This code fails with the error "Must declare the table variable @Temp_TableName" on the line that starts with 'IF(SELECT..'.
DECLARE @TableName varchar(MAX) -- TARGET Table to update
DECLARE @Temp_TableName varchar(MAX) -- SOURCE Table for compare
DECLARE @ColumnNames varchar(MAX) -- Table Column to compare
-- Create CURSOR of Tables to process
DECLARE C_TableNames CURSOR FOR SELECT TableName FROM TableManifest
OPEN C_TableNames
-- Get 1st table name
FETCH NEXT FROM C_TableNames INTO @TableName
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Create variable for TEMP table name
    SELECT @Temp_TableName = CONCAT('TEMP_',@TableName)
    -- Create CURSOR of Column Names in the table
    DECLARE C_ColumnNames CURSOR FOR SELECT name FROM sys.columns WHERE object_id = OBJECT_ID(@TableName)
    OPEN C_ColumnNames
    -- Get 1st column name
    FETCH NEXT FROM C_ColumnNames INTO @ColumnNames
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- If the column name is not 'ID' and the values are different, update the TARGET table
        IF(SELECT @ColumnNames FROM @Temp_TableName) <> (SELECT @ColumnNames FROM @TableName) AND @ColumnNames <> 'id'
            UPDATE @TableName SET @ColumnNames = (SELECT @ColumnNames from @Temp_TableName)
        -- Get next column name
        FETCH NEXT FROM C_ColumnNames INTO @ColumnNames
    END
    -- Get next table name
    FETCH NEXT FROM C_TableNames INTO @TableName
END
-- Clean up
CLOSE C_ColumnNames
DEALLOCATE C_ColumnNames
CLOSE C_TableNames
DEALLOCATE C_TableNames
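On the second point, the error most likely comes from using a variable where T-SQL expects an identifier: table and column names cannot be supplied through variables, so they have to be spliced into a dynamic SQL string. A minimal sketch of one such statement follows. The names Customer and Email are hypothetical, and the join on an id column is an assumption based on the 'ID' column mentioned in the comments above:

```sql
DECLARE @TableName  sysname = N'Customer';  -- hypothetical target table
DECLARE @ColumnName sysname = N'Email';     -- hypothetical column to compare
DECLARE @sql nvarchar(max);

-- QUOTENAME brackets each identifier so that odd names cannot break the statement.
SET @sql = N'UPDATE p SET p.' + QUOTENAME(@ColumnName) + N' = t.' + QUOTENAME(@ColumnName)
         + N' FROM ' + QUOTENAME(@TableName) + N' AS p'
         + N' JOIN ' + QUOTENAME(N'TEMP_' + @TableName) + N' AS t ON t.id = p.id'
         + N' WHERE p.' + QUOTENAME(@ColumnName) + N' <> t.' + QUOTENAME(@ColumnName) + N';';

-- Inspect the generated statement first; run it once it looks right.
PRINT @sql;
-- EXEC sys.sp_executesql @sql;
```

Note that `<>` never matches when either side is NULL, so NULL-to-value changes would need an extra condition.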

What is the benefit of iterating updates using cursors in SQL? [duplicate]

I want to use a database cursor; first I need to understand what its use and syntax are, and in which scenario we can use this in stored procedures? Are there different syntaxes for different versions of SQL Server?
When is it necessary to use?

Cursors are a mechanism to explicitly enumerate through the rows of a result set, rather than retrieving it as such.
However, while they may be more comfortable to use for programmers accustomed to writing While Not RS.EOF Do ..., they are typically a thing to be avoided within SQL Server stored procedures if at all possible -- if you can write a query without the use of cursors, you give the optimizer a much better chance to find a fast way to implement it.
In all honesty, I've never found a realistic use case for a cursor that couldn't be avoided, with the exception of a few administrative tasks such as looping over all indexes in the catalog and rebuilding them. I suppose they might have some uses in report generation or mail merges, but it's probably more efficient to do the cursor-like work in an application that talks to the database, letting the database engine do what it does best -- set manipulation.

Cursors are used because a subquery cannot fetch records row by row,
so we use a cursor to fetch the records one at a time.
Example of a cursor:
DECLARE @eName varchar(50), @job varchar(50)
DECLARE MynewCursor CURSOR -- declare the cursor
FOR
    SELECT eName, job FROM emp WHERE deptno = 10
OPEN MynewCursor -- open the cursor
FETCH NEXT FROM MynewCursor
    INTO @eName, @job
WHILE @@FETCH_STATUS = 0
BEGIN
    -- print the current row first, then fetch the next one,
    -- so the last row is not printed twice
    PRINT @eName + ' ' + @job
    FETCH NEXT FROM MynewCursor
        INTO @eName, @job
END
CLOSE MynewCursor
DEALLOCATE MynewCursor
OUTPUT:
ROHIT PRG
jayesh PRG
Rocky prg

A cursor can be used to retrieve data on a row-by-row basis. It acts like a looping statement (i.e. a while or for loop).
To use cursors in SQL procedures, you need to do the following:
1. Declare a cursor that defines a result set.
2. Open the cursor to establish the result set.
3. Fetch the data into local variables as needed from the cursor, one row at a time.
4. Close the cursor when done.
For example:
declare @tab table
(
    Game varchar(15),
    Rollno varchar(15)
)
insert into @tab values('Cricket','R11')
insert into @tab values('VollyBall','R12')
declare @game varchar(20)
declare @rollno varchar(20)
declare cur2 cursor for select Game, Rollno from @tab
open cur2
fetch next from cur2 into @game, @rollno
WHILE @@FETCH_STATUS = 0
begin
    print @game
    print @rollno
    FETCH NEXT FROM cur2 into @game, @rollno
end
close cur2
deallocate cur2

Cursor itself is an iterator (like WHILE). By saying iterator I mean a way to traverse the record set (aka a set of selected data rows) and do operations on it while traversing. Operations could be INSERT or DELETE for example. Hence you can use it for data retrieval for example. Cursor works with the rows of the result set sequentially - row by row. A cursor can be viewed as a pointer to one row in a set of rows and can only reference one row at a time, but can move to other rows of the result set as needed.
This link has a clear explanation of its syntax and contains additional information plus examples.
Cursors can be used in Sprocs too. They are a shortcut that allow you to use one query to do a task instead of several queries. However, cursors recognize scope and are considered undefined out of the scope of the sproc and their operations execute within a single procedure. A stored procedure cannot open, fetch, or close a cursor that was not declared in the procedure.

I would argue you might want to use a cursor when you want to do comparisons of characteristics that are on different rows of the return set, or if you want to write a different output row format than a standard one in certain cases. Two examples come to mind:
One was in a college where each add and drop of a class had its own row in the table. It might have been bad design but you needed to compare across rows to know how many add and drop rows you had in order to determine whether the person was in the class or not. I can't think of a straight forward way to do that with only sql.
Another example is writing a journal total line for GL journals. You get an arbitrary number of debits and credits in your journal, you have many journals in your rowset return, and you want to write a journal total line every time you finish a journal to post it into a General Ledger. With a cursor you could tell when you left one journal and started another and have accumulators for your debits and credits and write a journal total line (or table insert) that was different than the debit/credit line.

CREATE PROCEDURE [dbo].[SP_Data_newUsingCursor]
(
    @SCode NVARCHAR(MAX) = NULL,
    @Month INT = NULL,
    @Year INT = NULL,
    @Msg NVARCHAR(MAX) = NULL OUTPUT
)
AS
BEGIN
    DECLARE @SEPERATOR AS VARCHAR(1)
    DECLARE @SP INT
    DECLARE @VALUE VARCHAR(MAX)
    SET @SEPERATOR = ','
    -- Split the comma-separated site codes into a temp table
    CREATE TABLE #TempSiteCode (id int NOT NULL)
    WHILE PATINDEX('%' + @SEPERATOR + '%', @SCode) <> 0
    BEGIN
        SELECT @SP = PATINDEX('%' + @SEPERATOR + '%', @SCode)
        SELECT @VALUE = LEFT(@SCode, @SP - 1)
        SELECT @SCode = STUFF(@SCode, 1, @SP, '')
        INSERT INTO #TempSiteCode (id) VALUES (@VALUE)
    END
    -- Keep the value that follows the last separator, if there is one
    IF LEN(@SCode) > 0
        INSERT INTO #TempSiteCode (id) VALUES (@SCode)
    DECLARE
        @EmpCode bigint = NULL,
        @EmpName nvarchar(50) = NULL
    CREATE TABLE #TempEmpDetail
    (
        EmpCode bigint
    )
    CREATE TABLE #TempFinalDetail
    (
        EmpCode bigint,
        EmpName nvarchar(500)
    )
    DECLARE @TempSiteFinalCursor CURSOR
    INSERT INTO #TempEmpDetail (EmpCode)
    SELECT DISTINCT EmpCode FROM tbl_Att_MSCode
    WHERE tbl_Att_MSCode.SiteCode IN (SELECT id FROM #TempSiteCode)
      AND fldMonth = @Month AND fldYear = @Year
    -- Walk the employee codes one by one and look up each name
    SET @TempSiteFinalCursor = CURSOR FOR SELECT EmpCode FROM #TempEmpDetail
    OPEN @TempSiteFinalCursor
    FETCH NEXT FROM @TempSiteFinalCursor INTO @EmpCode
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @EmpName = (SELECT EmpName FROM tbl_Employees WHERE EmpCode = @EmpCode)
        INSERT INTO #TempFinalDetail (EmpCode, EmpName)
        VALUES (@EmpCode, @EmpName)
        FETCH NEXT FROM @TempSiteFinalCursor INTO @EmpCode
    END
    SELECT EmpCode,
           EmpName
    FROM #TempFinalDetail
    CLOSE @TempSiteFinalCursor
    DEALLOCATE @TempSiteFinalCursor
    DROP TABLE #TempSiteCode
    DROP TABLE #TempEmpDetail
    DROP TABLE #TempFinalDetail
END

Function to select rows from multiple tables based on conditions from different tables

Can anyone please help with this query?
I'm using SQL Server 2008. The objective is to select rows from multiple tables based on conditions and values held in other tables.
I have table1, table2, ... tableN, each with the columns ID, ColumnName and ColumnValue. These are the tables I need to select rows from, based on conditions from the tables below:
A Control table with columns Number, Function and Enable
A Repository table with columns Function and tableName
I need to pass Number and ID as parameters, get all Function values from the Control table that have Enable = 1, and use those Function values to collect the tableNames from the Repository table. Then, for each tableName returned from the Repository table, get all rows matching the ID value.

The way I understand it you have two tables with schema like this:
table Control (Number int, Function nvarchar, Enable bit)
table Repository (Function nvarchar, TableName nvarchar)
Control and Repository are related via the Function column.
You also have a number of other tables, and the names of those tables are saved in the Repository table. All those tables have an ID column.
You want to get those table names based on a number and then select from all those tables by their ID column.
If that is indeed what you are trying to do, the code below should be enough to solve your problem.
declare
    -- arguments
    @id int = 123,
    @number int = 123456,
    -- helper variables we'll use along the way
    @function nvarchar(4000),
    @tableName nvarchar(256),
    @query nvarchar(4000)
-- create cursor to iterate over every returned row one by one
-- (Function is a reserved word, so it is wrapped in brackets)
declare @tables cursor
set @tables = cursor fast_forward
for
    select
        c.[Function],
        r.TableName
    from [Control] as c
    join [Repository] as r on r.[Function] = c.[Function]
    where c.Number = @number
    and c.Enable = 1
-- initialise cursor
open @tables
-- get first row into variables
fetch next from @tables
    into @function, @tableName
-- will be 0 as long as fetch next returns new values
while @@fetch_status = 0
begin
    -- build a dynamic query; the int must be cast before it can be concatenated
    set @query = 'select * from ' + @tableName + ' where ID = ' + cast(@id as nvarchar(20))
    -- execute dynamic query. you might get permission problems
    -- dynamic queries are best to avoid, but I don't think there's another solution for this
    exec(@query)
    -- get next row
    fetch next from @tables
        into @function, @tableName
end
-- destroy cursor
close @tables
deallocate @tables

process each row in table in stored procedure using cursor

My scenario is a bit different. What I am doing in my stored procedure is:
Create a temp table and insert rows into it using a cursor
Create Table #_tempRawFeed
(
Code Int Identity,
RawFeed VarChar(Max)
)
Insert data into the temp table using a cursor
Set @GetATM = Cursor Local Forward_Only Static For
    Select DeviceCode,ReceivedOn
    From RawStatusFeed
    Where C1BL=1 AND Processed=0
    Order By ReceivedOn Desc
Open @GetATM
Fetch Next
    From @GetATM Into @ATM_ID,@Received_On
While @@FETCH_STATUS = 0
Begin
    Set @Raw_Feed=@ATM_ID+' '+Convert(VarChar,@Received_On,121)+' '+'002333'+' '+@ATM_ID+' : Bills - Cassette Type 1 - LOW '
    Insert Into #_tempRawFeed(RawFeed) Values(@Raw_Feed)
    Fetch Next
        From @GetATM Into @ATM_ID,@Received_On
End
Now I have to process each row in the temp table using another cursor
DECLARE @RawFeed VarChar(Max)
DECLARE Push_Data CURSOR FORWARD_ONLY LOCAL STATIC
FOR SELECT RawFeed
    FROM #_tempRawFeed
OPEN Push_Data
FETCH NEXT FROM Push_Data INTO @RawFeed
WHILE @@FETCH_STATUS = 0
BEGIN
    /*
    What should I write here to retrieve each row one at a time?
    One row should get stored in the variable; in the next iteration the previous value should be discarded.
    */
    FETCH NEXT FROM Push_Data INTO @RawFeed
END
CLOSE Push_Data
DEALLOCATE Push_Data
Drop Table #_tempRawFeed
What should I write inside BEGIN to retrieve each row one at a time?
One row should get stored in the variable; in the next iteration the previous value should be discarded.

Regarding your last question, if what you are really intending to do within your last cursor is to concatenate RawFeed column values into one variable, you don't need cursors at all. You can use the following (adapted from your SQL Fiddle code):
CREATE TABLE #_tempRawFeed
(
    Code Int IDENTITY,
    RawFeed VarChar(MAX)
)
INSERT INTO #_tempRawFeed(RawFeed) VALUES('SAGAR')
INSERT INTO #_tempRawFeed(RawFeed) VALUES('Nikhil')
INSERT INTO #_tempRawFeed(RawFeed) VALUES('Deepali')
DECLARE @RawFeed VarChar(MAX)
-- Append each row's value to the variable, separated by commas
SELECT @RawFeed = COALESCE(@RawFeed + ', ', '') + ISNULL(RawFeed, '')
FROM #_tempRawFeed
SELECT @RawFeed
DROP TABLE #_tempRawFeed
More on concatenating different row values into a single string here: Concatenate many rows into a single text string?
I am pretty sure that you can avoid using the first cursor as well. Please avoid using cursors, since they really hurt performance. The same result can be achieved using set-based operations.
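As a rough illustration of the set-based route for the first cursor, the rows can be built and inserted in a single INSERT ... SELECT. The sketch below uses a demo table #RawStatusFeedDemo in place of RawStatusFeed; its column types and sample rows are assumptions, since the question does not show them:

```sql
-- Demo stand-in for RawStatusFeed (column types are assumed).
CREATE TABLE #RawStatusFeedDemo
(
    DeviceCode varchar(20),
    ReceivedOn datetime,
    C1BL       bit,
    Processed  bit
);
INSERT INTO #RawStatusFeedDemo VALUES ('ATM001', '20130101 10:00:00', 1, 0);
INSERT INTO #RawStatusFeedDemo VALUES ('ATM002', '20130102 11:30:00', 1, 0);
INSERT INTO #RawStatusFeedDemo VALUES ('ATM003', '20130103 09:15:00', 0, 0);

CREATE TABLE #_tempRawFeed
(
    Code    int IDENTITY,
    RawFeed varchar(max)
);

-- One statement builds the same string the cursor loop assembled row by row.
INSERT INTO #_tempRawFeed (RawFeed)
SELECT DeviceCode + ' ' + CONVERT(varchar(30), ReceivedOn, 121) + ' ' + '002333' + ' '
       + DeviceCode + ' : Bills - Cassette Type 1 - LOW '
FROM #RawStatusFeedDemo
WHERE C1BL = 1 AND Processed = 0
ORDER BY ReceivedOn DESC;

SELECT Code, RawFeed FROM #_tempRawFeed ORDER BY Code;

DROP TABLE #_tempRawFeed;
DROP TABLE #RawStatusFeedDemo;
```

Only the rows with C1BL = 1 and Processed = 0 are inserted, matching the filter of the original cursor query.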

Dynamic cursor used in a block in TSQL?

I have the following T-SQL code:
-- 1. define a cursor
DECLARE c_Temp CURSOR FOR
    SELECT name FROM employees;
DECLARE @name varchar(100);
-- 2. open it
OPEN c_Temp;
-- 3. first fetch
FETCH NEXT FROM c_Temp INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    print @name;
    FETCH NEXT FROM c_Temp INTO @name; -- fetch again in a loop
END
-- 4. close it
....
I use the name value only in a loop block. Here I have to
define a cursor variable,
open it,
fetch twice and
close it.
In PL/SQL, the loop can be like this:
FOR rRec IN (SELECT name FROM employees) LOOP
DBMS_OUTPUT.put_line(rRec.name);
END LOOP;
It is much simpler than my T-SQL code. There is no need to define a cursor: it is created dynamically and is accessible only within the loop block (much like a C# for loop). Is there something similar in T-SQL?

Something along these lines might work for you, although it depends on having an ID column or some other unique identifier
Declare @au_id Varchar(20)
Select @au_id = Min(au_id) from authors
While @au_id IS NOT NULL
Begin
    Select au_id, au_lname, au_fname from authors Where au_id = @au_id
    Select @au_id = min(au_id) from authors where au_id > @au_id
End

Cursors are evil in Sql Server as they can really degrade performance - my favoured approach is to use a Table Variable (>= Sql Server 2005) with an auto inc ID column:
Declare @LoopTable as table (
    ID int identity(1,1),
    column1 varchar(10),
    column2 datetime
)
insert into @LoopTable (column1, column2)
select name, startdate from employees
declare @count int
declare @max int
select @max = max(ID) from @LoopTable
select @count = 1
while @count <= @max
begin
    --do something here using row number '@count' from @LoopTable
    set @count = @count + 1
end
It looks pretty long-winded, but it works in any situation and should be far more lightweight than a cursor.
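To make the "do something here" step concrete, here is a small self-contained sketch that reads the row whose ID equals the counter into variables and prints it. The sample rows and the variables @column1 and @column2 are made up for illustration:

```sql
DECLARE @LoopTable TABLE (
    ID int IDENTITY(1,1),
    column1 varchar(10),
    column2 datetime
);
-- Sample rows standing in for "select name, startdate from employees".
INSERT INTO @LoopTable (column1, column2) VALUES ('Alice', '20200101');
INSERT INTO @LoopTable (column1, column2) VALUES ('Bob', '20210615');

DECLARE @count int, @max int;
DECLARE @column1 varchar(10), @column2 datetime;

SELECT @max = MAX(ID) FROM @LoopTable;
SET @count = 1;

WHILE @count <= @max
BEGIN
    -- Pick up the row that belongs to the current counter value.
    SELECT @column1 = column1, @column2 = column2
    FROM @LoopTable
    WHERE ID = @count;

    PRINT @column1 + ' ' + CONVERT(varchar(10), @column2, 120);

    SET @count = @count + 1;
END
```

If the table is empty, @max stays NULL and the loop body never runs.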

Since you are coming from an Oracle background where cursors are used frequently, you may not be aware that in SQL Server cursors are performance killers. Depending on what you are actually doing (surely not just printing the variable), there may be a much faster set-based solution.

In some cases, it's also possible to use a trick like this one:
DECLARE @name VARCHAR(MAX)
SELECT @name = ISNULL(@name + CHAR(13) + CHAR(10), '') + name
FROM employees
PRINT @name
This prints a list of employee names, one per line.
It can also be used to build a comma-separated string: just replace + CHAR(13) + CHAR(10) with + ', '

Why not simply return the recordset using a SELECT statement? I assume the object is to copy and paste the values in the UI (based on the fact that you are simply printing the output). In Management Studio you can copy and paste from the grid, or press Ctrl+T and then run the query to return the results in plain text.
If you were to run this via an application, the application wouldn't be able to access the printed statements as they are not being returned within a recordset.
