How to delete the Duplicate rows

Table1
ID Date
001 23/02/2009
001 24/02/2009
001 24/02/2009
002 25/02/2009
002 25/02/2009
...
I want to delete the duplicate rows from the above table.
Expected Output
ID Date
001 23/02/2009
001 24/02/2009
002 25/02/2009
...
I need help writing the query.

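Can't remember where I got it, but I used to use this SQL to remove duplicates from a table:
begin tran deduplicate
    -- copy one instance of each distinct row into a temp table
    select DISTINCT *
    into #temp
    from mytable
    -- empty the original table and reload it from the de-duplicated copy
    truncate table mytable
    insert mytable
    select *
    from #temp
    -- show the result, then clean up
    select * from mytable
    drop table #temp
commit tran deduplicate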
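
On SQL Server 2005 and later, another widely used pattern numbers the rows inside each group of duplicates and deletes everything after the first. The following is only a sketch, assuming the table is called Table1 with the columns ID and [Date] shown in the question:

```sql
-- Number the rows within each (ID, Date) group; any row numbered 2 or higher is a duplicate.
-- If this is not the first statement in the batch, end the previous statement with a semicolon.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY ID, [Date]
               ORDER BY (SELECT NULL)  -- rows in a group are identical, so any order will do
           ) AS rn
    FROM Table1
)
DELETE FROM numbered
WHERE rn > 1;
```

Unlike the temp-table approach, this deletes only the surplus rows in place, so the table is never emptied.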

If you search Google you will find plenty of help, e.g.:
http://support.microsoft.com/kb/139444
http://blog.sqlauthority.com/2007/03/01/sql-server-delete-duplicate-records-rows/
http://www.sql-server-performance.com/2003/delete-duplicates/

Related

How to delete rows where more than 1 column matches another table?

I have two tables. One (let's call it table1) looks a bit like this:
account_number | offer_code
---------------|-----------
1 | 123
1 | 456
2 | 123
The other table (let's call it table2) looks a bit like this:
account_number | offer_code
---------------|-----------
1 | 123
I want to delete all rows from table1 where the account_number AND the offer_code match a row in table2. So afterwards table1 would look like this:
account_number | offer_code
---------------|-----------
1 | 456
2 | 123
I've tried the following, but it doesn't run:
DELETE
FROM TABLE1 A
INNER JOIN
TABLE2 B
ON A.ACCOUNT_NUMBER = B.ACCOUNT_NUMBER
AND A.OFFER_CODE = B.OFFER_CODE
;
I've also tried the following. It seems to run, but the sheer volume of data in both tables (65.5m rows in table1 and 9m in table2) means it takes an impractically long time (I was forced to kill the query after 3 hours).
DELETE
FROM TABLE1
WHERE CONCAT(ACCOUNT_NUMBER, OFFER_CODE) IN
(
SELECT CONCAT(ACCOUNT_NUMBER, OFFER_CODE)
FROM TABLE2
)
;
Does anyone know if there is a way to accomplish this efficiently please?

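Large UPDATE and DELETE operations are expensive for a database. Depending on your application (check this carefully!), you can instead build a new table that contains only the rows you want to keep and then swap it in:
create table table1_tmp as
select * from table1
minus
select * from table2;
alter table table1 rename to table1_tmp2;
alter table table1_tmp rename to table1;
Be aware that minus also collapses duplicate rows within table1, and that the new table does not inherit the indexes or grants of the original.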
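
A DELETE with a correlated EXISTS is another common way to express a match on two columns. It avoids the CONCAT comparison from the question, which can match the wrong rows (account 1 with offer 23 concatenates to the same string as account 12 with offer 3) and typically prevents the use of indexes. A minimal sketch using the table and column names from the question; whether it is fast enough at this volume depends on the database, and it assumes an index on table2 (account_number, offer_code) is available:

```sql
-- Delete every table1 row that has a matching (account_number, offer_code) pair in table2.
DELETE FROM table1
WHERE EXISTS (
    SELECT 1
    FROM table2 b
    WHERE b.account_number = table1.account_number
      AND b.offer_code     = table1.offer_code
);
```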

SQL Select Where Opposite Match Does Not Exist

I am trying to compare two columns and find rows for which no record exists with the two values reversed. In other words, I am looking for instances where 1->3 exists but 3->1 does not. Even if 1->2 and 2->1 both exist, 1 should still be part of the results.
Table = Betweens
start_id | end_id
1 | 2
2 | 1
1 | 3
1 would be added because it is the start of the pair (1, 3), whose opposite (3, 1) is not present. It only qualifies on the third row, since 1->2 does have an opposite.
So, eventually the query should return only the ids for which the reversal does not exist.
I then want to join another table that maps each id from the previous step to its name.
Table = Names
id | name
1 | Mars
2 | Earth
3 | Jupiter
So results will just be the names of those that don't have an opposite.

You can use a not exists condition:
select t1.start_id, t1.end_id
from the_table t1
where not exists (select *
                  from the_table t2
                  where t2.end_id = t1.start_id
                    and t2.start_id = t1.end_id);
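
To return names instead of ids, the same NOT EXISTS condition can be combined with a join. This sketch uses the Betweens and Names tables from the question and assumes that only the start_id of an unmatched pair should be reported:

```sql
-- Names of the start ids whose (start_id, end_id) pair has no reversed counterpart.
SELECT DISTINCT n.name
FROM Betweens b
JOIN Names n
  ON n.id = b.start_id
WHERE NOT EXISTS (
    SELECT 1
    FROM Betweens r
    WHERE r.start_id = b.end_id
      AND r.end_id   = b.start_id
);
```

With the sample rows from the question, only the pair (1, 3) lacks a reversal, so the query returns Mars.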

I'm not sure about your data volume, but the query below should give you the desired result in SQL Server.
create table TableBetweens
(start_id INT,
end_id INT
)
INSERT INTO TableBetweens VALUES(1,2)
INSERT INTO TableBetweens VALUES(2,1)
INSERT INTO TableBetweens VALUES(1,3)
create table TableNames
(id INT,
NAME VARCHAR(50)
)
INSERT INTO TableNames VALUES(1,'Mars')
INSERT INTO TableNames VALUES(2,'Earth')
INSERT INTO TableNames VALUES(3,'Jupiter')
SELECT *
FROM TableNames c
WHERE c.id IN (
    SELECT nameid1.nameid
    FROM (SELECT a.start_id, a.end_id
          FROM TableBetweens a
          LEFT JOIN TableBetweens b
            -- compare column by column; concatenated strings would confuse
            -- pairs such as (1, 23) and (12, 3)
            ON a.start_id = b.end_id
           AND a.end_id = b.start_id
          WHERE b.end_id IS NULL
            AND b.start_id IS NULL) filterData
    UNPIVOT
    (
        nameid
        FOR id IN (start_id, end_id)
    ) AS nameid1
)

To update data in a column based up on data from 2 other tables

I need to fill a column of a table based on data from two other tables.
Example data:
Table: ccdocs
ID  index  reference  Location_id
1   001    ABCD
2   001A   EFGH
3   002    NULL
4   003    NULL
Table: cclvig
index  reference  Location
001    ABCD       VMC
001A   EFGH       VMC_TOP
002    NULL       ICF
003    NULL       VMC
Table: doc_location
loc_id  Lctn
1       VMC
2       VMC_TOP
3       ICF
All records in ccdocs were copied from cclvig by a query. Now I have to fill in Location_id in ccdocs based on the value of the Location column in cclvig; the doc_location table holds the location ids. I tried an UPDATE with a SELECT subquery, but the subquery returns multiple values. Please help.

UPDATE ccdocs d
SET location_id = loc.loc_id
FROM doc_location loc
JOIN cclvig c ON c.location = loc.lctn
WHERE d.index = c.index;
In an UPDATE you can specify a query that supplies the new values. This query is almost identical to a regular SELECT statement (with restrictions on the allowed clauses), but instead of the SELECT column list you write UPDATE table SET column = expression.
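
The same update can also be written with a correlated subquery, a form that most SQL dialects accept. This is a sketch using the table and column names from the question. It assumes each index value occurs only once in cclvig and each Lctn only once in doc_location; otherwise the subquery returns more than one row and the statement fails. Note that index is a reserved word in some databases and may need to be quoted there.

```sql
-- For every ccdocs row, look up the location id via cclvig and doc_location.
-- Rows without a match are set to NULL.
UPDATE ccdocs
SET location_id = (
    SELECT loc.loc_id
    FROM cclvig c
    JOIN doc_location loc
      ON loc.lctn = c.location
    WHERE c.index = ccdocs.index
);
```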

Try the query below:
update ccdocs set Location_id = loc.loc_id
from (
    select a.index, b.loc_id
    from cclvig a
    inner join doc_location b on a.Location = b.Lctn
) loc
where ccdocs.index = loc.index

Sql to update login id's dynamically based on count

I have 2 tables. One is the main table and the other is the login table. I may have 10 records in the main table and 6 records in the login table. The login ids have to be assigned evenly across the rows of the main table. Can anyone please give me the best way to update the login information?
Example
Create table ##t1
(id int identity,
name varchar(5),
loginid varchar(10),
divno char(3))
create table ##l1
(
id int identity,
name varchar(10),
divno char(3))
insert into ##t1 values ('Jin',null,'001')
insert into ##t1 values ('Anu',null,'001')
insert into ##t1 values ('kir',null,'002')
insert into ##t1 values ('Asi',null,'003')
insert into ##t1 values ('Nil',null,'002')
insert into ##t1 values ('sup',null,'003')
insert into ##t1 values ('amu',null,'003')
insert into ##t1 values ('mani',null,'003')
insert into ##l1 values ('A','001')
insert into ##l1 values ('B','001')
insert into ##l1 values ('C','002')
insert into ##l1 values ('D','002')
insert into ##l1 values ('E','002')
insert into ##l1 values ('F','003')
Data Example
Main table
id  name  loginid  divno
--  ----  -------  -----
1   Jin   NULL     001
2   Anu   NULL     001
3   kir   NULL     002
4   Asi   NULL     003
5   Nil   NULL     002
6   sup   NULL     003
7   amu   NULL     003
8   mani  NULL     003
Login Table
id  name  divno
--  ----  -----
1   A     001
2   B     001
3   C     002
4   D     002
5   E     002
6   F     003
desired output
How can we do this without looping?

update ##t1
set loginid = ##l1.name
from
    ##t1
inner join
    -- rn cycles through 1..(number of logins) in id order
    (select *, (ROW_NUMBER() Over (order by id) -1)% (select COUNT(*) from ##l1)+1 as rn from ##t1) v
    on ##t1.id = v.id
inner join
    ##l1
    on v.rn = ##l1.id

Let me do this as a select query rather than as an update. Here MainTable and login stand for the two tables in the question.
select mt.id, mt.name, l.name as login
from (select mt.*,
             (row_number() over (order by id) % l.loginCount) + 1 as loginSeqnum
      from MainTable mt cross join
           (select count(*) as loginCount from login) l
     ) mt join
     (select l.*, row_number() over (order by id) as seqnum
      from login l
     ) l
     on mt.loginSeqnum = l.seqnum
What this does is add a sequence number to the logins (just in case the login ids are not 1..n). It then calculates a matching value for each record in the first table.
One nice thing about this method is that you can modify it to get more random orderings by changing the "order by" clause in the row_number() calls. For instance, using "order by newid()" will randomize the assignment rather than doing it in a round-robin fashion.
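
The round-robin assignment described above can also be turned into an UPDATE. The following is a sketch against the ##t1 and ##l1 tables from the question (SQL Server 2005 or later); it numbers the logins itself, so it does not depend on the login ids being contiguous, and it ignores divno, which is an assumption about the desired result:

```sql
-- If this is not the first statement in the batch, end the previous statement with a semicolon.
WITH main_numbered AS (
    -- seqnum cycles through 1..(number of logins) in id order
    SELECT id,
           (ROW_NUMBER() OVER (ORDER BY id) - 1) % (SELECT COUNT(*) FROM ##l1) + 1 AS seqnum
    FROM ##t1
),
login_numbered AS (
    -- give the logins a gap-free sequence number
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY id) AS seqnum
    FROM ##l1
)
UPDATE t
SET loginid = l.name
FROM ##t1 AS t
JOIN main_numbered AS m
  ON m.id = t.id
JOIN login_numbered AS l
  ON l.seqnum = m.seqnum;
```

With the 8 main rows and 6 logins from the question, rows 1 to 6 receive A to F, and rows 7 and 8 start over with A and B.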

Adding Row Numbers To a SELECT Query Result in SQL Server Without use Row_Number() function

I need to add row numbers to a SELECT query without using the Row_Number() function,
and without using user-defined functions or stored procedures.
Select (obtain the row number) as [Row], field1, field2, fieldn from aTable
UPDATE
I am using the SAP B1 DIAPI to run the query, and this system does not allow the row_number() function in the SELECT statement.
Bye.

I'm not sure if this will work for your particular situation or not, but can you execute this query with a stored procedure? If so, you can:
A) Create a temp table with all your normal result columns, plus a Row column as an auto-incremented identity.
B) Select-Insert your original query, sans the row column (SQL will fill this in automatically for you)
C) Select * on the temp table for your result set.
Not the most elegant solution, but will accomplish the row numbering you are wanting.
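
Steps A to C could look roughly like the following in SQL Server. This is only a sketch: the temp table name #numbered and the column types are assumptions, and aTable, field1 and field2 are the placeholder names from the question.

```sql
-- A) Temp table with the result columns plus an auto-incremented Row column.
CREATE TABLE #numbered (
    [Row]  INT IDENTITY(1, 1),
    field1 INT,           -- assumed type
    field2 VARCHAR(50)    -- assumed type
);

-- B) Insert the original query's result; the ORDER BY decides the numbering order.
INSERT INTO #numbered (field1, field2)
SELECT field1, field2
FROM aTable
ORDER BY field1;

-- C) Read the numbered result back, then clean up.
SELECT [Row], field1, field2
FROM #numbered
ORDER BY [Row];

DROP TABLE #numbered;
```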

This query will give you the row number:
SELECT
    (SELECT COUNT(*) FROM @table t2 WHERE t2.field <= t1.field) AS row_number,
    field,
    otherField
FROM @table t1
but there are some restrictions. The table must have one column (field in the example) that is unique and numeric, so that it can be used as the reference for counting. For example:
DECLARE @table TABLE
(
    field INT,
    otherField VARCHAR(10)
)
INSERT INTO @table(field,otherField) VALUES (1,'a')
INSERT INTO @table(field,otherField) VALUES (4,'b')
INSERT INTO @table(field,otherField) VALUES (6,'c')
INSERT INTO @table(field,otherField) VALUES (7,'d')
SELECT * FROM @table
returns
field | otherField
------------------
1 | a
4 | b
6 | c
7 | d
and
SELECT
    (SELECT COUNT(*) FROM @table t2 WHERE t2.field <= t1.field) AS row_number,
    field,
    otherField
FROM @table t1
returns
row_number | field | otherField
-------------------------------
1 | 1 | a
2 | 4 | b
3 | 6 | c
4 | 7 | d
This solution uses no functions or stored procedures, but as I said, it has restrictions. Maybe it is enough for you anyway.

RRUZ, you might be able to hide the use of a function by wrapping your query in a View. It would be transparent to the caller. I don't see any other options, besides the ones already mentioned.
