SQL - Deleting duplicate rows without leaving original

How can I delete duplicate rows, matched on selected columns, without keeping any of the originals? In this example, rows are duplicates when they share the same Name and Animal.
ID Name Animal Fruit
1 Bob Dog Orange
2 Adam Dog Orange
3 Bob Dog Apple
4 Adam Cat Orange
5 Bob Cat Apple
6 Bob Hamster Apple
7 Adam Cat Apple
So the expected result would be:
ID Name Animal Fruit
2 Adam Dog Orange
5 Bob Cat Apple
6 Bob Hamster Apple

You can use a delete joined to a subquery that groups by name and animal and keeps only the groups having count > 1:
delete m
from my_table m
inner join (
select name, animal
from my_table
group by name, animal
having count(*) > 1
) t on t.name = m.name
and t.animal = m.animal

Try this:
first, select the duplicated (Name, Animal) pairs in a subquery.
then, delete every row that joins to one of them.
delete T
from mytable T
inner join
    (select count(*) cnt, Name, Animal
    from mytable
    group by Name, Animal) X
on T.Name = X.Name
and T.Animal = X.Animal
where X.cnt > 1

I would do this using exists:
delete from t
where exists (select 1
from t t2
where t2.name = t.name and t2.animal = t.animal and t2.id <> t.id
);
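
To try the exists version against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The table name pets and the choice of SQLite are assumptions made for illustration only; the question does not name a database, and the delete ... join forms above are not accepted by SQLite, which is why the sketch sticks to exists.

```python
import sqlite3

# Sample rows from the question. The table name `pets` is an assumption
# made for this sketch; the original answers use my_table, mytable and t.
ROWS = [
    (1, "Bob", "Dog", "Orange"),
    (2, "Adam", "Dog", "Orange"),
    (3, "Bob", "Dog", "Apple"),
    (4, "Adam", "Cat", "Orange"),
    (5, "Bob", "Cat", "Apple"),
    (6, "Bob", "Hamster", "Apple"),
    (7, "Adam", "Cat", "Apple"),
]


def delete_all_duplicates(conn):
    """Delete every row whose (name, animal) pair occurs on another row too."""
    conn.execute(
        """
        DELETE FROM pets
        WHERE EXISTS (
            SELECT 1
            FROM pets p2
            WHERE p2.name = pets.name
              AND p2.animal = pets.animal
              AND p2.id <> pets.id
        )
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT, animal TEXT, fruit TEXT)"
)
conn.executemany("INSERT INTO pets VALUES (?, ?, ?, ?)", ROWS)
delete_all_duplicates(conn)

remaining = conn.execute(
    "SELECT id, name, animal, fruit FROM pets ORDER BY id"
).fetchall()
print(remaining)
```

With the sample rows, only IDs 2, 5 and 6 should be left, matching the expected result in the question.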

Related

Select from multiple tables where 1 record has more than 1 match in table 2

I have 2 tables: for example, one with a person ID, name and food ID for an order, and a second with the food ID and food name. I want to join these and return the ID, Name, Food ID and Food name, but only for IDs where the count of matching food names is > 1, like below. Unfortunately, when I try to do this I either get NULL instances from ID or it pulls the Food IDs I'm trying to exclude.
Person
ID Name Food_ID
1 Joe 3
2 Jill 2
3 Jack 1
1 Joe 1
2 Jill 3
3 Jack 3
1 Joe 4
2 Jill 4
3 Jack 4
Food
Food_ID Food
1 Meat - Fish
2 Veg - Potato
3 Meat - Chicken
4 Veg - Broccoli
Desired result:
ID Name Food_ID Food
1 Joe 3 Meat - Chicken
1 Joe 1 Meat - Fish
3 Jill 1 Meat - Fish
3 Jill 3 Meat - Chicken
I can do it using a temp table to get the count of IDs where food like '%Meat%' and count(p.ID) > 1, but I need it to run as a single select query and I've no idea how to approach it, as including a where exists just returns me NULL IDs. Apologies for how bad my SQL is; I haven't used it in years and am used to doing all my aggregation in Excel, so I have little idea how I'm meant to approach it. It's probably a really simple solution.
SELECT p.ID, p.Name, f.Food_ID, f.Name
FROM Person p
LEFT JOIN Food f ON p.Food_ID = f.Food_ID
WHERE EXISTS (
SELECT COUNT(p.ID), COUNT(f.Food_ID)
FROM Person p
LEFT JOIN Food f ON p.Food_ID = f.Food_ID
WHERE f.Food LIKE '%Meat%'
GROUP BY p.ID
HAVING COUNT(p.id) > 1
)
GROUP BY p.ID

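Try this:
SELECT
    *
FROM
    Person
    JOIN Food ON Food.Food_ID = Person.Food_ID
WHERE
    -- keep only the meat rows in the output, as in the desired result
    Food.Food LIKE '%Meat%'
    AND Person.ID IN (
        -- IDs that have more than one meat order
        SELECT
            Person.ID
        FROM
            Person
            JOIN Food ON Food.Food_ID = Person.Food_ID
        WHERE
            Food.Food LIKE '%Meat%'
        GROUP BY
            Person.ID
        HAVING
            COUNT(*) > 1
    )
ORDER BY
    Person.ID
;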

You can use a CTE that finds the qualifying IDs with GROUP BY p.ID HAVING COUNT(*) > 1, then get the other column values you need in your main query by joining back to the CTE on the p.ID column.
WITH cte AS (
SELECT
p.ID
FROM Person p
LEFT JOIN Food f ON p.Food_ID = f.Food_ID
WHERE f.Food LIKE '%Meat%'
GROUP BY p.ID
HAVING COUNT(*) > 1)
SELECT
p.ID,
p.Name,
f.Food_ID,
f.Food
FROM Person p
LEFT JOIN Food f ON p.Food_ID = f.Food_ID
INNER JOIN cte ON p.ID = cte.ID
WHERE f.Food LIKE '%Meat%'
ORDER BY p.ID ASC
Result:
ID Name Food_ID Food
1 Joe 3 Meat - Chicken
1 Joe 1 Meat - Fish
3 Jack 1 Meat - Fish
3 Jack 3 Meat - Chicken
Note: Based on your provided data, I believe Jack should be listed in your result set, not Jill.
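
A quick way to check that note against the sample data is the sketch below, which loads both tables into an in-memory SQLite database through Python's sqlite3 module and filters with an IN subquery. SQLite is an assumption here (the question does not say which database is in use), and the aliases p2 and f2 are made up for the sketch.

```python
import sqlite3

# Sample data from the question.
PERSON = [
    (1, "Joe", 3), (2, "Jill", 2), (3, "Jack", 1),
    (1, "Joe", 1), (2, "Jill", 3), (3, "Jack", 3),
    (1, "Joe", 4), (2, "Jill", 4), (3, "Jack", 4),
]
FOOD = [
    (1, "Meat - Fish"),
    (2, "Veg - Potato"),
    (3, "Meat - Chicken"),
    (4, "Veg - Broccoli"),
]

# Meat rows only, and only for IDs that have more than one meat order.
QUERY = """
SELECT p.ID, p.Name, f.Food_ID, f.Food
FROM Person p
JOIN Food f ON f.Food_ID = p.Food_ID
WHERE f.Food LIKE '%Meat%'
  AND p.ID IN (
      SELECT p2.ID
      FROM Person p2
      JOIN Food f2 ON f2.Food_ID = p2.Food_ID
      WHERE f2.Food LIKE '%Meat%'
      GROUP BY p2.ID
      HAVING COUNT(*) > 1
  )
ORDER BY p.ID, f.Food_ID
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER, Name TEXT, Food_ID INTEGER)")
conn.execute("CREATE TABLE Food (Food_ID INTEGER PRIMARY KEY, Food TEXT)")
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)", PERSON)
conn.executemany("INSERT INTO Food VALUES (?, ?)", FOOD)

meat_rows = conn.execute(QUERY).fetchall()
for row in meat_rows:
    print(row)
```

On this data the rows that come back belong to Joe (ID 1) and Jack (ID 3); Jill has only one meat order.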

Return something based on the number of time it appears in SQL

id cars price
1 bmw
1 corvette
1 mercedes
2 bmw
3 bmw
3 toyota
4 bmw
4 honda
5 lotus
I found this table from another post and just wanted to use it for my question.
Suppose the ids represent owners and they own multiple cars and some owners have the same car.
I want to write a query that, given a number n and an owner (id),
returns the cars that the owner has which appear exactly n times in the table.
For example if
I'm given id 1 and n = 4 then it will return
bmw
if I'm given id 1 and n = 1 then it will return
corvette
mercedes
I figured out that
select cars from table group by cars having count(cars) = 4
Gives me all the cars that appear 4 times in the table but I want to narrow it down to a car that is owned by a certain car owner.
Thanks for helping

Method 3 :
select * from
(
    -- count(*) over(...) gives every row the total number of rows for its car
    select f1.*, count(*) over(partition by f1.car) nb
    from yourtable f1
) f2
where f2.nb=4 and f2.id=1

Method 1 :
with totalcar as
(
    select car, count(*) nb
    from yourtable
    group by car
)
select * from yourtable f1 inner join totalcar f2 on f1.car=f2.car
where f1.id=1 and f2.nb=4

Method 2 :
select * from yourtable f1
inner join lateral
(
select f2.car
from yourtable f2
where f1.car=f2.car
group by f2.car
having count(*)=4
) f3 on 1=1
where f1.id=1
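
The CTE approach (Method 1) can be tried on the sample table with the sketch below, run through Python's sqlite3 module. The table name ownership, the column name car and the helper function are assumptions made for illustration; the question calls the column cars and does not name the table or the database.

```python
import sqlite3

# (owner id, car) pairs from the question; the empty price column is left out.
OWNERSHIP = [
    (1, "bmw"), (1, "corvette"), (1, "mercedes"),
    (2, "bmw"),
    (3, "bmw"), (3, "toyota"),
    (4, "bmw"), (4, "honda"),
    (5, "lotus"),
]


def cars_owned_with_total(conn, owner_id, n):
    """Return the owner's cars that appear exactly n times in the whole table."""
    rows = conn.execute(
        """
        WITH totalcar AS (
            SELECT car, COUNT(*) AS nb
            FROM ownership
            GROUP BY car
        )
        SELECT o.car
        FROM ownership o
        JOIN totalcar t ON t.car = o.car
        WHERE o.id = ? AND t.nb = ?
        ORDER BY o.car
        """,
        (owner_id, n),
    ).fetchall()
    return [car for (car,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ownership (id INTEGER, car TEXT)")
conn.executemany("INSERT INTO ownership VALUES (?, ?)", OWNERSHIP)

print(cars_owned_with_total(conn, 1, 4))
print(cars_owned_with_total(conn, 1, 1))
```

Passing the owner and n as bound parameters keeps the query reusable for any pair of inputs.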

Oracle SQL query joining same table

I have a table like this:
items
id old_new object
1 o pen
2 n house
3 o dog
4 o cat
5 n carrot
I would like the select return:
id new_object old_object
1 null pen
2 house null
3 null dog
4 null cat
5 carrot null
Do I need to use an outer join on the same table?

No join needed:
select id,
case when old_new = 'n' then object end as new_object,
case when old_new = 'o' then object end as old_object
from the_table
order by id;
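
The same query can be checked outside Oracle with a small sketch. This one uses SQLite through Python's sqlite3 module, which is an assumption made purely so the example is self-contained; the case expressions are standard SQL, and the table is named items as in the question.

```python
import sqlite3

# Rows from the question's items table.
ITEMS = [
    (1, "o", "pen"),
    (2, "n", "house"),
    (3, "o", "dog"),
    (4, "o", "cat"),
    (5, "n", "carrot"),
]

# A CASE with no ELSE branch yields NULL when its condition is false,
# which is what splits `object` into two columns without a join.
QUERY = """
SELECT id,
       CASE WHEN old_new = 'n' THEN object END AS new_object,
       CASE WHEN old_new = 'o' THEN object END AS old_object
FROM items
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, old_new TEXT, object TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", ITEMS)

split_rows = conn.execute(QUERY).fetchall()
for row in split_rows:
    print(row)
```

NULL values come back as None in Python.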

How can I avoid a Cartesian product in SQL across multiple tables

Here is my sqlfiddle http://sqlfiddle.com/#!3/671c8/1.
Here are my tables:
Person
PID LNAME FNAME
1 Bob Joe
2 Smith John
3 Johnson Jake
4 Doe Jane
Table1
PID VALUE
1 3
1 5
1 35
2 10
2 15
3 8
Table2
PID VALUE
1 X1
1 X2
1 X3
2 Z1
3 X3
I am trying to join several tables on a person's ID. These tables contain events with dates, but the dates may or may not match across tables. So what I really want is to join the tables regardless of date, in such a way that the table with the most rows for a person determines the number of rows in my result and all the other tables "fit" within it. For example:
Instead of this, which is a Cartesian product:
PID LNAME FNAME THINGONE THINGTWO
1 Bob Joe 3 X1
1 Bob Joe 3 X2
1 Bob Joe 3 X3
1 Bob Joe 5 X1
1 Bob Joe 5 X2
1 Bob Joe 5 X3
1 Bob Joe 35 X1
1 Bob Joe 35 X2
1 Bob Joe 35 X3
I would like something like this:
PID LNAME FNAME THINGONE THINGTWO
1 Bob Joe 3 X1
1 Bob Joe 5 X2
1 Bob Joe 35 X3
My sql statement:
SELECT
p.*,
t1.value as thingone,
t2.value as thingtwo
FROM
person p
left outer join table1 t1 on p.pid=t1.pid
left outer join table2 t2 on p.pid=t2.pid
;

I can't fathom why you want to do this, but...
You need to create an artificial join between table1 and table2, and then link that to the master table. One way of doing that is by ranking the rows in order. eg:
SELECT
p.pid, p.lname,p.fname, thingone, thingtwo
FROM
person p
left outer join
(
select ISNULL(t1.pid, t2.pid) as pid, t1.value as thingone, t2.value as thingtwo
from
(select *, ROW_NUMBER() over (partition by pid order by value) rn
from table1) t1
full outer join
(select *, ROW_NUMBER() over (partition by pid order by value) rn
from table2) t2
on t1.pid=t2.pid and t1.rn=t2.rn
) v
on p.pid = v.pid
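
To make the ranking idea concrete outside the database, here is a rough sketch in plain Python. It is not the SQL above: it number-pairs the two value lists per person by position with itertools.zip_longest, which plays the role of ROW_NUMBER() plus the full outer join. The function names are made up for this illustration.

```python
from collections import defaultdict
from itertools import zip_longest

# Sample data from the question.
PERSON = [(1, "Bob", "Joe"), (2, "Smith", "John"), (3, "Johnson", "Jake"), (4, "Doe", "Jane")]
TABLE1 = [(1, 3), (1, 5), (1, 35), (2, 10), (2, 15), (3, 8)]
TABLE2 = [(1, "X1"), (1, "X2"), (1, "X3"), (2, "Z1"), (3, "X3")]


def group_sorted(rows):
    """Collect the values per pid, sorted, like ROW_NUMBER() ... ORDER BY value."""
    grouped = defaultdict(list)
    for pid, value in rows:
        grouped[pid].append(value)
    return {pid: sorted(values) for pid, values in grouped.items()}


def pair_by_position(person, table1, table2):
    """Pair the n-th value of table1 with the n-th value of table2 for each person."""
    ones = group_sorted(table1)
    twos = group_sorted(table2)
    result = []
    for pid, lname, fname in person:
        # zip_longest pads the shorter list with None, like a full outer join on rn.
        pairs = list(zip_longest(ones.get(pid, []), twos.get(pid, [])))
        if not pairs:
            # Mimic the left outer join: keep people who have no rows at all.
            pairs = [(None, None)]
        for thingone, thingtwo in pairs:
            result.append((pid, lname, fname, thingone, thingtwo))
    return result


paired = pair_by_position(PERSON, TABLE1, TABLE2)
for row in paired:
    print(row)
```

Each person ends up with as many rows as their longest list, and the shorter list is padded with None.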

This is a trickier problem than I thought. The challenge is being sure that all the records appear, regardless of the lengths of the two lists. The following works by enumerating each of the lists and using that for the join conditions:
SELECT p.*,
t1.value as thingone,
t2.value as thingtwo
FROM person p left outer join
(select t1.*,
row_number() over (partition by pid order by pid) as seqnum,
count(*) over (partition by pid) as cnt
from table1 t1
) t1
on p.pid = t1.pid left outer join
(select t2.*, row_number() over (partition by pid order by pid) as seqnum,
count(*) over (partition by pid) as cnt
from table2 t2
) t2
on p.pid = t2.pid
WHERE t1.seqnum = t2.seqnum or
(t2.seqnum > t1.cnt) or
(t1.seqnum > t2.cnt) or
t1.seqnum is null or
t2.seqnum is null;
Here is a slight modification to your SQL Fiddle that has better test data.
EDIT:
The logic in the where clause handles these cases (in the order of the clauses):
Where both lists have sequence numbers, these must match.
Where list2 is longer and list1 has at least one element.
Where list1 is longer and list2 has at least one element.
Where list1 is empty.
Where list2 is empty.
These were arrived at by trial and error, because the original condition did not work:
on p.pid = t2.pid and t1.seqnum = t2.seqnum
This returns NULL values for p.pid for the extra elements in the list. Podliuska's approach may also work; I had just started down this path, and the where conditions do the trick.

Tricky SQL - Select non-adjacent numbers

Given this data on SQL Server 2005:
SectionID Name
1 Dan
2 Dan
4 Dan
5 Dan
2 Tom
7 Tom
9 Tom
10 Tom
How would I select records where the SectionID must be 2 or more away (+/-) from another section for the same name?
The result would be:
1 Dan
4 Dan
2 Tom
7 Tom
9 Tom
Thanks for reading!

SELECT *
FROM mytable a
WHERE NOT EXISTS
(SELECT *
FROM mytable b
WHERE a.Name = b.Name
AND a.SectionID = b.SectionID + 1)
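
Here is a small sketch that runs this NOT EXISTS query on the sample data. It uses SQLite through Python's sqlite3 module rather than SQL Server 2005, which is an assumption made only so the example is self-contained; the explicit column list and the ORDER BY are added to make the output easy to compare.

```python
import sqlite3

# Rows from the question.
SECTIONS = [
    (1, "Dan"), (2, "Dan"), (4, "Dan"), (5, "Dan"),
    (2, "Tom"), (7, "Tom"), (9, "Tom"), (10, "Tom"),
]

# Keep a row only when no row with the same name sits at SectionID - 1,
# i.e. keep the first row of every run of consecutive IDs.
QUERY = """
SELECT a.SectionID, a.Name
FROM mytable a
WHERE NOT EXISTS (
    SELECT 1
    FROM mytable b
    WHERE a.Name = b.Name
      AND a.SectionID = b.SectionID + 1
)
ORDER BY a.Name, a.SectionID
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (SectionID INTEGER, Name TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", SECTIONS)

kept = conn.execute(QUERY).fetchall()
for row in kept:
    print(row)
```

On the sample rows this keeps 1 and 4 for Dan and 2, 7 and 9 for Tom, the same rows as the result in the question.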

Here's a LEFT JOIN variant of Anthony's answer (it removes consecutive IDs from the results):
SELECT a.*
FROM mytable a
LEFT JOIN mytable b ON a.Name = b.Name AND a.SectionID = b.SectionID + 1
WHERE b.SectionID IS NULL
EDIT: Since there is another interpretation of the question (simply getting results where the IDs are more than 1 apart), here is another attempt at an answer:
WITH alternate AS (
SELECT sectionid,
name,
EXISTS(SELECT a.sectionid
FROM mytable b
WHERE a.name = b.name AND
(a.sectionid = b.sectionid-1 or a.sectionid = b.sectionid+1)) as has_neighbour,
row_number() OVER (PARTITION by a.name ORDER BY a.name, a.sectionid) as row_no
FROM mytable a
)
SELECT sectionid, name
FROM alternate
WHERE row_no % 2 = 1 OR NOT(has_neighbour)
ORDER BY name, sectionid;
gives:
sectionid | name
-----------+------
1 | Dan
4 | Dan
2 | Tom
7 | Tom
9 | Tom
Logic: if a record has a neighbour with the same name and id +/- 1, then only every odd row is taken; if it has no such neighbour, the row is returned regardless of whether it is even or odd.
As stated in the comment, the condition is ambiguous: at the start of each new sequence you could begin with either the odd or the even rows, and the criteria would still be satisfied with different results (even with a different number of results).
