SQL - Delete duplicate columns error [duplicate]

I have the following table (TBL_VIDEO) with duplicate column entries in "TIMESTAMP", and I want to remove them only if the "CAMERA" number matches.
BEFORE:
ANALYSIS_ID | TIMESTAMP | EMOTION | CAMERA
-------------------------------------------
1 | 5 | HAPPY | 1
2 | 10 | SAD | 1
3 | 10 | SAD | 1
4 | 5 | HAPPY | 2
5 | 15 | ANGRY | 2
6 | 15 | HAPPY | 2
AFTER:
ANALYSIS_ID | TIMESTAMP | EMOTION | CAMERA
-------------------------------------------
1 | 5 | HAPPY | 1
2 | 10 | SAD | 1
4 | 5 | HAPPY | 2
5 | 15 | ANGRY | 2
I attempted the statement below, but the rows were not deleted as expected. I would appreciate any help producing a correct SQL statement. Thanks in advance!
delete y
from TBL_VIDEO y
where exists (select 1 from TBL_VIDEO y2 where y.TIMESTAMP = y2.TIMESTAMP and y2.CAMERA < y.CAMERA);

CREATE TABLE Table12
([ANALYSIS_ID] int, [TIMESTAMP] int, [EMOTION] varchar(5))
;
INSERT INTO Table12
([ANALYSIS_ID], [TIMESTAMP], [EMOTION])
VALUES
(1, 5, 'HAPPY'),
(2, 10, 'SAD'),
(3, 10, 'SAD'),
(4, 15, 'HAPPY'),
(5, 15, 'ANGRY')
;
with cte as (
    select *, row_number() over (partition by emotion order by [ANALYSIS_ID]) as rn
    from Table12
)
delete from cte
where rn > 1;
select * from Table12;
Output:
ANALYSIS_ID TIMESTAMP EMOTION
1 5 HAPPY
2 10 SAD
5 15 ANGRY

You have two questions:
1. What is wrong with my code?
2. Is there a better way to delete the duplicate column entries?
For the second question, it's a duplicate.
For the first question, please refer to https://learn.microsoft.com/en-us/sql/t-sql/statements/delete-transact-sql?view=sql-server-2017 (or press F1 on delete). The correct syntax is:
delete y
from Table12 y
where exists (

A generic form of the command is below; substitute your own column names, ordering condition, and table name.
DELETE T
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2) AS a, * FROM TABLENAME
) T
WHERE a > 1

delete y
from TBL_VIDEO y
where y.ANALYSIS_ID > (select min(y2.ANALYSIS_ID)
from TBL_VIDEO y2
where y2.TIMESTAMP = y.TIMESTAMP and y2.CAMERA = y.CAMERA);
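To see why the attempt in the question misbehaves, it can help to run its EXISTS predicate as a SELECT first and look at which rows it matches. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the helper name matched_ids and the CORRECTED predicate are my own, not from the question.

```python
import sqlite3

# Sample rows from the question: (ANALYSIS_ID, TIMESTAMP, EMOTION, CAMERA)
ROWS = [
    (1, 5, "HAPPY", 1),
    (2, 10, "SAD", 1),
    (3, 10, "SAD", 1),
    (4, 5, "HAPPY", 2),
    (5, 15, "ANGRY", 2),
    (6, 15, "HAPPY", 2),
]


def matched_ids(predicate):
    """Return the ANALYSIS_IDs a DELETE ... WHERE EXISTS (...) with this predicate would hit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TBL_VIDEO "
        "(ANALYSIS_ID INTEGER, TIMESTAMP INTEGER, EMOTION TEXT, CAMERA INTEGER)"
    )
    conn.executemany("INSERT INTO TBL_VIDEO VALUES (?, ?, ?, ?)", ROWS)
    sql = (
        "SELECT y.ANALYSIS_ID FROM TBL_VIDEO y "
        "WHERE EXISTS (SELECT 1 FROM TBL_VIDEO y2 WHERE " + predicate + ") "
        "ORDER BY y.ANALYSIS_ID"
    )
    ids = [row[0] for row in conn.execute(sql)]
    conn.close()
    return ids


# Predicate from the question: compares CAMERA across rows with the same TIMESTAMP.
ORIGINAL = "y.TIMESTAMP = y2.TIMESTAMP AND y2.CAMERA < y.CAMERA"
# Hypothetical fix: same TIMESTAMP and same CAMERA, keep the lowest ANALYSIS_ID.
CORRECTED = (
    "y.TIMESTAMP = y2.TIMESTAMP AND y.CAMERA = y2.CAMERA "
    "AND y2.ANALYSIS_ID < y.ANALYSIS_ID"
)

print("original predicate matches:", matched_ids(ORIGINAL))
print("corrected predicate matches:", matched_ids(CORRECTED))
```

With the sample data, the original predicate matches only row 4 (same TIMESTAMP as row 1 but a higher CAMERA), which is a row the AFTER table keeps, while the corrected predicate matches rows 3 and 6.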


SQL (Omnisci) get common and uncommon values of a column

I'm using Omnisci to join two tables and I need the following:
Table 1:
poly_id | num_competitors
1 | 1
2 | 1
3 | 5
Table 2:
poly_id | num_stores
1 | 1
5 | 3
7 | 5
What I want:
poly_id | num_competitors | num_stores
1 | 1 | 1
2 | 1 | 0
3 | 5 | 0
5 | 0 | 3
7 | 0 | 5
I know in normal SQL you can do it with FULL JOIN or even with UNION, but Omnisci does not support any of these functions yet (it does support JOIN and LEFT JOIN though).
I've found a way to solve it: create a new empty table, insert the rows of Table 1 and Table 2 into it, and then group by poly_id so that each poly_id ends up in a single row with both num_competitors and num_stores.
CREATE TABLE competitors_stores ( poly_id integer, num_stores integer, num_competitors integer);
INSERT INTO competitors_stores ( SELECT poly_id, 0, num_competitors from competitors_geo);
INSERT INTO competitors_stores ( SELECT poly_id, num_stores, 0 from telepi_stores_geo);
CREATE TABLE num_competitors_stores AS (select poly_id, SUM(num_stores) AS num_stores, SUM(num_competitors) as num_competitors from competitors_stores group by poly_id);
DROP TABLE competitors_stores;
Anyway, I'm still open to hearing alternatives since I feel like this is not the best way to solve it.
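For anyone who wants to check the staging-table idea on the sample data, here is a small sketch. It assumes SQLite (via Python's sqlite3 module) rather than Omnisci, and the table name staging and the function merge_counts are made up for the illustration.

```python
import sqlite3


def merge_counts(competitors, stores):
    """Combine two (poly_id, count) lists into (poly_id, num_competitors, num_stores) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staging "
        "(poly_id INTEGER, num_stores INTEGER, num_competitors INTEGER)"
    )
    # Rows from Table 1 carry a competitor count and 0 stores.
    conn.executemany("INSERT INTO staging VALUES (?, 0, ?)", competitors)
    # Rows from Table 2 carry a store count and 0 competitors.
    conn.executemany("INSERT INTO staging VALUES (?, ?, 0)", stores)
    # Grouping by poly_id collapses the two partial rows into one.
    rows = conn.execute(
        "SELECT poly_id, SUM(num_competitors), SUM(num_stores) "
        "FROM staging GROUP BY poly_id ORDER BY poly_id"
    ).fetchall()
    conn.close()
    return rows


table1 = [(1, 1), (2, 1), (3, 5)]  # (poly_id, num_competitors)
table2 = [(1, 1), (5, 3), (7, 5)]  # (poly_id, num_stores)

for row in merge_counts(table1, table2):
    print(row)
```

The zeros inserted for the missing side are what make SUM behave like the FULL JOIN with a default of 0 described in the question.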

SQL - Deleting duplicate columns only if another column matches [duplicate]

I have the following table (TBL_VIDEO) with duplicate column entries in "TIMESTAMP", and I want to remove them only if the "CAMERA" number matches.
BEFORE:
ANALYSIS_ID | TIMESTAMP | EMOTION | CAMERA
-------------------------------------------
1 | 5 | HAPPY | 1
2 | 10 | SAD | 1
3 | 10 | SAD | 1
4 | 5 | HAPPY | 2
5 | 15 | ANGRY | 2
6 | 15 | HAPPY | 2
AFTER:
ANALYSIS_ID | TIMESTAMP | EMOTION | CAMERA
-------------------------------------------
1 | 5 | HAPPY | 1
2 | 10 | SAD | 1
4 | 5 | HAPPY | 2
5 | 15 | ANGRY | 2
I attempted the statement below, but the rows were not deleted as expected. I would appreciate any help producing a correct SQL statement. Thanks in advance!
delete y
from TBL_VIDEO y
where exists (select 1 from TBL_VIDEO y2 where y.TIMESTAMP = y2.TIMESTAMP and y2.ANALYSIS_ID < y.ANALYSIS_ID, y.CAMERA = y.CAMERA, y2.CAMERA = y2.CAMERA);
Try this:
delete f2 from (
select row_number() over(partition by TIMESTAMP, CAMERA order by ANALYSIS_ID) rang
from yourtable f1
) f2 where f2.rang>1
Another solution:
delete f1 from yourtable f1
where exists
(
select * from yourtable f2
where f2.TIMESTAMP=f1.TIMESTAMP and f2.CAMERA=f1.CAMERA and f1.ANALYSIS_ID>f2.ANALYSIS_ID
)
Use row_number() to find the duplicates and delete them:
delete t1
from (
select *, row_number() over(partition by TIMESTAMP, CAMERA order by ANALYSIS_ID) as rn from TBL_VIDEO
) t1
where rn > 1
;WITH cte
AS
(
select ANALYSIS_ID,
ROW_NUMBER() over(partition by TIMESTAMP, CAMERA order by ANALYSIS_ID) rnk
from TBL_VIDEO
)
DELETE FROM cte WHERE cte.rnk > 1
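The ROW_NUMBER() approach in the answers above can be tried out on the question's data with a rough sketch like the following. It assumes SQLite 3.25 or newer (for window functions) through Python's sqlite3 module, not SQL Server; since SQLite cannot delete through a CTE or derived table, the numbered rows are fed into an IN list instead. The function name dedupe_with_row_number is hypothetical.

```python
import sqlite3

# Sample rows from the question: (ANALYSIS_ID, TIMESTAMP, EMOTION, CAMERA)
ROWS = [
    (1, 5, "HAPPY", 1),
    (2, 10, "SAD", 1),
    (3, 10, "SAD", 1),
    (4, 5, "HAPPY", 2),
    (5, 15, "ANGRY", 2),
    (6, 15, "HAPPY", 2),
]


def dedupe_with_row_number(rows):
    """Keep the lowest ANALYSIS_ID per (TIMESTAMP, CAMERA) and return the remaining rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TBL_VIDEO "
        "(ANALYSIS_ID INTEGER PRIMARY KEY, TIMESTAMP INTEGER, EMOTION TEXT, CAMERA INTEGER)"
    )
    conn.executemany("INSERT INTO TBL_VIDEO VALUES (?, ?, ?, ?)", rows)
    # Number the rows inside each (TIMESTAMP, CAMERA) group, then delete every
    # row whose number is greater than 1.
    conn.execute(
        """
        DELETE FROM TBL_VIDEO
        WHERE ANALYSIS_ID IN (
            SELECT ANALYSIS_ID FROM (
                SELECT ANALYSIS_ID,
                       ROW_NUMBER() OVER (
                           PARTITION BY TIMESTAMP, CAMERA ORDER BY ANALYSIS_ID
                       ) AS rn
                FROM TBL_VIDEO
            )
            WHERE rn > 1
        )
        """
    )
    remaining = conn.execute(
        "SELECT ANALYSIS_ID, TIMESTAMP, EMOTION, CAMERA "
        "FROM TBL_VIDEO ORDER BY ANALYSIS_ID"
    ).fetchall()
    conn.close()
    return remaining


for row in dedupe_with_row_number(ROWS):
    print(row)
```

On the sample data this leaves ANALYSIS_ID 1, 2, 4 and 5, which is the AFTER table from the question.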
You can use a subquery to select the rows you want to keep:
select v.*
from tbl_video v
where analysis_id = (select min(v1.analysis_id)
from tbl_video v1
where v1.timestamp = v.timestamp and
v1.camera = v.camera
);
Alternatively, an analytic function with the top (1) with ties clause is also useful:
select top (1) with ties v.*
from tbl_video v
order by row_number() over (partition by v.timestamp, v.camera order by v.analysis_id);
So the delete version, which removes every row that is not the first one in its (timestamp, camera) group, would be:
delete v
from tbl_video v
where analysis_id > (select min(v1.analysis_id)
from tbl_video v1
where v1.timestamp = v.timestamp and
v1.camera = v.camera
);

How to delete duplicate rows based on one column in postgreSQL?

Say I have columns A, B and Date, and I want all rows which are duplicated in A to be removed, while keeping the one with the most recent Date. How would I do this?
I have looked at many other solutions but none seem to work for my case.
Thanks in advance for any help
This should work for you:
DELETE FROM YourTable USING
(SELECT colA, MAX(Date) maxDate
FROM YourTable
GROUP BY colA
) AS Keep
WHERE Keep.maxDate <> YourTable.Date
AND Keep.ColA = YourTable.ColA
These rows will stay:
t=# with sample(a,b,dat) as (values(1,1,1),(1,1,2),(1,2,3),(2,2,3),(2,2,4))
, comparison as (select *,max(dat) over (partition by a) from sample)
select *
from comparison
where dat = max;
a | b | dat | max
---+---+-----+-----
1 | 2 | 3 | 3
2 | 2 | 4 | 4
(2 rows)
and thus these rows are to be deleted:
t=# with sample(a,b,dat) as (values(1,1,1),(1,1,2),(1,2,3),(2,2,3),(2,2,4))
, comparison as (select *,max(dat) over (partition by a) from sample)
delete
from comparison
where dat <> max
returning *;
a | b | dat | max
---+---+-----+-----
1 | 1 | 1 | 3
1 | 1 | 2 | 3
2 | 2 | 3 | 4
(3 rows)
Of course, instead of comparison you should use the name of your own table.
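The keep-the-latest rule from the answers above can also be checked end to end with a small sketch. This one assumes SQLite through Python's sqlite3 module instead of PostgreSQL, so it uses a correlated subquery in place of DELETE ... USING; the table name sample follows the second answer and the function name keep_latest_per_a is invented for the illustration.

```python
import sqlite3


def keep_latest_per_a(rows):
    """Delete every row whose dat is older than the newest dat for the same a."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (a INTEGER, b INTEGER, dat INTEGER)")
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)
    # For each row, look up the maximum dat within its own group of a.
    conn.execute(
        "DELETE FROM sample "
        "WHERE dat < (SELECT MAX(s2.dat) FROM sample s2 WHERE s2.a = sample.a)"
    )
    remaining = conn.execute("SELECT a, b, dat FROM sample ORDER BY a, dat").fetchall()
    conn.close()
    return remaining


# Same values as the psql session above: (a, b, dat)
data = [(1, 1, 1), (1, 1, 2), (1, 2, 3), (2, 2, 3), (2, 2, 4)]
print(keep_latest_per_a(data))
```

Note that if two rows in the same group share the maximum date, both survive; breaking such ties would need an extra column in the comparison.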

Update existing records based on the order from a different column

I have the following table:
X_ID X_NAME X_TYPE X_SORT_ID
10 BOOK 1 NULL
20 PEN 1 NULL
30 WATCH 2 NULL
5 TENT 3 NULL
What I'm trying to achieve is to populate the X_SORT_ID column with incremented values starting with 1 based on value in X_ID.
So the table would look like this:
X_ID X_NAME X_TYPE X_SORT_ID
10 BOOK 1 2
20 PEN 1 3
30 WATCH 2 4
5 TENT 3 1
I need to update this table only for all existing rows.
The records that will be added in the future will use a sequence that would set the X_SORT_ID field to the next value.
The only query I came up with is not exactly what I need.
UPDATE X_ITEMS
SET X_SORT_ID = (SELECT MAX(X_ID) FROM X_ITEMS) + ROWNUM
WHERE X_SORT_ID IS NULL;
I could use just a rownum, but this would assign value of 4 to the last record with X_ID = 5, which is not what I wanted.
I'd be thankful for any suggestions.
You can use Oracle's ROW_NUMBER() analytic function.
Update query:
update items ot
set X_SORT_ID =
(
select rw from
(
select X_ID, row_number() over ( order by X_ID ) as rw from items
) it
where it.X_ID = ot.X_ID
)
;
Result table:
+------+--------+--------+-----------+
| X_ID | X_NAME | X_TYPE | X_SORT_ID |
+------+--------+--------+-----------+
| 10 | BOOK | 1 | 2 |
| 20 | PEN | 1 | 3 |
| 30 | WATCH | 2 | 4 |
| 5 | TENT | 3 | 1 |
+------+--------+--------+-----------+
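As a quick way to sanity-check the numbering, here is a sketch of the same update pattern run against the question's four rows. It assumes SQLite 3.25 or newer through Python's sqlite3 module rather than Oracle, uses the question's table name X_ITEMS, and the function name assign_sort_ids is hypothetical.

```python
import sqlite3


def assign_sort_ids(items):
    """Fill X_SORT_ID with 1, 2, 3, ... following the order of X_ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE X_ITEMS "
        "(X_ID INTEGER PRIMARY KEY, X_NAME TEXT, X_TYPE INTEGER, X_SORT_ID INTEGER)"
    )
    conn.executemany(
        "INSERT INTO X_ITEMS (X_ID, X_NAME, X_TYPE) VALUES (?, ?, ?)", items
    )
    # Number all rows by X_ID, then copy each row's number back by matching X_ID.
    conn.execute(
        """
        UPDATE X_ITEMS
        SET X_SORT_ID = (
            SELECT rw FROM (
                SELECT X_ID, ROW_NUMBER() OVER (ORDER BY X_ID) AS rw
                FROM X_ITEMS
            ) it
            WHERE it.X_ID = X_ITEMS.X_ID
        )
        WHERE X_SORT_ID IS NULL
        """
    )
    result = conn.execute(
        "SELECT X_ID, X_NAME, X_TYPE, X_SORT_ID FROM X_ITEMS ORDER BY X_SORT_ID"
    ).fetchall()
    conn.close()
    return result


items = [(10, "BOOK", 1), (20, "PEN", 1), (30, "WATCH", 2), (5, "TENT", 3)]
for row in assign_sort_ids(items):
    print(row)
```

The match on X_ID only works cleanly when X_ID is unique, which is why the sketch declares it as the primary key.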
Alternatively, use ROWNUM (a pseudocolumn) instead of ROW_NUMBER() (an analytic function) as used above. Because ROWNUM is assigned before ORDER BY is applied, the rows must be sorted in an inner query first.
X_ID should be defined as primary key.
update Grentley GY
set X_SORT_ID =
(select rno from
(select X_ID, rownum as rno from
(select X_ID from Grentley order by X_ID)
) AB
where AB.X_ID = GY.X_ID
);

Help with optimising SQL query

Hi, I need some help with this problem.
I am working on a web application and I am using SQLite for the database. Can someone help me with one query that must be optimized (that is, made fast)?
I have table x:
ID | ID_DISH | ID_INGREDIENT
1 | 1 | 2
2 | 1 | 3
3 | 1 | 8
4 | 1 | 12
5 | 2 | 13
6 | 2 | 5
7 | 2 | 3
8 | 3 | 5
9 | 3 | 8
10| 3 | 2
....
ID_DISH is the ID of a dish, and ID_INGREDIENT is an ingredient the dish is made of:
so in my case the dish with ID 1 is made with the ingredients with IDs 2, 3, 8 and 12.
In this table I have more than 15000 rows, and my question is:
I need a query that returns the IDs of dishes, ordered ascending by the count of ingredients that I haven't yet added to my algorithm.
Example: foo(2,4)
will return rows in this order:
ID_DISH | count(stillMissing)
10 | 2
1 | 3
The dish with ID 10 has the ingredients with IDs 2 and 4 and is still missing 2 more, then is
My query is:
SELECT
t2.ID_dish,
(SELECT COUNT(*) as c FROM dishIngredient as t1
WHERE t1.ID_ingredient NOT IN (2,4)
AND t1.ID_dish = t2.ID_dish
GROUP BY ID_dish) as c
FROM dishIngredient as t2
WHERE t2.ID_ingredient IN (2,4)
GROUP BY t2.ID_dish
ORDER BY c ASC
It works, but it is slow.
select ID_DISH, sum(ID_INGREDIENT not in (2, 4)) stillMissing
from x
group by ID_DISH
having stillMissing != count(*)
order by stillMissing
This is the solution: my previous query took 5 to 20 s, while this one takes about 80 ms.
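The single-pass query above relies on SQLite evaluating a boolean expression as 0 or 1, so SUM(... NOT IN ...) counts the missing ingredients per dish, and the HAVING clause drops dishes that contain none of the given ingredients. A minimal sketch on the ten sample rows from the question, using Python's sqlite3 module; the function name dishes_by_missing and its have parameter are made up for the illustration.

```python
import sqlite3

# (ID, ID_DISH, ID_INGREDIENT) rows from the sample table x
SAMPLE = [
    (1, 1, 2), (2, 1, 3), (3, 1, 8), (4, 1, 12),
    (5, 2, 13), (6, 2, 5), (7, 2, 3),
    (8, 3, 5), (9, 3, 8), (10, 3, 2),
]


def dishes_by_missing(rows, have):
    """Return (ID_DISH, stillMissing) for dishes using at least one ingredient in have."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE x (ID INTEGER, ID_DISH INTEGER, ID_INGREDIENT INTEGER)")
    conn.executemany("INSERT INTO x VALUES (?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in have)
    sql = (
        "SELECT ID_DISH, SUM(ID_INGREDIENT NOT IN (" + placeholders + ")) AS stillMissing "
        "FROM x "
        "GROUP BY ID_DISH "
        # A dish where every ingredient is missing shares nothing with have.
        "HAVING stillMissing != COUNT(*) "
        "ORDER BY stillMissing"
    )
    result = conn.execute(sql, tuple(have)).fetchall()
    conn.close()
    return result


print(dishes_by_missing(SAMPLE, (2, 4)))
```

On this sample, dish 3 comes first with 2 ingredients still missing, then dish 1 with 3, and dish 2 is filtered out because it uses neither ingredient 2 nor 4. The speedup reported above plausibly comes from scanning the table once instead of running a correlated subquery per dish, though that is my reading rather than something measured here.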
This is from memory, as I don't know the SQL dialect of sqlite.
SELECT DISTINCT T1.ID_DISH, COUNT(T1.ID_INGREDIENT) as COUNT
FROM dishIngredient as T1 LEFT JOIN dishIngredient as T2
ON T1.ID_DISH = T2.ID_DISH
WHERE T2.ID_INGREDIENT IN (2,4)
GROUP BY T1.ID_DISH
ORDER BY T1.ID_DISH