How to eliminate repeated rows from table with composite primary key ORACLE - sql

I need to find the repeated rows in a table where the first three columns make up the primary key. Once the repeated rows are identified, they need to be removed from the query results, as this example shows:
Given this table, where the first 3 columns act as the primary key:
1 2 3 4 5 6
1 2 3 9 8 9
1 4 3 9 8 9
3 4 2 2 2 1
2 3 4 1 1 3
2 3 4 9 9 0
Since 1 2 3 is the composite primary key value shared by the first 2 rows, they should be considered repeated and therefore eliminated from the results, just like the two 2 3 4 rows.
The only rows in the result set should be:
1 4 3 9 8 9 and 3 4 2 2 2 1
Could you please help?
Thanks a lot in advance..

Your question doesn't fully make sense. A composite primary key would prevent duplicates in the table. So, if the data contains duplicates, these columns are not declared as a composite primary key.
If you just want one row for each group, you can use row_number() for this, something like this (your_table, col1, col2 and col3 are placeholders for the real names):
select t.*
from (select t.*, row_number() over (partition by col1, col2, col3 order by col1) as seqnum
from your_table t
) t
where seqnum = 1;
If you want to delete extra rows, you can try (again with placeholder table and column names):
delete from your_table
where rowid not in (select min(rowid) from your_table group by col1, col2, col3);
EDIT:
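If you want to remove the cases where rows are repeated, keeping only the key values that occur once, then you want count(*) instead of row_number() (your_table, col1, col2 and col3 are placeholders):
select t.*
from (select t.*, count(*) over (partition by col1, col2, col3) as cnt
from your_table t
) t
where cnt = 1;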
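To check the count-based approach against the sample data from the question, here is a minimal sketch. It is an assumption-laden stand-in: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than Oracle, and the table name t and column names c1..c6 are hypothetical.

```python
import sqlite3

# Hypothetical table/column names (t, c1..c6); SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int)")
rows = [
    (1, 2, 3, 4, 5, 6),
    (1, 2, 3, 9, 8, 9),
    (1, 4, 3, 9, 8, 9),
    (3, 4, 2, 2, 2, 1),
    (2, 3, 4, 1, 1, 3),
    (2, 3, 4, 9, 9, 0),
]
conn.executemany("insert into t values (?, ?, ?, ?, ?, ?)", rows)

# Count how many rows share each (c1, c2, c3) and keep only the singletons.
query = """
select c1, c2, c3, c4, c5, c6
from (select t.*, count(*) over (partition by c1, c2, c3) as cnt
      from t
     ) t
where cnt = 1
order by c1, c2, c3
"""
unique_rows = conn.execute(query).fetchall()
print(unique_rows)  # [(1, 4, 3, 9, 8, 9), (3, 4, 2, 2, 2, 1)]
```

Only the 1 4 3 and 3 4 2 rows survive, which matches the result set the question asks for.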

Related

How to select from SQL table so even and odd rows would be in separate columns?

There is a table ID in my PostgreSQL database.
select * from ID;
ID
1
2
3
4
5
6
7
8
I need to write a query that will give me this output:
id id
1 2
3 4
5 6
7 8
There are no such things as even and odd rows. SQL tables represent unordered sets (well technically multi-sets).
Assuming id is the ordering, you can split them using aggregation like this:
select min(id), max(id)
from (select t.*, row_number() over (order by id) - 1 as seqnum
from t
) t
group by floor(seqnum / 2)
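A quick way to try the pairing idea is the sketch below. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) in place of PostgreSQL, and a table named t as in the query above. The floor() call is dropped because dividing two integers in SQLite already truncates, and floor() is not available in every SQLite build.

```python
import sqlite3

# SQLite stands in for PostgreSQL; table name t follows the answer's query.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer)")
conn.executemany("insert into t values (?)", [(i,) for i in range(1, 9)])

# seqnum runs 0, 1, 2, ... in id order; integer division by 2 puts
# consecutive rows into the same bucket, and min/max split each bucket.
query = """
select min(id), max(id)
from (select t.*, row_number() over (order by id) - 1 as seqnum
      from t
     ) t
group by seqnum / 2
order by 1
"""
pairs = conn.execute(query).fetchall()
print(pairs)  # [(1, 2), (3, 4), (5, 6), (7, 8)]
```

With an odd number of rows, the last bucket holds a single row, so min(id) and max(id) return the same value for it.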

Querying duplicates table into related sets

We have a process that creates a table of duplicate records based on some arbitrary rules (details not relevant).
Every record gets checked against all other records and if a suspected duplicate is found both it and the duplicate are stored in a dupes table to be manually reviewed.
This results in a table something like this:
dupId, originalId, duplicateId
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
7 5 6
8 5 7
9 6 7
10 8 9
You can see here record #1 has 3 other records it is similar to (#2,#3 and #4) and they are each similar to each other.
Record #5 has 2 duplicates (#6 and #7) and record #8 has only 1 (#9).
I want to query the duplicates into sets, so my results would look something like this:
setId recordId
1 1
1 2
1 3
1 4
2 5
2 6
2 7
3 8
3 9
But I am too old/slow/tired/rubbish and a bit out of my depth here.
Currently, when checking for duplicates if the record pairing is already in the table we don't insert it twice (i.e. you don't see both sides of the duplicate pairing) but can easily do so if it makes the querying simpler.
Any advice much appreciated!
Duplicates seem to be transitive, so you have all pairs. That is, the "original" id has the information you need.
But the original id itself is not listed among its own duplicates, and you want it in the result. So:
select dense_rank() over (order by originalid) as setid, duplicateid
from ((select originalid, duplicateid
from t
where not exists (select 1 from t t2 where t.originalid = t2.duplicateid)
) union all
(select distinct originalid, originalid
from t
where not exists (select 1 from t t2 where t.originalid = t2.duplicateid)
)
) i
order by setid;
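The query can be tried on the sample pairs with the sketch below. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) and a hypothetical table name dupes. The parentheses around the two union branches are removed because SQLite does not accept them, and the second column is aliased as recordid to match the requested output. Like the answer, it relies on every pair within a set being present in the table.

```python
import sqlite3

# Hypothetical table name "dupes"; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("create table dupes (dupid int, originalid int, duplicateid int)")
pairs = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (5, 6), (5, 7), (6, 7), (8, 9)]
conn.executemany(
    "insert into dupes values (?, ?, ?)",
    [(n, a, b) for n, (a, b) in enumerate(pairs, start=1)],
)

# A "root" is an originalid that never appears as somebody's duplicate.
# Each root contributes its duplicates plus one row for itself.
query = """
select dense_rank() over (order by originalid) as setid, duplicateid as recordid
from (select originalid, duplicateid
      from dupes t
      where not exists (select 1 from dupes t2 where t.originalid = t2.duplicateid)
      union all
      select distinct originalid, originalid
      from dupes t
      where not exists (select 1 from dupes t2 where t.originalid = t2.duplicateid)
     ) i
order by setid, recordid
"""
sets = conn.execute(query).fetchall()
print(sets)
```

For the sample data this gives sets 1 (records 1 to 4), 2 (records 5 to 7) and 3 (records 8 and 9).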

SQL - Order by amount of occurrences

It's my first question here, so I hope I can explain it well enough.
I want to order my data by amount of occurrences in the table.
My table is like this:
id Daynr
1 2
1 4
2 4
2 5
2 6
3 1
4 2
4 5
And I want to sort it like this:
id Daynr
3 1
1 2
1 4
4 2
4 5
2 4
2 5
2 6
Player #3 has one day in the table, and Player #1 has 2.
My table is named "dayid"
Both id and Daynr are foreign keys, together making it a primary key
I hope this explains my problem well enough. Please ask if you need more information; it's my first time here.
Thanks in advance
You can do this by counting the number of times that things occur for each id. Most databases support window functions, so you can do this as:
select id, daynr
from (select t.*, count(*) over (partition by id) as cnt
from dayid t
) t
order by cnt, id;
You can also express this as a join:
select t.id, t.daynr
from dayid as t inner join
(select id, count(*) as cnt
from dayid
group by id
) as tg
on t.id = tg.id
order by tg.cnt, t.id;
Note that both of these include the id in the order by. That way, if two ids have the same count, all rows for the id will appear together.
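Here is a small sketch of the window-function version run against the sample data. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in database. Adding daynr as a final sort key is my addition, to make the order within each id deterministic.

```python
import sqlite3

# Table name dayid comes from the question; SQLite is only a stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("create table dayid (id int, daynr int, primary key (id, daynr))")
conn.executemany(
    "insert into dayid values (?, ?)",
    [(1, 2), (1, 4), (2, 4), (2, 5), (2, 6), (3, 1), (4, 2), (4, 5)],
)

# cnt is the number of rows per id; sorting by (cnt, id) keeps the rows of
# one id together even when several ids share the same count.
query = """
select id, daynr
from (select t.*, count(*) over (partition by id) as cnt
      from dayid t
     ) t
order by cnt, id, daynr
"""
ordered = conn.execute(query).fetchall()
print(ordered)
# [(3, 1), (1, 2), (1, 4), (4, 2), (4, 5), (2, 4), (2, 5), (2, 6)]
```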

PSQL get duplicate row

I have a table like this:
id object_id product_id
1 1 1
2 1 1
4 2 2
6 3 2
7 3 2
8 1 2
9 1 1
I want to delete all rows except these:
1 1 1
4 2 2
6 3 2
9 1 2
Basically there are duplicates and I want to remove them but keep one copy intact.
What would be the most efficient way to do this?
If this is a one-off then you can simply identify the records you want to keep like so:
SELECT MIN(id) AS id
FROM yourtable
GROUP BY object_id, product_id;
You want to check that this works before you do the next thing and actually throw records out. To actually delete those duplicate records you do:
DELETE FROM yourtable WHERE id NOT IN (
SELECT MIN(id) AS id
FROM yourtable
GROUP BY object_id, product_id
);
The MIN(id) obviously always returns the record with the lowest id for a set of (object_id, product_id). Change as desired.
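The two steps (preview first, then delete) can be rehearsed on a throwaway copy of the data. The sketch below assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL and reuses the table name yourtable from the statements above.

```python
import sqlite3

# In-memory copy of the sample data; SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourtable (id integer primary key, object_id int, product_id int)"
)
conn.executemany(
    "insert into yourtable values (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (4, 2, 2), (6, 3, 2), (7, 3, 2), (8, 1, 2), (9, 1, 1)],
)

# Step 1: preview which ids survive (lowest id per object_id/product_id pair).
keep_sql = "select min(id) as id from yourtable group by object_id, product_id"
ids_to_keep = sorted(row[0] for row in conn.execute(keep_sql))
print(ids_to_keep)  # [1, 4, 6, 8]

# Step 2: delete every row whose id is not in that list.
conn.execute(f"delete from yourtable where id not in ({keep_sql})")
remaining = conn.execute(
    "select id, object_id, product_id from yourtable order by id"
).fetchall()
print(remaining)  # [(1, 1, 1), (4, 2, 2), (6, 3, 2), (8, 1, 2)]
```

Swapping min(id) for max(id) in keep_sql would keep the newest copy of each pair instead.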

Update grouped records in Oracle with an incremental value

So, a dilemma: I have an Oracle table called T_GROUP. The records in the table have a unique id (ID) and they are part of a Study, identified by STUDY_ID, so multiple groups can be in the same Study.
CREATE TABLE T_GROUP
(
"ID" NUMBER(10,0),
"GROUP_NAME" VARCHAR2(255 CHAR),
"STUDY_ID" NUMBER(10,0)
)
The existing table has hundreds of records and I now add a new column called GROUP_INDEX:
ALTER TABLE T_GROUP ADD (
GROUP_INDEX NUMBER(10,0) DEFAULT(0)
);
After adding the column I need to run a script to update the GROUP_INDEX field as such: it should start at 1 and increment by 1 for each group within a study, starting with the lowest ID.
So now I have data as follows:
ID GROUP_NAME STUDY_ID GROUP_INDEX
-------------------------------------------
1 Group 1 3 0
2 Group 2 3 0
3 My Group 5 0
4 Big Group 5 0
5 Group X 5 0
6 Group Z 6 0
7 Best Group 6 0
After the update the group_index field should be as follows:
ID GROUP_NAME STUDY_ID GROUP_INDEX
-------------------------------------------
1 Group 1 3 1
2 Group 2 3 2
3 My Group 5 1
4 Big Group 5 2
5 Group X 5 3
6 Group Z 6 1
7 Best Group 6 2
The update will be run from sqlplus via a batch file. I've played around with group by and sub queries but I'm not having much luck, and having never used sqlplus I'm not sure if I can use variables, cursors etc. All tips greatly appreciated!
You should be able to use the analytic function row_number for this:
UPDATE t_group t1
SET group_index = (SELECT rnk
FROM (SELECT id,
row_number() over (partition by study_id
order by id) rnk
FROM t_group) t2
WHERE t2.id = t1.id)
Here is a version using the MERGE statement. Might be faster than the sub-select (but doesn't have to be).
merge into t_group
using
(
select id,
row_number() over (partition by study_id order by id) rnk
from t_group
) t on (t.id = t_group.id)
when matched then update
set group_index = t.rnk;
This assumes that id is the primary key (or at least unique)
I can't test it right now, so there might be some syntax error in it.
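For a dry run of the correlated-subquery version on the sample rows, here is a minimal sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) instead of Oracle and sqlplus, so the column types are simplified. MERGE is not available in SQLite, so only the UPDATE form is shown.

```python
import sqlite3

# SQLite stand-in for the Oracle table; types simplified for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t_group (id integer primary key, group_name text,"
    " study_id int, group_index int default 0)"
)
conn.executemany(
    "insert into t_group (id, group_name, study_id) values (?, ?, ?)",
    [
        (1, "Group 1", 3), (2, "Group 2", 3),
        (3, "My Group", 5), (4, "Big Group", 5), (5, "Group X", 5),
        (6, "Group Z", 6), (7, "Best Group", 6),
    ],
)

# row_number() restarts at 1 for every study_id and follows id order;
# the outer update picks the number computed for its own id.
conn.execute("""
update t_group
set group_index = (select rnk
                   from (select id,
                                row_number() over (partition by study_id
                                                   order by id) as rnk
                         from t_group) t2
                   where t2.id = t_group.id)
""")
result = conn.execute(
    "select id, study_id, group_index from t_group order by id"
).fetchall()
print(result)
```

The printed group_index values are 1, 2 for study 3, then 1, 2, 3 for study 5, then 1, 2 for study 6, as in the expected table from the question.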