Deleting duplicates on combination of two columns in oracle - sql

I have a table, for example:
Port Table
S No | A Port | B Port
1 80 100
2 90 110
3 100 80
4 94 106
I want to delete record no. 3 because it has the same combination of ports as record no. 1. How can I do this in Oracle?

You can use a single MERGE statement and the ROW_NUMBER analytic function combined with GREATEST and LEAST to find and delete the duplicates:
MERGE INTO table_name dst
USING (
SELECT ROWID rid,
ROW_NUMBER() OVER (
PARTITION BY LEAST(A_Port, B_Port), GREATEST(A_Port, B_Port)
ORDER BY S_no
) AS rn
FROM table_name
) src
ON (dst.ROWID = src.rid AND src.rn > 1)
WHEN MATCHED THEN
UPDATE SET A_port = NULL
DELETE WHERE 1 = 1;
Which, for your sample data:
CREATE TABLE table_name (S_No, A_Port, B_port) AS
SELECT 1, 80, 100 FROM DUAL UNION ALL
SELECT 2, 90, 110 FROM DUAL UNION ALL
SELECT 3, 100, 80 FROM DUAL UNION ALL
SELECT 4, 94, 106 FROM DUAL;
Will delete the 3rd row.
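The key idea in the MERGE above is that LEAST and GREATEST put each port pair into a canonical order, so (80, 100) and (100, 80) land in the same partition. As a rough illustration only, the sketch below replays that idea in Python with the standard-library sqlite3 module rather than Oracle. SQLite's two-argument MIN and MAX stand in for LEAST and GREATEST, a correlated DELETE stands in for the MERGE, and the lower-case column names are my own assumption.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (an assumption
# made purely so the example is runnable anywhere).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (s_no INTEGER, a_port INTEGER, b_port INTEGER)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, 80, 100), (2, 90, 110), (3, 100, 80), (4, 94, 106)],
)

# Delete every row for which an earlier row (lower s_no) has the same
# unordered pair of ports. MIN(x, y) / MAX(x, y) with two arguments are
# SQLite's scalar equivalents of Oracle's LEAST / GREATEST.
conn.execute(
    """
    DELETE FROM table_name
    WHERE EXISTS (
        SELECT 1
        FROM table_name AS k
        WHERE MIN(k.a_port, k.b_port) = MIN(table_name.a_port, table_name.b_port)
          AND MAX(k.a_port, k.b_port) = MAX(table_name.a_port, table_name.b_port)
          AND k.s_no < table_name.s_no
    )
    """
)

remaining = [row[0] for row in conn.execute("SELECT s_no FROM table_name ORDER BY s_no")]
print(remaining)  # [1, 2, 4]
```

The row with the lowest S_No in each group always survives, which matches the ORDER BY S_no in the ROW_NUMBER window.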

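I tried this in MySQL; test it against your own scenarios. It lists the rows to keep rather than deleting anything, and it only handles a pair that is stored once in swapped order (A/B versus B/A):
SELECT P1.* FROM port_tbl AS P1
LEFT JOIN port_tbl AS P2 ON P1.port1 = P2.port2 AND P1.port2 = P2.port1 AND P1.id <> P2.id
WHERE P1.id < P2.id OR ISNULL(P2.id)
ORDER BY P1.id;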

Related

Oracle delete and update billions of duplicate records from a table

A versioned table has duplicate address records that need to be removed as follows:
1: Find the duplicate records in the table, as below
Address
adr_id | ver_id | address
1 1 newYork
1 2 newYork
1 3 newYork
4 1 Washington
4 2 Washington
2: Insert new records as below
adr_id | ver_id | address
11 0 newYork
12 0 Washington
3: Delete the existing duplicate records so that the final table looks like #2.
Note: The table has billions of records, so this needs to be done in minimal time using the best database technique.
Use a MERGE statement correlated on the ROWID pseudo-column:
MERGE INTO table_name dst
USING (
SELECT ROWID AS rid,
COUNT(*) OVER (PARTITION BY adr_id) AS cnt,
ROW_NUMBER() OVER (PARTITION BY adr_id ORDER BY ver_id) AS rn
FROM table_name
) src
ON (src.cnt > 1 AND dst.ROWID = src.rid)
WHEN MATCHED THEN
UPDATE
SET adr_id = YOUR_ADR_ID_SEQUENCE.NEXTVAL,
ver_id = 0
DELETE WHERE rn > 1;
Which, for the sample data:
CREATE SEQUENCE your_adr_id_sequence START WITH 11;
CREATE TABLE table_name (adr_id, ver_id, address) AS
SELECT 1, 1, 'newYork' FROM DUAL UNION ALL
SELECT 1, 2, 'newYork' FROM DUAL UNION ALL
SELECT 1, 3, 'newYork' FROM DUAL UNION ALL
SELECT 4, 1, 'Washington' FROM DUAL UNION ALL
SELECT 4, 2, 'Washington' FROM DUAL;
Then, after the MERGE the table contains:
ADR_ID | VER_ID | ADDRESS
11 0 newYork
14 0 Washington
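The surviving Washington row gets 14 rather than 12 because the UPDATE branch draws a sequence value for every matched row, including the rows that the DELETE clause then removes. The Python sketch below is only a simulation of that behaviour, not Oracle code. It assumes rows are visited in (adr_id, ver_id) order, which Oracle does not guarantee, and `itertools.count` is my stand-in for the sequence.

```python
from collections import Counter
from itertools import count

rows = [
    (1, 1, "newYork"),
    (1, 2, "newYork"),
    (1, 3, "newYork"),
    (4, 1, "Washington"),
    (4, 2, "Washington"),
]

# COUNT(*) OVER (PARTITION BY adr_id)
group_size = Counter(adr_id for adr_id, _, _ in rows)

# Stand-in for your_adr_id_sequence (START WITH 11).
sequence = count(11)

result = []
first_seen = set()
for adr_id, ver_id, address in sorted(rows):
    if group_size[adr_id] == 1:
        # cnt = 1: the ON clause does not match, so the row is left untouched.
        result.append((adr_id, ver_id, address))
        continue
    new_id = next(sequence)  # a value is drawn for every matched row
    if adr_id not in first_seen:
        # rn = 1 (lowest ver_id): the updated row is kept.
        first_seen.add(adr_id)
        result.append((new_id, 0, address))
    # rn > 1: the row is updated and then removed by DELETE WHERE rn > 1.

print(result)  # [(11, 0, 'newYork'), (14, 0, 'Washington')]
```

If gap-free new ids matter, one option is to draw the new id only for the surviving row, but that goes beyond what the answer above does.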

How to use distinct keyword on two columns in oracle sql?

I used the DISTINCT keyword on one column and it worked well, but when I add a second column to the SELECT it no longer works for me, because both columns contain duplicate values. I don't want duplicate values shown in either column. Is there a proper SELECT query for that?
The sample data is:
For Col001:
555
555
7878
7878
89.
Col002:
43
43
56
56
56
67
67
67
79
79
79.
I want these data in this format:
Col001:
555
7878
89.
Col002:
43
56
67
79
I tried the following query:
Select distinct col001, col002 from tbl1
Use a set operator. UNION will give you the set of unique values from two subqueries.
select col001 as unq_col_val
from your_table
union
select col002
from your_table;
This presumes you're not fussed whether the value comes from COL001 or COL002. If you are fussed, this variant preserves that information:
select 'COL001' as source_col
,col001 as unq_col_val
from your_table
union
select 'COL002' as source_col
,col002
from your_table;
Note that this result set will contain more rows if the same value exists in both columns.
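For readers who think in sets, the two UNION variants above behave like a plain set union. The sketch below is a Python analogy only, using the sample values from the question; the sorting is there just to make the printout stable, since UNION itself promises no order.

```python
col001 = [555, 555, 7878, 7878, 89]
col002 = [43, 43, 56, 56, 56, 67, 67, 67, 79, 79, 79]

# First variant: one pool of values, source column forgotten.
unique_values = sorted(set(col001) | set(col002))

# Second variant: the source column is part of each element, so the same
# value appearing in both columns would show up twice.
tagged = sorted(
    {("COL001", value) for value in col001}
    | {("COL002", value) for value in col002}
)

print(unique_values)  # [43, 56, 67, 79, 89, 555, 7878]
print(tagged)
```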
DISTINCT works across the entire row considering all values in the row and will remove duplicate values where the entire row is duplicated.
For example, given the sample data:
CREATE TABLE table_name (col001, col002) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 1, 2 FROM DUAL UNION ALL
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 2, 2 FROM DUAL UNION ALL
--
SELECT 1, 2 FROM DUAL UNION ALL -- These are duplicates
SELECT 2, 2 FROM DUAL;
Then:
SELECT DISTINCT
col001,
col002
FROM table_name
Outputs:
COL001 | COL002
1 1
1 2
1 3
2 1
2 2
And the duplicates have been removed.
If you want to only display distinct values for each column then you need to consider each column separately and can use something like:
SELECT c1.col001,
c2.col002
FROM ( SELECT DISTINCT
col001,
DENSE_RANK() OVER (ORDER BY col001) AS rnk
FROM table_name
) c1
FULL OUTER JOIN
( SELECT DISTINCT
col002,
DENSE_RANK() OVER (ORDER BY col002) AS rnk
FROM table_name
) c2
ON (c1.rnk = c2.rnk)
Which outputs:
COL001 | COL002
1 1
2 2
null 3
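The FULL OUTER JOIN on DENSE_RANK simply lines up the n-th distinct value of one column with the n-th distinct value of the other and pads the shorter list with nulls. A small Python sketch of the same pairing, offered as an analogy rather than a translation of the SQL, uses `itertools.zip_longest` on the sample rows:

```python
from itertools import zip_longest

rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (1, 2), (2, 2)]

# Distinct values of each column, in ascending order. The position in each
# list plays the role of DENSE_RANK() OVER (ORDER BY ...).
distinct_col001 = sorted({col001 for col001, _ in rows})
distinct_col002 = sorted({col002 for _, col002 in rows})

# Pair values by position and pad the shorter list with None (SQL NULL).
paired = list(zip_longest(distinct_col001, distinct_col002))

print(paired)  # [(1, 1), (2, 2), (None, 3)]
```

Note that values on the same output row are unrelated to each other; they only share a rank.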

TSQL, Get top N unique rows across ordered groups

I have the following table of values, sorted by arbitrary segment id specified by the user. ( I know how to do that query and below are the results )
SegmentID SequenceID
3 100
3 200
3 400
3 430
1 100
1 200
1 300
1 410
2 100
2 200
2 300
2 420
I need a SQL query (SQL Server 2012) that returns the top N records in order of precedence, where SequenceID is not repeated.
Example: the user wants 7 sequences in order of segment preference: 3, 1, 2.
The correct answer is
SegmentID SequenceID
3 100
3 200
3 400
3 430
1 300
1 410
2 420
In a nutshell, I need to traverse the recordset from top to bottom, grabbing unique sequences as I go and adding them to the list.
How can I do that in a T-SQL statement?
create table #data (SegmentID int,SequenceID int);
insert into #data values
(3,100),
(3,200),
(3,400),
(3,430),
(1,100),
(1,200),
(1,300),
(1,410),
(2,100),
(2,200),
(2,300),
(2,420);
This table declares the ordering preference:
create table #prefs (Preference int, SegmentID int);
insert into #prefs values(1,3),(2,1),(3,2);
with cte as
(
select #data.SegmentID,
#data.SequenceID,
Preference,
row_number() over (partition by SequenceID order by Preference) rn
from #data
inner join #prefs on #data.SegmentID = #prefs.SegmentID
)
select SegmentId,
SequenceID
from cte
where rn = 1
order by Preference, SequenceID;
DEMO:
http://rextester.com/JKNKD15000
With cte (SegmentID, SequenceID) as
(SELECT 3, 100 UNION ALL
SELECT 3, 200 UNION ALL
SELECT 3, 400 UNION ALL
SELECT 3, 430 UNION ALL
SELECT 1, 100 UNION ALL
SELECT 1, 200 UNION ALL
SELECT 1, 300 UNION ALL
SELECT 1, 410 UNION ALL
SELECT 2, 100 UNION ALL
SELECT 2, 200 UNION ALL
SELECT 2, 300 UNION ALL
SELECT 2, 420),
userOrder (SegmentID, orderID) as (
SELECT 3, 1 UNION ALL
SELECT 1, 2 UNION ALL
SELECT 2, 3),
Results (SegmentID, SequenceID, RN, orderID) as (
Select A.*
, Row_number() over (Partition by A.SequenceID order by B.orderID) RN
, B.orderID
from cte A
INNER JOIN userOrder B
on A.SegmentID = B.SegmentID)
Select Top 7 *
from results where RN = 1
order by OrderID, SequenceID
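Both answers rely on the same observation: each SequenceID belongs to the most-preferred segment that contains it, which is what ROW_NUMBER() ... PARTITION BY SequenceID ORDER BY preference with RN = 1 selects. The Python sketch below spells out the procedural "walk and collect" version described in the question, to show that it yields the same rows. The function name and its parameters are hypothetical, chosen for this illustration.

```python
def top_unique_sequences(data, preference, n):
    """Walk rows in segment-preference order and keep the first n unseen SequenceIDs."""
    rank = {segment: position for position, segment in enumerate(preference)}
    ordered = sorted(
        (row for row in data if row[0] in rank),
        key=lambda row: (rank[row[0]], row[1]),
    )
    seen = set()
    picked = []
    for segment_id, sequence_id in ordered:
        if sequence_id in seen:
            continue  # already claimed by a more-preferred segment
        seen.add(sequence_id)
        picked.append((segment_id, sequence_id))
        if len(picked) == n:
            break
    return picked


data = [
    (3, 100), (3, 200), (3, 400), (3, 430),
    (1, 100), (1, 200), (1, 300), (1, 410),
    (2, 100), (2, 200), (2, 300), (2, 420),
]

top_seven = top_unique_sequences(data, preference=[3, 1, 2], n=7)
print(top_seven)
# [(3, 100), (3, 200), (3, 400), (3, 430), (1, 300), (1, 410), (2, 420)]
```

The set-based SQL is usually preferable inside the database; the loop is only here to make the equivalence easy to check.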

Oracle sql group sum

I have a table with ID, SUB_ID and Value columns:
ID SUB_ID Value
100 1 100
100 2 150
101 1 100
101 2 150
101 3 200
102 1 100
SUB_ID can vary from 1 to a maximum value (3 in this example). I need the sum of the values for each SUB_ID. If an ID's highest SUB_ID is less than that maximum, the value at the ID's MAX(SUB_ID) should be used for the missing SUB_IDs, as shown below. (In this example, ID = 100 has no SUB_ID 3, and 2 < 3, so it contributes the value 150.)
SUB_ID SUM(values) Remarks
1 300 (100+100+100)
2 400 (150+150+100)
3 450 (150+200+100)
This can be done easily in PL/SQL. Can we do the same in SQL, using the MODEL clause or any other option?
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TableA ( ID, SUB_ID, Value ) AS
SELECT 100, 1, 100 FROM DUAL
UNION ALL SELECT 100, 2, 150 FROM DUAL
UNION ALL SELECT 101, 1, 100 FROM DUAL
UNION ALL SELECT 101, 2, 150 FROM DUAL
UNION ALL SELECT 101, 3, 200 FROM DUAL
UNION ALL SELECT 102, 1, 100 FROM DUAL
Query 1:
WITH sub_ids AS (
SELECT LEVEL AS sub_id
FROM DUAL
CONNECT BY LEVEL <= ( SELECT MAX( SUB_ID ) FROM TableA )
),
max_values AS (
SELECT ID,
MAX( VALUE ) AS max_value
FROM TableA
GROUP BY ID
)
SELECT s.SUB_ID,
SUM( COALESCE( a.VALUE, m.max_value ) ) AS total_value
FROM sub_ids s
CROSS JOIN
max_values m
LEFT OUTER JOIN
TableA a
ON ( s.SUB_ID = a.SUB_ID AND m.ID = a.ID )
GROUP BY
s.SUB_ID
Results:
| SUB_ID | TOTAL_VALUE |
|--------|-------------|
| 1 | 300 |
| 2 | 400 |
| 3 | 450 |
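One detail worth checking against real data: Query 1 falls back to MAX(VALUE) per ID, whereas the question asks for the value at each ID's highest SUB_ID. The two coincide on this sample because values grow with SUB_ID. The Python sketch below follows the question's wording as I read it, and it assumes an ID is only missing SUB_IDs above its own maximum (no gaps in the middle).

```python
from collections import defaultdict

rows = [
    (100, 1, 100),
    (100, 2, 150),
    (101, 1, 100),
    (101, 2, 150),
    (101, 3, 200),
    (102, 1, 100),
]

# values_by_id[id][sub_id] = value
values_by_id = defaultdict(dict)
for id_, sub_id, value in rows:
    values_by_id[id_][sub_id] = value

max_sub_id = max(sub_id for _, sub_id, _ in rows)

totals = {}
for sub_id in range(1, max_sub_id + 1):
    total = 0
    for values in values_by_id.values():
        # Fall back to the value stored at this ID's highest SUB_ID.
        fallback = values[max(values)]
        total += values.get(sub_id, fallback)
    totals[sub_id] = total

print(totals)  # {1: 300, 2: 400, 3: 450}
```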
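Try this (SQL Server syntax). It builds the Remarks string for each SUB_ID from the rows that exist, but it does not apply the fall-back rule for missing SUB_IDs:
SELECT d.SUB_ID,
SUM(d.[Value]) AS sum_values,
SUBSTRING(
(
SELECT '+' + CAST(T2.[Value] AS VARCHAR(20))
FROM table_Name AS T2
WHERE T2.SUB_ID = d.SUB_ID
FOR XML PATH ('')
), 2, 100000) AS remarks
FROM table_Name d
GROUP BY d.SUB_ID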
How about something like this:
select max_vals.sub_id, sum(nvl(table_vals.value,max_vals.max_value)) as sum_values
from (
select all_subs.sub_id, t1.id, max(t1.value) as max_value
from your_table t1
cross join (select sub_id from your_table) all_subs
group by all_subs.sub_id, t1.id
) max_vals
left outer join your_table table_vals
on max_vals.id = table_vals.id
and max_vals.sub_id = table_vals.sub_id
group by max_vals.sub_id;
The inner query gets you a list of all sub_id/id combinations and their fall-back values. The outer query uses NVL to take the table value if it exists and the fall-back value if it doesn't.

How can I find unoccupied id numbers in a table?

In my table I want to see a list of unoccupied id numbers in a certain range.
For example there are 10 records in my table with id's: "2,3,4,5,10,12,16,18,21,22" and say that I want to see available ones between 1 and 25. So I want to see a list like:
1,6,7,8,9,11,13,14,15,17,19,20,23,24,25
How should I write my sql query?
Select the numbers from 1 to 25 and show only those that are not in your table:
select n from
( select rownum n from dual connect by level <= 25)
where n not in (select id from your_table);
Let's say you have a #numbers table with three numbers:
CREATE TABLE #numbers (num INT)
INSERT INTO #numbers (num)
SELECT 1
UNION
SELECT 3
UNION
SELECT 6
Now you can use a recursive CTE to generate the numbers from 1 to 25 and exclude those that are in your #numbers table in the WHERE clause:
;WITH n(n) AS
(
SELECT 1
UNION ALL
SELECT n+1 FROM n WHERE n < 25
)
SELECT n FROM n
WHERE n NOT IN (select num from #numbers)
ORDER BY n
OPTION (MAXRECURSION 25);
You can also use a left outer join (an anti-join) instead of NOT IN. This returns the first free id after each occupied run, not every free id:
select
u1.id + 1 as gap_start
from users as u1
left outer join users as u2 on u1.id + 1 = u2.id
where
u2.id is null
see also SQL query to find Missing sequence numbers
You need LISTAGG to get the output in a single row.
WITH data1 AS (
SELECT LEVEL rn FROM dual CONNECT BY LEVEL <= 25
),
data2 AS (
SELECT 2 num FROM dual UNION ALL
SELECT 3 FROM dual UNION ALL
SELECT 4 FROM dual UNION ALL
SELECT 5 FROM dual UNION ALL
SELECT 10 FROM dual UNION ALL
SELECT 12 FROM dual UNION ALL
SELECT 16 FROM dual UNION ALL
SELECT 18 FROM dual UNION ALL
SELECT 21 FROM dual UNION ALL
SELECT 22 FROM dual)
SELECT LISTAGG(rn, ',')
WITHIN GROUP (ORDER BY rn) num_list FROM data1
WHERE rn NOT IN (SELECT num FROM data2);
NUM_LIST
----------------------------------------------------
1,6,7,8,9,11,13,14,15,17,19,20,23,24,25
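All of the answers in this thread come down to "generate the full range, then subtract the ids that exist". For comparison, here is the same set difference as a minimal Python sketch using the ids from the question; `free_ids` is a hypothetical helper name, and the final join mirrors what LISTAGG does.

```python
def free_ids(occupied, low, high):
    """Return the ids in [low, high] that do not appear in `occupied`."""
    taken = set(occupied)
    return [n for n in range(low, high + 1) if n not in taken]


occupied = [2, 3, 4, 5, 10, 12, 16, 18, 21, 22]

available = free_ids(occupied, 1, 25)

# Single comma-separated string, like LISTAGG(rn, ',') WITHIN GROUP (ORDER BY rn).
num_list = ",".join(str(n) for n in available)
print(num_list)  # 1,6,7,8,9,11,13,14,15,17,19,20,23,24,25
```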