SQL: joining two tables and removing the duplicates from the two tables, but without losing any duplicates from the tables themselves - sql

I want to join two tables and remove the duplicates between them, while keeping any duplicate values found in the first table.
T1
Name
-----
A
A
B
C
T2
Name
----
A
D
E
Expected result
A - > FROM T1
A - > FROM T1
B
C
D
E
I tried UNION, but it removes all duplicates of 'A' from both tables.
How can I achieve this?

Filter T2 before UNION ALL
select col
from T1
union all
select col
from T2
where not exists (select 1 from T1 where T1.col = T2.col)
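To check the query above against the question's data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The in-memory database and the column name col are assumptions for illustration; the question calls the column Name.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (col TEXT);
    CREATE TABLE T2 (col TEXT);
    INSERT INTO T1 VALUES ('A'), ('A'), ('B'), ('C');
    INSERT INTO T2 VALUES ('A'), ('D'), ('E');
""")

# Keep every T1 row, then add only the T2 rows whose value is absent from T1.
query = """
    SELECT col FROM T1
    UNION ALL
    SELECT col FROM T2
    WHERE NOT EXISTS (SELECT 1 FROM T1 WHERE T1.col = T2.col)
"""
rows = sorted(r[0] for r in conn.execute(query))
print(rows)
```

Both 'A' rows from T1 survive, and the 'A' in T2 is filtered out before the two sets are concatenated.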

Assuming that, for each value, you want as many copies as the table with the most repetitions of that value has, you can use the ROW_NUMBER() window function. It numbers the repetitions of each value within each table, so UNION can then eliminate the rows that carry the same value and the same sequence number in both tables.
SELECT Name FROM (
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
UNION
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
) x
ORDER BY Name
To see how this works out, we add two B rows to T2 then do this:
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
Name Row
A 1
A 2
B 1
C 1
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
Name Row
A 1
B 1
B 2
D 1
E 1
Now UNION them without ALL to combine and eliminate duplicates:
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
UNION
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
Name Row
A 1
A 2
B 1
B 2
C 1
D 1
E 1
The final query up top is then just eliminating the Row column and sorting the result, to ensure ascending order.
See SQL Fiddle for demo.
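The walkthrough above can be reproduced with a small sketch using Python's sqlite3 module. It assumes an SQLite build with window function support (3.25 or later), uses T2 with the two extra B rows, and renames the Row alias to rn as a precaution against keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Name TEXT);
    CREATE TABLE T2 (Name TEXT);
    INSERT INTO T1 VALUES ('A'), ('A'), ('B'), ('C');
    -- T2 with the two B rows added for the walkthrough
    INSERT INTO T2 VALUES ('A'), ('B'), ('B'), ('D'), ('E');
""")

# Number the repetitions per table; UNION drops (Name, rn) pairs seen in both.
query = """
    SELECT Name FROM (
        SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
        FROM T1
        UNION
        SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
        FROM T2
    ) x
    ORDER BY Name
"""
names = [r[0] for r in conn.execute(query)]
print(names)
```

Note that B appears twice because T2 holds two B rows, which differs from the rule in the question (keep only T1's duplicates) whenever T2 has more repetitions than T1.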

select * from T1
union all
select * from T2 where name not in (select distinct name from T1)

You should use "union all" instead of "union".
"union" removes duplicate records from the whole result, while "union all" returns all of them.
For your result, the "where" clause already filters the values shared with T1 out of table 2, so "union all" adds no cross-table duplicates and still keeps the duplicates inside T1:
select col1 from t1
union all
select col1 from t2 where t2.col1 not in(select t1.col1 from t1)

I don't know whether the following is good practice, but it works:
select name from T1
UNION ALL
select name from T2 Where name not in (select name from T1)
The query filters T2 against the values in T1, then combines the rows of both tables and shows the result. UNION ALL is needed here because a plain UNION would also collapse the duplicates inside T1.
I hope it helps, thanks.
Note: this may not be the best way to get the result, because NOT IN can hurt performance.
I will update the answer with a better solution after some research.

You want all names from T1 and all names from T2 except the names that are in T1.
So you can use UNION ALL for the 2 cases and the operator EXCEPT to filter the rows of T2:
SELECT Name FROM T1
UNION ALL
(
SELECT Name FROM T2
EXCEPT
SELECT Name FROM T1
)
See the demo.
Results:
> | Name |
> | :--- |
> | A |
> | A |
> | B |
> | C |
> | D |
> | E |
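A quick way to try the EXCEPT approach is the sketch below, using Python's sqlite3 module. One assumption to flag: SQLite does not accept the parenthesized compound select, so the EXCEPT part is wrapped in a subquery instead. Without that grouping SQLite would evaluate the operators left to right and apply EXCEPT to the whole UNION ALL result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Name TEXT);
    CREATE TABLE T2 (Name TEXT);
    INSERT INTO T1 VALUES ('A'), ('A'), ('B'), ('C');
    INSERT INTO T2 VALUES ('A'), ('D'), ('E');
""")

# The subquery plays the role of the parentheses in the original query.
query = """
    SELECT Name FROM T1
    UNION ALL
    SELECT Name FROM (
        SELECT Name FROM T2
        EXCEPT
        SELECT Name FROM T1
    )
"""
result = sorted(r[0] for r in conn.execute(query))
print(result)
```

EXCEPT also removes duplicates within T2 itself, so a value repeated only in T2 would come back once.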

Related

Remove duplicates in Select query based on one column

I want to select without duplicate ids and keep row '5d' and not '5e' in select statement.
table
id | name
1 | a
2 | b
3 | c
5 | d
5 | e
I tried:
SELECT id, name
FROM table t
INNER JOIN (SELECT DISTINCT id FROM table) t2 ON t.id = t2.id
For the given example an aggregation using min() would work.
SELECT id,
min(name) name
FROM table
GROUP BY id;
You can also use ROW_NUMBER():
SELECT id, name
FROM (
SELECT id, name, ROW_NUMBER() OVER(PARTITION BY id ORDER BY name) rn
FROM mytable
) x
WHERE rn = 1
This will retain the record that has the smallest name (so '5d' will come before '5e'). With this technique, you can also sort on a column other than the one you return (which a simple aggregate query with MIN() cannot do). How it performs compared to the equivalent aggregate query depends on the database and the available indexes, so it is worth comparing both.
If you want to keep the row with the smallest name then you can use not exists:
select t.* from tablename t
where not exists (
select 1 from tablename
where id = t.id and name < t.name
)
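As a sanity check of the ROW_NUMBER() variant, here is a minimal sketch with Python's sqlite3 module, assuming SQLite 3.25 or later for window functions. The table name mytable follows the answer above; the data is the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, name TEXT);
    INSERT INTO mytable VALUES (1, 'a'), (2, 'b'), (3, 'c'), (5, 'd'), (5, 'e');
""")

# rn = 1 marks the row with the smallest name inside each id group.
query = """
    SELECT id, name
    FROM (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS rn
        FROM mytable
    ) x
    WHERE rn = 1
    ORDER BY id
"""
kept = conn.execute(query).fetchall()
print(kept)
```

Changing the ORDER BY inside OVER() is the only edit needed to keep a different row per id, for example ORDER BY name DESC to keep '5e'.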

how to get dupes from table using group by and/or having

If I have this table:
id | aux_id | name
------------------
1 | 22 | foo
2 | 22 | bar
3 | 19 | baz
How can I get this result, showing names that share an aux_id with at least one other record?
name
----
foo
bar
I know I need to use GROUP BY and/or HAVING but this isn't working:
SELECT name FROM my_table
GROUP BY aux_id
HAVING COUNT(aux_id) > 1
Column 'name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
How about exists?
select t.name
from my_table t
where exists (select 1
from my_table t2
where t2.aux_id = t.aux_id and t2.name <> t.name
);
I would use EXISTS:
select t.name
from table t
where exists (select 1 from table t1 where t1.aux_id = t.aux_id and t1.id <> t.id);
This has the advantage that you can select all columns if you want, without needing a GROUP BY clause.
An alternative, just for fun...
WITH
duplication_counts AS
(
SELECT
*,
COUNT(*) OVER (PARTITION BY aux_id) AS aux_id_occurrences
FROM
my_table
)
SELECT
*
FROM
duplication_counts
WHERE
aux_id_occurrences > 1
GROUP BY works too, IMHO (though on large data it would probably not perform as well as EXISTS):
select * from myTable
where aux_id in
(select aux_id
from myTable
group by aux_id
having count(*) > 1)
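For anyone who wants to try the EXISTS variant quickly, here is a minimal sketch with Python's sqlite3 module and the sample rows from the question. The ORDER BY on id is an addition to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER, aux_id INTEGER, name TEXT);
    INSERT INTO my_table VALUES (1, 22, 'foo'), (2, 22, 'bar'), (3, 19, 'baz');
""")

# A row qualifies when another row (different id) shares its aux_id.
query = """
    SELECT t.name
    FROM my_table t
    WHERE EXISTS (
        SELECT 1 FROM my_table t2
        WHERE t2.aux_id = t.aux_id AND t2.id <> t.id
    )
    ORDER BY t.id
"""
dupes = [r[0] for r in conn.execute(query)]
print(dupes)
```

Comparing on id rather than name means two rows with the same name and the same aux_id are still reported as duplicates.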

Delete duplicate rows based on a condition

I have a table that has ID and EventDate. It has duplicate rows because I built it from a union of two tables. Now I need to keep the row with the minimum EventDate for each ID and remove the other duplicates.
The table, for example:
ID | Date
--- | ---
1 | 10/27/1993
1 | 10/27/1994
2 | 10/17/1993
2 | 08/15/1993
You can use ROW_NUMBER:
;WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY EventDate)
FROM dbo.YourTable
)
DELETE FROM CTE
WHERE RN > 1;
Use this!
delete A
from (
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY [COLUMN] ORDER BY EventDate ASC)
FROM dbo.Your_Table
) AS A
where RN > 1
In Firebird, this is enough:
DELETE FROM table1 t1_1
WHERE EXISTS(
SELECT t1_2.id FROM table1 t1_2 WHERE t1_2.id = t1_1.id AND t1_1.EventDate>t1_2.EventDate
);
In MySQL, according to the documentation, you cannot "delete from a table and select from the same table in a subquery".
So copy the rows into a helper table first:
CREATE TABLE table2 LIKE table1;
INSERT INTO table2 SELECT * FROM table1;
DELETE FROM table1
WHERE EXISTS(
SELECT t2.id FROM table2 t2 WHERE t2.id = table1.id AND table1.EventDate>t2.EventDate
);
DROP TABLE table2;
Here table1 is your original table.
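The correlated DELETE idea can be tried out with the sketch below, which uses Python's sqlite3 module. The table name events is hypothetical, and the dates are stored as ISO strings (YYYY-MM-DD) so that plain comparison orders them correctly; with a real date column that conversion is not needed. SQLite allows the subquery to read from the table being deleted from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (ID INTEGER, EventDate TEXT);
    INSERT INTO events VALUES
        (1, '1993-10-27'), (1, '1994-10-27'),
        (2, '1993-10-17'), (2, '1993-08-15');
""")

# Delete every row for which an earlier row with the same ID exists.
conn.execute("""
    DELETE FROM events
    WHERE EXISTS (
        SELECT 1 FROM events AS e2
        WHERE e2.ID = events.ID AND e2.EventDate < events.EventDate
    )
""")
remaining = conn.execute(
    "SELECT ID, EventDate FROM events ORDER BY ID"
).fetchall()
print(remaining)
```

Rows that tie on the minimum date for an ID are all kept by this condition; the ROW_NUMBER() approach keeps exactly one of them.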

PL/SQL pseudo Sequencing

I have the following scenario
ID SEQ
-- ---
123 2
123 4
What I want to do is produce a list of these values and fill in the missing numbers up to a maximum (say 6, which I have from another source) wherever those numbers do not already exist for the ID in the table.
ID NEW_SEQ
-- ---
123 1
123 2
123 3
123 4
123 5
123 6
Thanks
C
This generates a sequence of numbers from 1 through 6, cross joins with all the ids of the table to associate each of the sequence numbers with each id, then removes the already existing combinations.
SELECT t.id, s.seq
FROM (SELECT DISTINCT id FROM myTable) t
,(SELECT rownum AS seq
FROM dual
CONNECT BY LEVEL <= 6) s
MINUS
SELECT id, seq
FROM myTable
ORDER BY 1, 2
If you have a list of the numbers you want to use in OTHER_TABLE then I suggest you use an outer join, as in:
SELECT o.ID, o.NEW_SEQ
FROM OTHER_TABLE o
LEFT OUTER JOIN (SELECT ID, SEQ FROM MY_TABLE) t
ON (o.ID = t.ID AND o.NEW_SEQ = t.SEQ)
WHERE t.SEQ IS NULL
ORDER BY o.ID, o.NEW_SEQ
The outer join will include all rows from the first table (OTHER_TABLE, in this case) joined with the rows which exist from the second table (here, MY_TABLE). If there is a row in OTHER_TABLE which does not have a matching row in MY_TABLE, the fields from MY_TABLE will be NULL - thus, by checking for t.SEQ being NULL you're able to find the rows which exist in OTHER_TABLE but which are not in MY_TABLE.
Share and enjoy.
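Outside Oracle, CONNECT BY and dual are not available, so here is a sketch of the same idea with a recursive CTE, run through Python's sqlite3 module. This swaps in WITH RECURSIVE and EXCEPT for CONNECT BY and MINUS; the max_seq parameter name is an assumption for illustration. It builds both the full 1..6 grid shown in the question and the list of combinations that are missing from the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER, seq INTEGER);
    INSERT INTO myTable VALUES (123, 2), (123, 4);
""")

# Numbers 1..max_seq paired with every distinct id in the table.
grid = """
    WITH RECURSIVE s(seq) AS (
        SELECT 1
        UNION ALL
        SELECT seq + 1 FROM s WHERE seq < :max_seq
    )
    SELECT t.id, s.seq
    FROM (SELECT DISTINCT id FROM myTable) AS t
    CROSS JOIN s
"""
params = {"max_seq": 6}

# Full grid, as in the expected output of the question.
full = conn.execute(grid + " ORDER BY 1, 2", params).fetchall()

# Only the combinations that do not exist yet (the MINUS variant).
missing = conn.execute(
    grid + " EXCEPT SELECT id, seq FROM myTable ORDER BY 1, 2", params
).fetchall()
print(full)
print(missing)
```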

How to select distinct rows with a specified condition

Suppose there is a table
_ _
a 1
a 2
b 2
c 3
c 4
c 1
d 2
e 5
e 6
How can I select distinct minimum value of all the rows of each group?
So the expected result here is:
_ _
a 1
b 2
c 1
d 2
e 5
EDIT
My actual table contains more columns and I want to select them all. The rows differ only in the last column (the second one in the example). I'm new to SQL, and my question was possibly ill-formed in its initial version.
The actual schema is:
| day | currency ('EUR', 'USD') | diff (integer) | id (foreign key) |
There are duplicate pairs (day, currency) that differ by (diff, id). I want to get a table with unique pairs (day, currency), each with its minimum diff from the original table.
Thanks!
In your case it's as simple as this:
select column1, min(column2) as column2
from table
group by column1
For more than two columns I can suggest this:
select top 1 with ties
t.column1, t.column2, t.column3
from table as t
order by row_number() over (partition by t.column1 order by t.column2)
take a look at this post https://stackoverflow.com/a/13652861/1744834
You can use the ranking function ROW_NUMBER() with a CTE to do this. It is especially useful if there are more columns than these two, and it gives the distinct rows like so:
;WITH RankedCTE
AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY column1 ORDER BY column2 ) rownum
FROM Table
)
SELECT column1, column2
FROM RankedCTE
WHERE rownum = 1;
This will give you:
COLUMN1 COLUMN2
a 1
b 2
c 1
d 2
e 5
SELECT ColOne, Min(ColTwo)
FROM Table
GROUP BY ColOne
ORDER BY ColOne
PS: I'm not in front of a machine, but please give the above a try.
select MIN(col2),col1
from dbo.Table_1
group by col1
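For the schema from the EDIT (day, currency, diff, id), plain MIN() with GROUP BY cannot return the id that belongs to the minimum diff, so the ROW_NUMBER() approach fits better. Below is a minimal sketch using Python's sqlite3 module, assuming SQLite 3.25 or later; the table name rates and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rates (day TEXT, currency TEXT, diff INTEGER, id INTEGER);
    -- hypothetical sample rows, not taken from the question
    INSERT INTO rates VALUES
        ('2024-01-01', 'EUR', 5, 10),
        ('2024-01-01', 'EUR', 3, 11),
        ('2024-01-01', 'USD', 7, 12),
        ('2024-01-02', 'USD', 2, 13),
        ('2024-01-02', 'USD', 4, 14);
""")

# Rank rows inside each (day, currency) pair by diff and keep the first one,
# which carries all the other columns of that row along.
query = """
    WITH ranked AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY day, currency ORDER BY diff
               ) AS rn
        FROM rates
    )
    SELECT day, currency, diff, id
    FROM ranked
    WHERE rn = 1
    ORDER BY day, currency
"""
best = conn.execute(query).fetchall()
print(best)
```

If two rows tie on the minimum diff, ROW_NUMBER() picks one of them arbitrarily; adding id to the ORDER BY inside OVER() makes the choice deterministic.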