Remove duplicates using only where condition - sql

Today I got a problem from a friend.
Problem: write a SQL query using UNION ALL (not UNION) that uses the WHERE clause to eliminate duplicates.
I cannot use a GROUP BY expression.
I cannot use the UNIQUE or DISTINCT keywords.
Input -
id(Table 1)
1
2
fk_id(Table 2)
1
1
2
I gave him the query below as a solution:
select id from
(
select id , row_number() over(partition by id order by id) rn from
(
select id from T1
union all
select fk_ID id from T2
)
)where rn = 1;
Output -
id
1
2
which generates unique ids.
Now the twist: he says I also cannot use row_number(). I have to use only the WHERE condition. I am writing the query on an Oracle database.
Please suggest.
Thanks in advance.

From its name and the data shown, we can assume that id in table t1 is unique.
From its name and the data shown, we can assume that fk_id in table t2 is a foreign key to t1.id.
So the union of the IDs in the two tables is simply the set of IDs that we find in table t1.
As we are forced to use UNION ALL on the two tables, though, we can use a pseudo UNION ALL that adds nothing:
select id from t1
union all
select fk_id from t2 where 1 = 2;
If t2.fk_id were not a foreign key referencing t1.id, we would use NOT EXISTS or NOT IN in the where clause instead. For this to give a result without duplicates, however, t2 must contain no duplicates to start with. (As you show that duplicate values do exist in t2, this approach would not remove duplicates among t2 values that are missing from t1.) Here is a query for the unique values from t1 plus the values from t2 that do not reference a t1 value:
select id from t1
union all
select fk_id from t2 where fk_id not in (select id from t1);
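To see the caveat above in action, here is a minimal sketch that runs the NOT IN query against an in-memory SQLite database instead of Oracle (an assumption made only so the example is self-contained). The function name union_all_not_in and the extra value 3 in the second call are hypothetical.

```python
import sqlite3

def union_all_not_in(t1_ids, t2_fk_ids):
    """Return t1 ids plus the t2.fk_id values that are absent from t1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (id integer)")
    conn.execute("create table t2 (fk_id integer)")
    conn.executemany("insert into t1 values (?)", [(i,) for i in t1_ids])
    conn.executemany("insert into t2 values (?)", [(i,) for i in t2_fk_ids])
    rows = conn.execute(
        """
        select id from t1
        union all
        select fk_id from t2 where fk_id not in (select id from t1)
        """
    ).fetchall()
    conn.close()
    # Sort in Python because the query itself has no ORDER BY.
    return sorted(r[0] for r in rows)

# Data from the question: every fk_id exists in t1, so t2 adds nothing.
print(union_all_not_in([1, 2], [1, 1, 2]))
# Hypothetical extra value 3, duplicated in t2 only: the duplicate survives.
print(union_all_not_in([1, 2], [1, 1, 2, 3, 3]))
```

The second call shows why this form only works when t2 has no duplicates among the values that are missing from t1.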

In a more generic case, where you can have duplicates in both tables, this could be a way.
test data:
create table table1(id) as (
select 1 from dual union all
select 1 from dual union all
select 2 from dual union all
select 2 from dual union all
select 1 from dual
);
create table table2(fk_id) as (
select 1 from dual union all
select 1 from dual union all
select 1 from dual union all
select 3 from dual union all
select 4 from dual union all
select 1 from dual union all
select 4 from dual union all
select 2 from dual
);
query:
with tab1_union_all_tab2 as (
select 'tab1'||rownum as uniqueId, id from table1 UNION ALL
select 'tab2'||rownum , fk_id from table2
)
select id
from tab1_union_all_tab2 u1
where not exists ( select 1
from tab1_union_all_tab2 u2
where u1.id = u2.id
and u1.uniqueId < u2.uniqueId
)
result:
ID
----------
3
4
1
2
This should clarify the idea behind it:
with tab1_union_all_tab2 as (
select 'tab1'||rownum as uniqueId, id from table1 UNION ALL
select 'tab2'||rownum , fk_id from table2
)
select uniqueId, id,
( select nvl(listagg ( uniqueId, ', ') within group ( order by uniqueId), 'NO DUPLICATES')
from tab1_union_all_tab2 u2
where u1.id = u2.id
and u1.uniqueId < u2.uniqueId
) duplicates
from tab1_union_all_tab2 u1
UNIQUEID ID DUPLICATES
---------- ---------- --------------------------------------------------
tab11 1 tab12, tab15, tab21, tab22, tab23, tab26
tab12 1 tab15, tab21, tab22, tab23, tab26
tab13 2 tab14, tab28
tab14 2 tab28
tab15 1 tab21, tab22, tab23, tab26
tab21 1 tab22, tab23, tab26
tab22 1 tab23, tab26
tab23 1 tab26
tab24 3 NO DUPLICATES
tab25 4 tab27
tab26 1 NO DUPLICATES
tab27 4 NO DUPLICATES
tab28 2 NO DUPLICATES
As rightly observed by Thorsten Kettner, you can easily edit this to use rowid instead of building a unique id by concatenating a string and the rownum:
with tab1_union_all_tab2 as (
select rowid uniqueId, id from table1 UNION ALL
select rowid , fk_id from table2
)
select id
from tab1_union_all_tab2 u1
where not exists ( select 1
from tab1_union_all_tab2 u2
where u1.id = u2.id
and u1.uniqueId < u2.uniqueId
)
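The same idea can be tried outside Oracle. The sketch below assumes SQLite, where rowid values are only unique per table, so it builds the unique id by concatenating a table tag with rowid (as in the first variant above) rather than using rowid alone. The function name distinct_ids_via_where is hypothetical.

```python
import sqlite3

def distinct_ids_via_where(table1_ids, table2_fk_ids):
    """Deduplicate a UNION ALL using only a WHERE ... NOT EXISTS condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (id integer)")
    conn.execute("create table table2 (fk_id integer)")
    conn.executemany("insert into table1 values (?)", [(i,) for i in table1_ids])
    conn.executemany("insert into table2 values (?)", [(i,) for i in table2_fk_ids])
    rows = conn.execute(
        """
        with tab1_union_all_tab2 as (
            select 'tab1' || rowid as uniqueId, id from table1
            union all
            select 'tab2' || rowid, fk_id from table2
        )
        select id
        from tab1_union_all_tab2 u1
        -- keep a row only if no other row with the same id has a greater uniqueId
        where not exists (
            select 1
            from tab1_union_all_tab2 u2
            where u1.id = u2.id
              and u1.uniqueId < u2.uniqueId
        )
        """
    ).fetchall()
    conn.close()
    return sorted(r[0] for r in rows)

# Same test data as in the answer above.
print(distinct_ids_via_where([1, 1, 2, 2, 1], [1, 1, 1, 3, 4, 1, 4, 2]))
```

Exactly one row per id survives: the one with the greatest uniqueId among its duplicates.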

Write a WHERE clause for the second select in the UNION ALL, such as where id != fk_id.

Related

SQL - How to Order By in UNION query

Is there a way to union two tables, but keep the rows from the first table appearing first in the result set? However, the ORDER BY column is not in the select list.
For example:
Table 1
name surname
-------------------
John Doe
Bob Marley
Ras Tafari
Table 2
name surname
------------------
Lucky Dube
Abby Arnold
Expected result:
name surname
-------------------
John Doe
Bob Marley
Ras Tafari
Lucky Dube
Abby Arnold
I am fetching the data with the following query:
SELECT name,surname FROM TABLE 1 ORDER BY ID
UNION
SELECT name,surname FROM TABLE 2
The above query does not keep the ORDER BY ordering after the UNION.
P.S. I don't want to show ID in my select query.
I am getting the ORDER BY column by joining tables. The following is my real query:
SELECT tbl_Event_Type_Sort_Orders.Appraisal_Event_Type_ID AS Appraisal_Event_Type_ID , ISNULL(tbl_Appraisal_Event_Types.Appraisal_Event_Type_Display_Name, 'UnCategorized') AS Appraisal_Event_Type_Display_Name
INTO #temptbl
FROM tbl_Event_Type_Sort_Orders
INNER JOIN tbl_Appraisal_Event_Types
ON tbl_Event_Type_Sort_Orders.Appraisal_Event_Type_ID = tbl_Appraisal_Event_Types.Appraisal_Event_Type_ID
WHERE 1=1
AND User_Name='abc'
ORDER BY tbl_Event_Type_Sort_Orders.Sort_Order
SELECT * FROM #temptbl
UNION
SELECT DISTINCT (tbl_Appraisal_Event_Types.Appraisal_Event_Type_ID) AS Appraisal_Event_Type_ID , ISNULL(tbl_Appraisal_Event_Types.Appraisal_Event_Type_Display_Name, 'UnCategorized') AS Appraisal_Event_Type_Display_Name
FROM tbl_Appraisal_Event_Types
INNER JOIN tbl_Appraisal_Events
ON tbl_Appraisal_Event_Types.Appraisal_Event_Type_ID = tbl_Appraisal_Events.Event_Type_ID
INNER JOIN tbl_Appraisals
ON tbl_Appraisal_Events.Appraisal_ID = tbl_Appraisal_Events.Appraisal_ID
WHERE 1=1
AND ((tbl_Appraisals.Assigned_To_Staff_User) = 'abc' OR (tbl_Appraisals.Assigned_To_Staff_User2) = 'abc' OR (tbl_Appraisals.Assigned_To_Staff_User3) = 'abc')
Put a UNION ALL in a derived table. To keep duplicate elimination, use SELECT DISTINCT and also add a NOT EXISTS to the second select, to avoid returning the same person twice if found in both tables:
select name, surname
from
(
select distinct name, surname, 1 as tno
from table1
union all
select distinct name, surname, 2 as tno
from table2 t2
where not exists (select * from table1 t1
where t2.name = t1.name
and t2.surname = t1.surname)
) dt
order by tno, surname, name
You can use a column for the table and one for the ID to order by:
SELECT x.name, x.surname FROM (
SELECT ID, TableID = 1, name, surname
FROM table1
UNION ALL
SELECT ID = -1, TableID = 2, name, surname
FROM table2
) x
ORDER BY x.TableID, x.ID
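A quick way to check the table-marker approach is to run it on the sample data from the question. This is a minimal sketch using SQLite (so the T-SQL alias form TableID = 1 becomes 1 as table_id); the id values assigned to the rows and the function name union_first_table_first are assumptions, since the question does not show the IDs.

```python
import sqlite3

def union_first_table_first():
    """UNION ALL two tables, keeping table1 rows first without selecting the id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (id integer, name text, surname text)")
    conn.execute("create table table2 (id integer, name text, surname text)")
    conn.executemany(
        "insert into table1 values (?, ?, ?)",
        [(1, "John", "Doe"), (2, "Bob", "Marley"), (3, "Ras", "Tafari")],
    )
    conn.executemany(
        "insert into table2 values (?, ?, ?)",
        [(1, "Lucky", "Dube"), (2, "Abby", "Arnold")],
    )
    rows = conn.execute(
        """
        select x.name, x.surname
        from (
            select id, 1 as table_id, name, surname from table1
            union all
            select id, 2 as table_id, name, surname from table2
        ) x
        -- table_id keeps table1 first; id orders rows within each table
        order by x.table_id, x.id
        """
    ).fetchall()
    conn.close()
    return rows

for name, surname in union_first_table_first():
    print(name, surname)
```

The ordering columns live only in the derived table, so they can drive ORDER BY without appearing in the final select list.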
You can write it as below. If you are OK with duplicate data, use UNION ALL instead; it will be faster:
SELECT NAME, surname FROM (
SELECT ID,name,surname FROM TABLE 1
UNION
SELECT ID,name,surname FROM TABLE 2 ) t ORDER BY ID
This will order the rows from the first table first, then by anything else you need (I haven't tested the code):
;with cte_1
as
(SELECT ID,name,surname,1 as table_id FROM TABLE 1
UNION
SELECT ID,name,surname,2 as table_id FROM TABLE 2 )
SELECT name, surname
FROM cte_1
ORDER BY table_id,ID
Simply use a UNION without ORDER BY:
SELECT name,surname FROM TABLE 1
UNION
SELECT name,surname FROM TABLE 2
If you want to order the first table, use the query below:
;WITH cte_1
AS
(SELECT name,surname,ROW_NUMBER()OVER(ORDER BY Id)b FROM TABLE 1 )
SELECT name,surname
FROM cte_1
UNION
SELECT name,surname
FROM TABLE 2

ORACLE join two table with comma separated ids

I have two tables
Table 1
ID NAME
1 Person1
2 Person2
3 Person3
Table 2
ID GROUP_ID
1 1
2 2,3
The IDs in all the columns above refer to the same ID (Example - a Department)
My Expected output (by joining both the tables)
GROUP_ID NAME
1 Person1
2,3 Person2,Person3
Is there a query with which I can achieve this.
It can be done. You shouldn't do it, but perhaps you don't have the power to change the world. (If you have a say in it, you should normalize your table design - in your case, both the input and the output fail the first normal form).
Answering more as good practice for myself... This solution guarantees that the names will be listed in the same order as the id's. It is not the most efficient, and it doesn't deal with id's in the list that are not found in the first table (it simply discards them instead of leaving a marker of some sort).
with
table_1 ( id, name ) as (
select 1, 'Person1' from dual union all
select 2, 'Person2' from dual union all
select 3, 'Person3' from dual
),
table_2 ( id, group_id ) as (
select 1, '1' from dual union all
select 2, '2,3' from dual
),
prep ( id, lvl, token ) as (
select id, level, regexp_substr(group_id, '[^,]+', 1, level)
from table_2
connect by level <= regexp_count(group_id, ',') + 1
and prior id = id
and prior sys_guid() is not null
)
select p.id, listagg(t1.name, ',') within group (order by p.lvl) as group_names
from table_1 t1 inner join prep p on t1.id = p.token
group by p.id;
ID GROUP_NAMES
---- --------------------
1 Person1
2 Person2,Person3
select t2.group_id, listagg(t1.name,',') WITHIN GROUP (ORDER BY 1)
from table2 t2, table1 t1
where ','||t2.group_id||',' like '%,'||t1.id||',%'
group by t2.id, t2.group_id
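The delimiter-wrapping LIKE join can be tried in isolation. The sketch below assumes SQLite, so LISTAGG is replaced by group_concat, whose ordering inside a group is not guaranteed here. The row (12, 'Person12') and the function name names_per_group are hypothetical additions, included only to show that wrapping both sides in commas stops id 1 from matching inside 12.

```python
import sqlite3

def names_per_group():
    """Join a comma-separated id list to its names via a delimiter-wrapped LIKE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (id integer, name text)")
    conn.execute("create table table2 (id integer, group_id text)")
    conn.executemany(
        "insert into table1 values (?, ?)",
        # (12, 'Person12') is a hypothetical extra row, not in the question
        [(1, "Person1"), (2, "Person2"), (3, "Person3"), (12, "Person12")],
    )
    conn.executemany("insert into table2 values (?, ?)", [(1, "1"), (2, "2,3")])
    rows = conn.execute(
        """
        select t2.group_id, group_concat(t1.name, ',')
        from table2 t2
        join table1 t1
          on ',' || t2.group_id || ',' like '%,' || t1.id || ',%'
        group by t2.id, t2.group_id
        """
    ).fetchall()
    conn.close()
    return dict(rows)

for group_id, names in names_per_group().items():
    print(group_id, "->", names)
```

Note that this kind of join cannot use an index on the list column, which is one more reason to normalize the data where that is possible.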
Normalize your data model; this is a perversion! A comma-separated list should not exist in a database. Store individual rows, one per data unit.

Oracle SQL Query IN

I have the following query, which is not working.
select * from table where id in (
1,2, (select id from another_table)
)
How can I rewrite it?
How about
select * from table
where id in (1,2)
or id in (select id from another_table)
Take care to use parentheses when adding further WHERE conditions with AND!
select *
from table
where id in (1,2) OR id in(
select id from another_table
)
select * from table where id in (
select 1 as id from dual
union all
select 2 as id from dual
union all
select id from another_table
)
select * from table where id in (
select 1 from dual
union all
select 2 from dual
union all
select id from another_table);
I'm using UNION ALL because it can be faster than an OR clause, which could also be used.
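The variants above, and the warning about parentheses, can be compared side by side. This is a minimal sketch on SQLite; the table name items (table is a reserved word), the active column, and all sample rows are hypothetical.

```python
import sqlite3

def select_ids(where_clause):
    """Run `select id from items where <where_clause>` on hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table items (id integer, active integer)")
    conn.execute("create table another_table (id integer)")
    conn.executemany(
        "insert into items values (?, ?)",
        [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0)],
    )
    conn.executemany("insert into another_table values (?)", [(3,), (5,)])
    sql = f"select id from items where {where_clause} order by id"
    rows = conn.execute(sql).fetchall()
    conn.close()
    return [r[0] for r in rows]

# Two equivalent rewrites of the failing query.
or_form = select_ids("id in (1, 2) or id in (select id from another_table)")
union_form = select_ids(
    "id in (select 1 union all select 2 union all select id from another_table)"
)

# Adding an AND condition: AND binds tighter than OR, so parentheses matter.
with_parens = select_ids(
    "(id in (1, 2) or id in (select id from another_table)) and active = 1"
)
without_parens = select_ids(
    "id in (1, 2) or id in (select id from another_table) and active = 1"
)

print(or_form, union_form, with_parens, without_parens)
```

Without the parentheses, the active = 1 filter applies only to the subquery branch, so inactive rows from the literal list slip through.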

'SELECT IN' with item list containing duplicates

I'm doing
SELECT Name WHERE Id IN (3,4,5,3,7,8,9)
where in this case the Id 3 is duplicated.
The query automatically excludes the duplicated items, while for me it would be important to get them all.
Is there a way to do that directly in SQL?
The query doesn't exclude duplicates; there just aren't any duplicates to exclude. There is only one record with id 3 in the table, and it is included because there is a 3 in the in () set, but it is not included twice just because the 3 appears twice in the set.
To get duplicates you would have to create a table result that has duplicates, and join the table against that. For example:
select t.Name
from someTable t
inner join (
select id = 3 union all
select 4 union all
select 5 union all
select 3 union all
select 7 union all
select 8 union all
select 9
) x on x.id = t.id
Try this:
SELECT Name FROM Tbl
JOIN
(
SELECT 3 Id UNION ALL
SELECT 4 Id UNION ALL
SELECT 5 Id UNION ALL
SELECT 3 Id UNION ALL
SELECT 7 Id UNION ALL
SELECT 8 Id UNION ALL
SELECT 9 Id
) Tmp
ON tbl.Id = Tmp.Id
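When the id list comes from application code, the derived table can be generated rather than typed by hand. The sketch below assumes SQLite and hypothetical sample rows named Name1 to Name9; the function names are also hypothetical. It contrasts the IN list with the join against a UNION ALL derived table.

```python
import sqlite3

def _connect():
    """Create a hypothetical Tbl with one row per id from 1 to 9."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Tbl (Id integer primary key, Name text)")
    conn.executemany(
        "insert into Tbl values (?, ?)", [(i, f"Name{i}") for i in range(1, 10)]
    )
    return conn

def names_with_in(ids):
    """IN tests membership only, so a repeated id yields a single row."""
    conn = _connect()
    placeholders = ", ".join("?" for _ in ids)
    sql = f"select Name from Tbl where Id in ({placeholders})"
    rows = conn.execute(sql, ids).fetchall()
    conn.close()
    return sorted(r[0] for r in rows)

def names_with_join(ids):
    """Joining a derived table returns one row per list entry, repeats included."""
    conn = _connect()
    derived = " union all ".join("select ? as Id" for _ in ids)
    sql = f"select t.Name from Tbl t join ({derived}) Tmp on t.Id = Tmp.Id"
    rows = conn.execute(sql, ids).fetchall()
    conn.close()
    return sorted(r[0] for r in rows)

ids = [3, 4, 5, 3, 7, 8, 9]
print(len(names_with_in(ids)), len(names_with_join(ids)))
```

Only placeholders are spliced into the SQL text; the id values themselves are passed as bound parameters.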

SQL Select Condition Question

I have a quick question about a select statement condition.
I have the following table with the following items. What I need to get is the object id that matches both type id's.
TypeId ObjectId
1 10
2 10
1 11
So I need to get object 10, because it matches type ids 1 and 2.
SELECT ObjectId
FROM Table
WHERE TypeId = 1
AND TypeId = 2
Obviously this doesn't work because it won't match both conditions for the same row. How do I perform this query?
Also note that I may pass in 2 or more type id's to narrow down the results.
Self-join:
SELECT t1.ObjectId
FROM Table AS t1
INNER JOIN Table AS t2
ON t1.ObjectId = t2.ObjectId
AND t1.TypeId = 1
AND t2.TypeId = 2
Not sure how you want the behavior to work when passing in values, but that's a start.
I upvoted the answer from @Cade Roux, and that's how I would do it.
But FWIW, here's an alternative solution:
SELECT ObjectId
FROM Table
WHERE TypeId IN (1, 2)
GROUP BY ObjectId
HAVING COUNT(*) = 2;
Assuming uniqueness over TypeId, ObjectId.
Re the comment from @Josh that he may need to search for three or more TypeId values:
The solution using JOIN requires a join per value you're searching for. The solution above using GROUP BY may be easier if you find yourself searching for an increasing number of values.
This code is written with Oracle in mind. It should be general enough for other flavors of SQL.
select t1.ObjectId from Table t1
join Table t2 on t2.TypeId = 2 and t1.ObjectId = t2.ObjectId
where t1.TypeId = 1;
To add additional TypeIds, you just have to add another join:
select t1.ObjectId from Table t1
join Table t2 on t2.TypeId = 2 and t1.ObjectId = t2.ObjectId
join Table t3 on t3.TypeId = 3 and t1.ObjectId = t3.ObjectId
join Table t4 on t4.TypeId = 4 and t1.ObjectId = t4.ObjectId
where t1.TypeId = 1;
Important note: as you add more joins, performance will suffer a LOT.
In regards to Bill's answer you can change it to the following to get rid of the need to assume uniqueness:
SELECT ObjectId
FROM (SELECT distinct ObjectId, TypeId from Table)
WHERE TypeId IN (1, 2)
GROUP BY ObjectId
HAVING COUNT(*) = 2;
His way of doing it scales better as the number of types gets larger.
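Since the question mentions passing in two or more type ids, the GROUP BY form lends itself to a parameterized query. This is a minimal sketch on SQLite; the table name object_types and the function name objects_with_all_types are hypothetical.

```python
import sqlite3

def objects_with_all_types(rows, type_ids):
    """Return the ObjectIds that have a row for every requested TypeId."""
    wanted = sorted(set(type_ids))
    conn = sqlite3.connect(":memory:")
    conn.execute("create table object_types (TypeId integer, ObjectId integer)")
    conn.executemany("insert into object_types values (?, ?)", rows)
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        select ObjectId
        from (select distinct ObjectId, TypeId from object_types)
        where TypeId in ({placeholders})
        group by ObjectId
        -- an object qualifies only if it matched every requested type
        having count(*) = ?
        order by ObjectId
    """
    result = conn.execute(sql, wanted + [len(wanted)]).fetchall()
    conn.close()
    return [r[0] for r in result]

# Data from the question: only object 10 has both type 1 and type 2.
print(objects_with_all_types([(1, 10), (2, 10), (1, 11)], [1, 2]))
```

The inner SELECT DISTINCT means duplicate (TypeId, ObjectId) rows cannot inflate the count, so no uniqueness assumption is needed.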
Try this.
Sample Input: (Case 1)
declare @t table(Typeid int,ObjectId int)
insert into @t
select 1,10 union all select 2,10 union all
select 1,11
select * from @t
Sample Input: (Case 2)
declare @t table(Typeid int,ObjectId int)
insert into @t
select 1,10 union all select 2,10 union all
select 3,10 union all select 4,10 union all
select 5,10 union all select 6,10 union all
select 1,11 union all select 2,11 union all
select 3,11 union all select 4,11 union all
select 5,11 union all select 1,12 union all
select 2,12 union all select 3,12 union all
select 4,12 union all select 5,12 union all
select 6,12
select * from @t
Sample Input: (Case 3) [Duplicate entries are there]
declare @t table(Typeid int,ObjectId int)
insert into @t
select 1,10 union all select 2,10 union all
select 1,10 union all select 2,10 union all
select 3,10 union all select 4,10 union all
select 5,10 union all select 6,10 union all
select 1,11 union all select 2,11 union all
select 3,11 union all select 4,11 union all
select 5,11 union all select 1,12 union all
select 2,12 union all select 3,12 union all
select 4,12 union all select 5,12 union all
select 6,12 union all select 3,12
For case 1, the output should be 10.
For cases 2 and 3, the output should be 10 and 12.
Query:
select X.ObjectId from
(
select
T.ObjectId
,count(ObjectId) cnt
from(select distinct ObjectId,Typeid from @t)T
where T.Typeid in(select Typeid from @t)
group by T.ObjectId )X
join (select max(Typeid) maxcnt from @t)Y
on X.cnt = Y.maxcnt