Filter duplicates on cross fields


I have some records:
[ fid sid ]
1 2
1 3
1 4
2 1
2 3
2 4
3 1
3 2
3 4
....
Both fields contain ids. I need to keep only the unique records, or the first record of each non-unique set, where uniqueness ignores the order of the two fields.
For example, [2,1] and [1,2] count as duplicates of each other.
In the end I want to have:
[ fid sid ]
1 2
1 3
1 4
2 3
2 4
3 4
....
These are the records that were filtered out:
[ fid sid ]
2 1
3 1
3 2
....
Thanks for the answers!

If you have no duplicates, you can do:
select fid, sid
from t
where fid <= sid
union all
select fid, sid
from t
where fid > sid and
not exists (select 1 from t t2 where t2.fid = t.sid and t2.sid = t.fid);
If you do have duplicates and don't care about the ordering, you can do:
select (case when fid < sid then fid else sid end) as fid,
(case when fid < sid then sid else fid end) as sid
from t
group by (case when fid < sid then fid else sid end),
(case when fid < sid then sid else fid end);
This could produce pairs that are not in the original data (because the inverse is in the data).
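As a rough way to check the GROUP BY variant, here is a minimal sketch that runs the same CASE expressions against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for the asker's (unstated) database, and the function name and the low_id/high_id aliases are made up for illustration.

```python
import sqlite3

def unique_unordered_pairs(rows):
    """Return one (low, high) pair per unordered {fid, sid} combination."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (fid INTEGER, sid INTEGER)")
    conn.executemany("INSERT INTO t (fid, sid) VALUES (?, ?)", rows)
    # Normalise each pair so the smaller id comes first, then group on it.
    query = """
        SELECT CASE WHEN fid < sid THEN fid ELSE sid END AS low_id,
               CASE WHEN fid < sid THEN sid ELSE fid END AS high_id
        FROM t
        GROUP BY CASE WHEN fid < sid THEN fid ELSE sid END,
                 CASE WHEN fid < sid THEN sid ELSE fid END
        ORDER BY low_id, high_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows from the question.
sample = [(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4), (3, 1), (3, 2), (3, 4)]
pairs = unique_unordered_pairs(sample)
print(pairs)

# Only the inverse (2, 1) is stored, yet the query reports (1, 2).
print(unique_unordered_pairs([(2, 1)]))
```

The second call shows the caveat mentioned above: the normalised pair can be one that never appears in the table in that order.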

Here's a way that uses a left self-join that only keeps those without a match.
Example code:
declare @T table (fid int, [sid] int);
insert into @T (fid, [sid]) values
(1, 2),(1, 3),(1, 4),
(2, 1),(2, 3),(2, 4),
(3, 1),(3, 2),(3, 4);
select distinct t.fid, t.[sid]
from @T t
left join @T t2 on (t2.[sid] = t.fid and t2.fid = t.[sid] and t2.fid < t2.[sid])
where t2.fid is null
order by t.fid, t.[sid];
Result:
fid sid
1 2
1 3
1 4
2 3
2 4
3 4
Same result with a NOT EXISTS:
select distinct fid, [sid]
from @T t
where not exists (
select 1 from @T t2
where t2.[sid] = t.fid and t2.fid = t.[sid] and t2.fid < t2.[sid]
)
order by fid, [sid];

Another approach where ordering is available would be:
SELECT DISTINCT [fid], [sid]
FROM aaa
WHERE fid <= sid
OR CAST(fid AS VARCHAR(11)) + '.' + CAST(sid AS VARCHAR(11)) NOT IN
(SELECT CAST(sid AS VARCHAR(11)) + '.' + CAST(fid AS VARCHAR(11))
FROM aaa)
Although this might not be recommended, depending on how many times you execute it and how much data will be processed.

Related

Oracle SQL: How to select only IDs which are members of specific groups?

I want to select only those IDs which are in specific groups.
For example:
ID GroupID
1 11
1 12
2 11
2 12
2 13
Here I want to select the IDs which are in groups 11 and 12 but in no other groups.
So the result should show just the ID 1 and not 2.
Can someone provide a SQL for that?
I tried it with
SELECT ID FROM table
WHERE GroupID = 11 AND GroupID = 12 AND GroupID != 13;
But that didn't work.

You can use aggregation:
select id
from mytable
group by id
having min(groupID) = 11 and max(groupID) = 12
This having condition ensures that the given id belongs to groupIDs 11 and 12, and to no other group. This works because 11 and 12 are sequential numbers.
Other options: if you want ids that belong to group 11 or 12 (not necessarily both), and to no other group, then:
having sum(case when groupId in (11, 12) then 1 end) = count(*)
If numbers are not sequential, and you want ids in both groups (necessarily) and in no other group:
having
max(case when groupID = 11 then 1 end) = 1
and max(case when groupID = 12 then 1 end) = 1
and max(case when groupID in (11, 12) then 0 else 1 end) = 0
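To see how the three MAX(CASE ...) conditions behave, here is a small sketch that runs that HAVING clause in an in-memory SQLite database via Python's sqlite3 module. SQLite is only a stand-in for Oracle here, and the extra row for ID 3 (a member of group 11 only) is an assumption added to exercise the "both groups" requirement.

```python
import sqlite3

def ids_in_exactly_11_and_12(rows):
    """Return ids that belong to groups 11 and 12 and to no other group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, groupID INTEGER)")
    conn.executemany("INSERT INTO mytable (id, groupID) VALUES (?, ?)", rows)
    query = """
        SELECT id
        FROM mytable
        GROUP BY id
        HAVING MAX(CASE WHEN groupID = 11 THEN 1 END) = 1              -- has 11
           AND MAX(CASE WHEN groupID = 12 THEN 1 END) = 1              -- has 12
           AND MAX(CASE WHEN groupID IN (11, 12) THEN 0 ELSE 1 END) = 0 -- nothing else
        ORDER BY id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

# Rows from the question, plus a hypothetical ID 3 that is only in group 11.
sample = [(1, 11), (1, 12), (2, 11), (2, 12), (2, 13), (3, 11)]
matching_ids = ids_in_exactly_11_and_12(sample)
print(matching_ids)
```

ID 2 fails the third condition because of group 13, and ID 3 fails the second because MAX over no matching rows is NULL, which never equals 1.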

SELECT t.id FROM mytable t
where exists(
SELECT * FROM mytable
where GroupID = 11
and t.id = id
)
and exists(
SELECT * FROM mytable
where GroupID = 12
and t.id = id
)
and not exists(
SELECT * FROM mytable
where GroupID not in (11, 12)
and t.id = id
)
group by t.id

One method is conditional aggregation:
select id
from t
group by id
having sum(case when groupid = 11 then 1 else 0 end) > 0 and
sum(case when groupid = 12 then 1 else 0 end) > 0 and
sum(case when groupid not in (11, 12) then 1 else 0 end) = 0 ;

You can use GROUP BY with HAVING and a conditional COUNT:
SELECT id
FROM table_name
GROUP BY ID
HAVING COUNT( CASE Group_ID WHEN 11 THEN 1 END ) > 0
AND COUNT( CASE Group_ID WHEN 12 THEN 1 END ) > 0
AND COUNT( CASE WHEN Group_ID NOT IN ( 11, 12 ) THEN 1 END ) = 0
Or you can use collections:
CREATE TYPE int_list IS TABLE OF NUMBER(8,0);
and:
SELECT id
FROM table_name
GROUP BY id
HAVING int_list( 11, 12 ) SUBMULTISET OF CAST( COLLECT( group_id ) AS int_list )
AND CARDINALITY( CAST( COLLECT( group_id ) AS int_list )
MULTISET EXCEPT int_list( 11, 12 ) ) = 0
(Using collections has the advantage that you can pass the collection of required values as a single bind parameter whereas using conditional aggregation is probably going to require dynamic SQL if you want to pass a variable number of items to the query.)
Both output:
| ID |
| -: |
| 1 |

Use joins:
SELECT DISTINCT c11.ID
FROM (SELECT ID FROM WORK_TABLE WHERE GROUPID = 11) c11
INNER JOIN (SELECT ID FROM WORK_TABLE WHERE GROUPID = 12) c12
ON c12.ID = c11.ID
LEFT OUTER JOIN (SELECT ID FROM WORK_TABLE WHERE GROUPID NOT IN (11, 12)) co
ON co.ID = c11.ID
WHERE co.ID IS NULL;
The INNER JOIN between the first two subqueries ensures that rows exist for both GROUPID 11 and 12, and the LEFT OUTER JOIN and WHERE verify that there are no rows for any other GROUPIDs.

Select non existing Numbers from Table each ID

I'm new to T-SQL and I'm struggling to get the numbers that are missing from my table for each ID.
Example:
CustomerID Group
1 1
3 1
6 1
4 2
7 2
I want to get the IDs which do not exist and select them like this:
CustomerID Group
2 1
4 1
5 1
5 2
6 2
....
..
A solution using a CTE doesn't work well, and neither does inserting the data first and then filtering with a NOT EXISTS in the WHERE clause.
Any ideas?

If you can live with ranges rather than a list with each one, then an efficient method uses lead():
select group_id, (customer_id + 1) as first_missing_customer_id,
(next_ci - 1) as last_missing_customer_id
from (select t.*,
lead(customer_id) over (partition by group_id order by customer_id) as next_ci
from t
) t
where next_ci <> customer_id + 1
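The consecutive-pair comparison that lead() performs can be sketched in plain Python, which also makes it easy to expand the ranges back into individual ids. This is only an illustration of the idea, not T-SQL; the function names are hypothetical, and duplicate ids within a group are dropped first (an assumption the SQL above does not make).

```python
from collections import defaultdict

def missing_ranges(rows):
    """rows: (customer_id, group_id) pairs.
    Returns (group_id, first_missing, last_missing) for every gap."""
    ids_by_group = defaultdict(set)
    for customer_id, group_id in rows:
        ids_by_group[group_id].add(customer_id)

    gaps = []
    for group_id in sorted(ids_by_group):
        ids = sorted(ids_by_group[group_id])
        # Pair each id with the next one, like lead() over the partition.
        for current, following in zip(ids, ids[1:]):
            if following != current + 1:
                gaps.append((group_id, current + 1, following - 1))
    return gaps

def expand_ranges(gaps):
    """Turn each range into one (customer_id, group_id) row per missing id."""
    return [(customer_id, group_id)
            for group_id, first, last in gaps
            for customer_id in range(first, last + 1)]

# Sample rows from the question.
sample = [(1, 1), (3, 1), (6, 1), (4, 2), (7, 2)]
gaps = missing_ranges(sample)
print(gaps)
print(expand_ranges(gaps))
```

As with the lead() query, only gaps between the smallest and largest id of each group are reported.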

Cross join 2 recursive CTEs to get all the possible combinations of [CustomerID] and [Group] and then LEFT join to the table:
declare @c int = (select max([CustomerID]) from tablename);
declare @g int = (select max([Group]) from tablename);
with
customers as (
select 1 as cust
union all
select cust + 1
from customers where cust < @c
),
groups as (
select 1 as gr
union all
select gr + 1
from groups where gr < @g
),
),
cte as (
select *
from customers cross join groups
)
select c.cust as [CustomerID], c.gr as [Group]
from cte c left join tablename t
on t.[CustomerID] = c.cust and t.[Group] = c.gr
where t.[CustomerID] is null
and c.cust > (select min([CustomerID]) from tablename where [Group] = c.gr)
and c.cust < (select max([CustomerID]) from tablename where [Group] = c.gr)
Results:
> CustomerID | Group
> ---------: | ----:
> 2 | 1
> 4 | 1
> 5 | 1
> 5 | 2
> 6 | 2

Oracle join to either of multiple columns

I have a RELATION table
NUM1 | NUM2 | NUM3
-- --- -----
1 2 3
2 4 5
3 4 null
3 4 null
and the actual INFO table where NUM is primary key.
NUM | A_LOT_OF_OTHER_INFO
--- --------------------
1 asdff
2 werwr
3 erert
4 ghfgh
5 cvbcb
I want to create a view to see the count of the NUM that appeared in any of the NUM1, NUM2, NUM3 of the RELATION table.
MY_VIEW
NUM | A_LOT_OF_OTHER_INFO | TOTAL_COUNT
--- -------------------- ------------
1 asdff 1
2 werwr 2
3 erert 3
4 ghfgh 3
5 cvbcb 1
I can do this by doing three selects from RELATION table and UNION them, but I do not want to use UNION because the tables have a lot of records, MY_VIEW is already large enough and I am looking for a better way to join to the RELATION table in the view. Can you suggest a way?

What I would try is to unpivot the relation table.
After that join the info table on the values and count the number of times the val gets repeated.
create table relation(num1 int,num2 int, num3 int);
insert into relation values(1,2,3);
insert into relation values(2,4,5);
insert into relation values(3,4,null);
create table info(num int, a_lot_of_other_info varchar2(100));
insert into info
select 1,'asdff' from dual union all
select 2,'werwr' from dual union all
select 3,'erert' from dual union all
select 4,'ghfgh' from dual union all
select 5,'cvbcb' from dual;
select a.num
,max(a_lot_of_other_info) as a_lot_of_other_info
,count(*) as num_of_times
from info a
join (select val
from relation a
unpivot(val for x in (num1,num2,num3))
)b
on a.num=b.val
group by a.num
order by 1

I would suggest a correlated subquery:
select i.*,
(select sum((case when r.num1 = i.num then 1 else 0 end) +
(case when r.num2 = i.num then 1 else 0 end) +
(case when r.num3 = i.num then 1 else 0 end)
)
from relation r
where i.num in (r.num1, r.num2, r.num3)
) as total_count
from info i;
If performance is a consideration, it might be faster to use left joins:
select i.*,
(coalesce(r1.cnt, 0) + coalesce(r2.cnt, 0) + coalesce(r3.cnt, 0)) as total_count
from info i left join
(select num1, count(*) as cnt from relation group by num1) r1
on i.num = r1.num1 left join
(select num2, count(*) as cnt from relation group by num2) r2
on i.num = r2.num2 left join
(select num3, count(*) as cnt from relation group by num3) r3
on i.num = r3.num3;
In particular, this will make optimal use of three separate indexes on relation: relation(num1), relation(num2), and relation(num3).

It seems what you want is UNPIVOT. Perhaps easiest to do with a cross join in this case:
select NUM, count(*) as TOTAL_COUNT
from (
select decode(column_value, 1, NUM1, 2, NUM2, 3, NUM3) as NUM
from RELATION cross join table(sys.odcinumberlist(1,2,3))
)
group by NUM
;
Then join this to the second table; the join part is really irrelevant here.
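A hedged sketch of the same cross-join unpivot, run against an in-memory SQLite database from Python's sqlite3 module. SQLite has no sys.odcinumberlist or DECODE, so a three-row literal list and a CASE expression stand in for them here; the aliases p, u and n are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE relation (num1 INTEGER, num2 INTEGER, num3 INTEGER);
    INSERT INTO relation VALUES (1, 2, 3), (2, 4, 5), (3, 4, NULL), (3, 4, NULL);
    CREATE TABLE info (num INTEGER PRIMARY KEY, a_lot_of_other_info TEXT);
    INSERT INTO info VALUES
        (1, 'asdff'), (2, 'werwr'), (3, 'erert'), (4, 'ghfgh'), (5, 'cvbcb');
""")

query = """
    SELECT i.num, i.a_lot_of_other_info, COUNT(u.num) AS total_count
    FROM info i
    LEFT JOIN (
        -- Each relation row is repeated three times, once per column position.
        SELECT CASE p.n WHEN 1 THEN r.num1
                        WHEN 2 THEN r.num2
                        ELSE r.num3 END AS num
        FROM relation r
        CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) p
    ) u ON u.num = i.num
    GROUP BY i.num, i.a_lot_of_other_info
    ORDER BY i.num
"""
view_rows = conn.execute(query).fetchall()
conn.close()

for row in view_rows:
    print(row)
```

The UNION ALL here only builds the tiny position list (1, 2, 3); the relation table itself is read once, and NULL values in num3 never match a row in info, so they do not affect the counts.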

DB2 SQL better way to get count of unique values based on ID than SUM+CASE

I am able to count the occurrences of unique values per ID in the same record, but it seems there must be a more efficient way? Something like COUNT([Value],'2')?
Here's a simple example
ID | Value
1 2
1 3
1 3
1 2
2 2
2 3
2 3
3 3
And this is my current code:
SELECT ID, SUM(CASE WHEN Value = '2' THEN 1 ELSE 0 END) AS "COUNT2",
SUM(CASE WHEN Value = '3' THEN 1 ELSE 0 END) AS "COUNT3"
FROM TABLE
GROUP BY ID
The results are:
ID | Count2 | Count3
1 2 2
2 1 2
3 0 1
Is there a better way to get the count of unique values?

Try:
Select distinct Id, Count2, Count3
from Table
Outer Apply (select count(id) as Count2 from table t
where t.id = Table.id and value = 2) c2
Outer Apply (select count(id) as Count3 from table t
where t.id = Table.id and value = 3) c3
Order by Id asc
Typed from my phone so may need to be tweaked a little but something like this should work

You could use DECODE if by "better" you mean more terse
WITH I (ID, V) AS (VALUES
(1,2)
, (1,3)
, (1,3)
, (1,2)
, (2,2)
, (2,3)
, (2,3)
, (3,3)
)
SELECT
ID
, COUNT(DECODE(V,2,1)) AS "Count2"
, COUNT(DECODE(V,3,1)) AS "Count3"
FROM I
GROUP BY
ID
returns
ID Count2 Count3
-- ------ ------
1 2 2
2 1 2
3 0 1

If you are using Db2 LUW 11.1.1.1 onward, you could take advantage of the fact that you can SUM BOOLEAN values.
WITH I (ID, V) AS (VALUES
(1,2)
, (1,3)
, (1,3)
, (1,2)
, (2,2)
, (2,3)
, (2,3)
, (3,3)
)
SELECT
ID
, SUM(V=2) AS "Count2"
, SUM(V=3) AS "Count3"
FROM I
GROUP BY
ID
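The boolean-sum idea can be tried without a Db2 instance. The sketch below uses SQLite through Python's sqlite3 module, where a comparison such as v = 2 evaluates to the integer 0 or 1; that is SQLite's behaviour and is assumed here only as a stand-in for Db2's BOOLEAN handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE i (id INTEGER, v INTEGER)")
conn.executemany(
    "INSERT INTO i (id, v) VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 3), (1, 2), (2, 2), (2, 3), (2, 3), (3, 3)],
)

# Each comparison yields 0 or 1 per row, so SUM counts the matching rows.
counts = conn.execute("""
    SELECT id,
           SUM(v = 2) AS count2,
           SUM(v = 3) AS count3
    FROM i
    GROUP BY id
    ORDER BY id
""").fetchall()
conn.close()

for row in counts:
    print(row)
```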

Find Common Rows for some Row Values in SQL

I have a table with an Id column and a SubId column, and a user-defined table type holding a list of SubIds. I want all the Ids that have every SubId present in my user-defined type. For example:
The table is:
ID SubID
1 2
1 3
1 4
2 3
2 4
2 2
3 3
3 2
and the data type is
CREATE TYPE SubIds AS TABLE
( SubId INT );
GO
With Value
SubID
3
4
I want the output to be
ID
1
2
Because only IDs 1 and 2 contain both SubIds 3 and 4.
Note: the combination of ID and SubID will always be unique, if that is of any use.

Let's assume that #s is your table of ids:
select t.ID
from t
Where t.SubId in (select SubId from #s)
group by t.Id
having count(*) = (select count(*) from #s);
This assumes that the two tables do not have duplicates. If duplicates are present, you can use:
select t.ID
from t
Where t.SubId in (select SubId from #s)
group by t.Id
having count(distinct t.SubId) = (select count(distinct s.SubId) from #s s);
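A minimal sketch of the duplicate-safe variant, run in an in-memory SQLite database from Python's sqlite3 module. An ordinary table named s stands in for the table-valued parameter, and the function name is hypothetical.

```python
import sqlite3

def ids_with_all_subids(rows, wanted):
    """Return ids whose sub-ids include every value in `wanted`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, subid INTEGER)")
    conn.execute("CREATE TABLE s (subid INTEGER)")
    conn.executemany("INSERT INTO t (id, subid) VALUES (?, ?)", rows)
    conn.executemany("INSERT INTO s (subid) VALUES (?)", [(w,) for w in wanted])
    query = """
        SELECT t.id
        FROM t
        WHERE t.subid IN (SELECT subid FROM s)   -- ignore sub-ids not asked for
        GROUP BY t.id
        HAVING COUNT(DISTINCT t.subid) = (SELECT COUNT(DISTINCT subid) FROM s)
        ORDER BY t.id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

# Sample rows from the question.
sample = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 2), (3, 3), (3, 2)]
matching = ids_with_all_subids(sample, [3, 4])
print(matching)
```

Because both sides count distinct values, a repeated entry in the wanted list (for example 3, 4, 4) does not change the result.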

Try this way
select ID
from yourtable
Where SubID in (3,4)
Group by ID
having Count(distinct SubID)=2
Another more flexible approach
select ID
from yourtable
Group by ID
having sum(case when SubID = 3 then 1 else 0 end) >= 1
and sum(case when SubID = 4 then 1 else 0 end) >= 1
If you want to pull SubId's from SubIds table type then,
SELECT ID
FROM yourtable T
JOIN (SELECT SubID,
Count(1) OVER() AS cnt
FROM SubIds) S
ON T.SubID = S.SubID
GROUP BY ID,Cnt
HAVING Count(DISTINCT T.SubID) = s.cnt
