Set contains other set in SQL - sql

Does anybody know how to find the sets that contain every element of some other set?
table
id elem
1 A
1 C
2 B
2 D
2 A
2 C
3 A
3 E
3 F
4 F
4 F
I want to get this:
id
2
3
In this case id 2 contains (A, C), the whole set of id 1,
and id 3 contains (F), the whole set of id 4.
My query needs to return every id that contains all elements of at least one other id.
I would be very grateful for any help. Thank you.

I didn't have much to go on for this one, but it was interesting, so I took a guess. I'm using SQL Server.
Follow along in Rextester
I give the answer, which uses nested selects, first; then I walk through the logic step by step to clarify what's going on:
Build a quick table
CREATE TABLE a (id int, t varchar(10))
INSERT INTO a (id, t)
VALUES (1,'A'),(1,'B'),(1,'B'),(1,'B'),(2,'A')
,(2,'B'),(2,'C'),(3,'C'),(3,'D'),(4,'A'),(5,'P');
Here is the solution:
select distinct p_id supersets from
(
select e.p_id, e.c_id, count(*) matches from (
select distinct c.id p_id, c.t p_t, d.id c_id, d.t c_t from a c
inner join a d on c.t = d.t and c.id <> d.id) e
group by e.p_id, e.c_id) sup
inner join
(select id, count(distinct t) reqs from a group by id) sub
on sub.id = sup.c_id and sup.matches = sub.reqs;
Here are the logical steps broken out to help explain why I'm doing what I'm doing:
--Step1
--Create a list of matches between (distinct) values where IDs are not the same
select distinct c.id p_id, c.t p_t, d.id c_id, d.t c_t from a c
inner join a d on c.t = d.t and c.id <> d.id;
--Step2
--Create a unique list of parent IDs and their child IDs and the # of distinct matches
--For example 2 has 2 matches with 1 and vice versa
select e.p_id, e.c_id, count(*) matches from (
select distinct c.id p_id, c.t p_t, d.id c_id, d.t c_t from a c
inner join a d on c.t = d.t and c.id <> d.id) e
group by e.p_id, e.c_id;
--Step2a
--Create a sub query to see how many distinct values are in each "Set"
select id, count(distinct t) reqs from a group by id;
Now we put it all together in the join (the full query above) to make sure the number of matches from parent to child covers 100% of the child's distinct values, i.e. the parent is a superset.
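To check the steps end to end, here is a minimal sketch that runs the same nested query against the sample rows from this answer. It uses Python's built-in sqlite3 module rather than SQL Server, which is my assumption for the sake of a self-contained example; the explicit AS keywords and the ORDER BY are additions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, t TEXT)")
conn.executemany(
    "INSERT INTO a (id, t) VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "B"), (1, "B"), (2, "A"),
     (2, "B"), (2, "C"), (3, "C"), (3, "D"), (4, "A"), (5, "P")],
)

QUERY = """
SELECT DISTINCT sup.p_id AS supersets
FROM (
    -- Steps 1 and 2: number of distinct matches per (parent, child) pair
    SELECT e.p_id, e.c_id, COUNT(*) AS matches
    FROM (
        SELECT DISTINCT c.id AS p_id, c.t AS p_t, d.id AS c_id, d.t AS c_t
        FROM a AS c
        INNER JOIN a AS d ON c.t = d.t AND c.id <> d.id
    ) AS e
    GROUP BY e.p_id, e.c_id
) AS sup
INNER JOIN (
    -- Step 2a: number of distinct values in each set
    SELECT id, COUNT(DISTINCT t) AS reqs FROM a GROUP BY id
) AS sub
    ON sub.id = sup.c_id AND sup.matches = sub.reqs
ORDER BY supersets
"""

result = [row[0] for row in conn.execute(QUERY)]
print(result)  # [1, 2]
```

With this data, id 1 {A, B} covers id 4 {A}, and id 2 {A, B, C} covers both id 1 and id 4, so ids 1 and 2 come back.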

I think the following does what you want:
select t1.id
from t t1 join
(select t.*, count(*) over (partition by id) as cnt
from t
) t2
on t1.elem = t2.elem and t1.id <> t2.id
group by t1.id, t2.id, t2.cnt
having count(*) = cnt;
This matches each id to each other id based on the elements. If the number of matches equals the count in the second set, then all match -- and you have a superset.
I notice that you have duplicate elements. Let's handle this with a CTE:
with t as (
select distinct id, elem
from t
)
select t1.id
from t t1 join
(select t.*, count(*) over (partition by id) as cnt
from t
) t2
on t1.elem = t2.elem and t1.id <> t2.id
group by t1.id, t2.id, t2.cnt
having count(*) = cnt;
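As a sanity check, the sketch below applies the same counting idea to the data from the question. It is written for Python's built-in sqlite3 module, which is an assumption on my part, and it replaces the window function with a separate per-id count (the CTE names d and sizes are mine) so that it also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, elem TEXT)")
conn.executemany(
    "INSERT INTO t (id, elem) VALUES (?, ?)",
    [(1, "A"), (1, "C"),
     (2, "B"), (2, "D"), (2, "A"), (2, "C"),
     (3, "A"), (3, "E"), (3, "F"),
     (4, "F"), (4, "F")],
)

SUPERSET_SQL = """
WITH d AS (                      -- de-duplicated (id, elem) pairs
    SELECT DISTINCT id, elem FROM t
),
sizes AS (                       -- number of distinct elements per id
    SELECT id, COUNT(*) AS cnt FROM d GROUP BY id
)
SELECT DISTINCT sup.id
FROM d AS sup
JOIN d AS sub ON sub.elem = sup.elem AND sub.id <> sup.id
JOIN sizes AS s ON s.id = sub.id
GROUP BY sup.id, sub.id, s.cnt
HAVING COUNT(*) = s.cnt          -- every element of sub was matched in sup
ORDER BY sup.id
"""

supersets = [row[0] for row in conn.execute(SUPERSET_SQL)]
print(supersets)  # [2, 3]
```

The SELECT DISTINCT keeps an id from appearing more than once when it is a superset of several other ids.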

Related

How to select groups which has only the values we want and not select it if it also has other values in SQL

id code value
A cod 2
A buy 34
A cod 4
B cod 44
B F 23
C thk 45
C cod 33
C F 31
D cod 22
In this table, for example, I want the groups of id whose 'code' column contains ONLY cod and F, so the query should return the rows with id = B and nothing else. (Not id = C, because it also has 'thk' in code, and not id = D; the output should contain only ids that have exactly the two mentioned values.)
expected output
id code value
B cod 44
B F 23
You want all rows for an id for which no forbidden row exists:
select id, code, value
from mytable
where not exists
(
select null
from mytable forbidden_row
where forbidden_row.id = mytable.id
and forbidden_row.code not in ('cod', 'F')
);
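The sketch below runs that NOT EXISTS query on the sample table using Python's built-in sqlite3 module (my choice of engine, not the asker's). On this data it returns the rows for B and also the single row for D, because D has no forbidden code either. Since the question asks to exclude D, the second query is one possible variant, an assumption of mine, that additionally requires both codes to be present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, code TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, code, value) VALUES (?, ?, ?)",
    [("A", "cod", 2), ("A", "buy", 34), ("A", "cod", 4),
     ("B", "cod", 44), ("B", "F", 23),
     ("C", "thk", 45), ("C", "cod", 33), ("C", "F", 31),
     ("D", "cod", 22)],
)

# The query from the answer: keep ids that have no code outside the allowed list.
NO_FORBIDDEN_SQL = """
SELECT id, code, value
FROM mytable
WHERE NOT EXISTS (
    SELECT NULL
    FROM mytable AS forbidden_row
    WHERE forbidden_row.id = mytable.id
      AND forbidden_row.code NOT IN ('cod', 'F')
)
ORDER BY id, value
"""
no_forbidden = conn.execute(NO_FORBIDDEN_SQL).fetchall()

# Variant (assumption): also require that both 'cod' and 'F' occur for the id.
BOTH_CODES_SQL = """
SELECT id, code, value
FROM mytable
WHERE id IN (
    SELECT id
    FROM mytable
    GROUP BY id
    HAVING SUM(CASE WHEN code NOT IN ('cod', 'F') THEN 1 ELSE 0 END) = 0
       AND COUNT(DISTINCT code) = 2
)
ORDER BY id, value
"""
both_codes = conn.execute(BOTH_CODES_SQL).fetchall()

print(no_forbidden)  # rows for B and D
print(both_codes)    # rows for B only
```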
One approach uses nested queries:
SELECT ID,Code, value FROM (
select ID, Code,
(SELECT count(*) FROM TableA a where Code = 'cod' and a.ID = TableA.ID) Cod,
(SELECT count(*) FROM TableA a where Code = 'F' and a.ID = TableA.ID) F,
(SELECT count(*) FROM TableA a where Code not in ('F','cod') and a.ID = TableA.ID) Other,
Value
from TableA
) SOURCE
WHERE Cod <> 0 AND F <> 0 and Other = 0
We can achieve this using a CTE. Check this:
-- Split the two record category first, then check cod Or F condition.
WITH Count2 AS (
SELECT id
FROM YourTable
GROUP BY id
HAVING COUNT(id) = 2
),
codORF AS (
SELECT id, code, COUNT(id) FROM YourTable T1
LEFT JOIN Count2 T2 On T1.id = T2.id
WHERE code = 'cod' OR code = 'F'
GROUP BY id, code
Having COUNT(id) = 1
)
-- Finally to take all values
SELECT T1.*
FROM YourTable T1
INNER JOIN codORF T2 ON T1.id = T2.id
with main as (
select *, count(id) over(partition by id order by id) as total_rows
from sample
), next_and_before as (
select *,
COALESCE(lag(code) over(partition by id order by id),lead(code) over(partition by id order by id)) as before_next
from main where total_rows <= 2
)
select * from next_and_before
where lower(trim(concat(code,before_next)))in('codf','fcod','cod','f')
It's a bit of a hacky solution:
first you keep only the ids that have at most 2 rows; since an id can have a single row whose code is 'f' or 'cod', change the last part to in ('codf','fcod') if you don't want those
then, for each remaining row, you take the previous (or else the next) code within the same id and concatenate it with the current code
the where clause keeps a row only if that combination consists of 'cod' and 'f' alone, which filters out ids containing any other code

Efficiently get nearest id based on counting a column in one-to-many table

I have a relational table (one-to-many) and I need to efficiently compute the similarity between ids, given their associated items. The table looks something like this:
id item
1 A2231
1 A2134
2 A2134
2 B2313
...
What I need is the number of items that each pair of ids has in common:
a_id b_id count_items
1 2 1
1 3 0
2 1 1
...
I have written a query, but it's O(n²), and it fails because it runs out of spool space.
SELECT A.ID AS a_id, B.ID AS b_id, COUNT(B.item) AS count_items
FROM Tab AS A LEFT JOIN Tab AS B --same table
ON (A.item = B.item)
GROUP BY A.ID, B.ID
EDIT:
n_rows ~ 50MM
n_items ~ 100K
n_ids ~ 170K
id/item combinations are unique
Is there a way to accomplish this efficiently?
Thanks in advance!
I would start by just using an inner join:
SELECT A.ID, B.ID, COUNT(*) AS count_items
FROM Tab A JOIN
Tab B --same table
ON A.item = B.item
GROUP BY A.ID, B.ID;
Next, if your table has duplicates, then this might work:
with t as (
select distinct id, item
from tab
)
select a.id, b.id, count(*)
from t a join
t b
on a.item = b.item
group by a.id, b.id;
And finally, if you want all pairs of ids, including pairs with no items in common, then:
with t as (
select distinct id, item
from tab
)
select i1.id, i2.id, count(b.id)
from (select distinct id from tab) i1 cross join
(select distinct id from tab) i2 left join
t a
on a.id = i1.id left join
t b
on b.id = i2.id and a.item = b.item
group by i1.id, i2.id;
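Here is a small sketch of the all-pairs variant, run through Python's built-in sqlite3 module (an assumption; the question mentions spool space, so the real engine is something else). The row for id 3 and its item are made up for illustration, since the question elides the rest of the data, and the filter that drops self-pairs is my addition to match the expected output shown in the question. It demonstrates correctness only and says nothing about performance at 50MM rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO tab (id, item) VALUES (?, ?)",
    [(1, "A2231"), (1, "A2134"),
     (2, "A2134"), (2, "B2313"),
     (3, "Z0000")],  # hypothetical row: an id sharing no item with the others
)

PAIR_SQL = """
WITH t AS (SELECT DISTINCT id, item FROM tab),
     ids AS (SELECT DISTINCT id FROM tab)
SELECT i1.id AS a_id, i2.id AS b_id, COUNT(b.id) AS count_items
FROM ids AS i1
CROSS JOIN ids AS i2
LEFT JOIN t AS a ON a.id = i1.id
LEFT JOIN t AS b ON b.id = i2.id AND b.item = a.item
WHERE i1.id <> i2.id            -- assumption: self-pairs are not wanted
GROUP BY i1.id, i2.id
ORDER BY i1.id, i2.id
"""

pairs = conn.execute(PAIR_SQL).fetchall()
for row in pairs:
    print(row)
```

COUNT(b.id) counts only the rows where the second left join found a shared item, which is why pairs without common items show 0 instead of disappearing.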

SQL Select Between Two tables

I have these two tables:
T1:
ref || Name
===========
1 || A
2 || B
3 || C
4 || D
5 || E
And
T2:
ref || Name
===========
1 || w
2 || x
6 || y
7 || z
I need this result:
Name1 || Name2
==============
A || w
B || x
C || y
D || z
E || NULL
I mean some kind of full outer join on column ref that does not produce a NULL value until one of the tables runs out of records.
Rows with the same ref are joined first, and if the row counts of the tables are not equal, the remaining rows get NULLs.
UPD: here is how you can pair values by position across the two tables:
with t1_values as (
SELECT
name,
row_number() over (order by ref) as position
FROM #t1
),
t2_values as (
SELECT
name,
row_number() over (order by ref) as position
FROM #t2
)
SELECT
t1_values.name as name1,
t2_values.name as name2
FROM t1_values
left JOIN t2_values on t1_values.position = t2_values.position
This is very complicated. You want rows that match to match. Then you want unmatched rows to match unmatched rows by position, and then everything else.
with matches as (
select distinct t1.ref
from t1
where exists (select 1 from t2 where t2.ref = t1.ref)
),
tt1 as (
select t1.*, m.ref as match_ref,
row_number() over (partition by m.ref order by t1.ref) as alt_ref
from t1 left join
matches m
on t1.ref = m.ref
),
tt2 as (
select t2.*, m.ref as match_ref,
row_number() over (partition by m.ref order by t2.ref) as alt_ref
from t2 left join
matches m
on t2.ref = m.ref
)
select tt1.name, tt2.name
from tt1 left join
tt2
on tt1.match_ref = tt2.match_ref or
(tt1.match_ref is null and tt2.match_ref is null and tt1.alt_ref = tt2.alt_ref);
Here is the idea. For each row in both tables, add two new columns:
match_ref is ref when ref exists in the other table.
alt_ref is an enumerated column for the ref values that do not match.
Once you have these columns, it is possible to join the tables together, by first checking match_ref and then -- if that is not present -- checking alt_ref.
SQL Fiddle does not appear to be working for SQL Server. However, here is an identical Postgres version that does work. Here is a working version using SQL Server (this is identical to the Postgres version).
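The match_ref / alt_ref idea can also be illustrated outside the database. The following is only a sketch in plain Python, under the assumption that ref is unique within each table; pair_names is a hypothetical helper name, not something from the answer.

```python
from itertools import zip_longest

t1 = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E")]
t2 = [(1, "w"), (2, "x"), (6, "y"), (7, "z")]


def pair_names(left, right):
    """Pair rows on equal ref first, then pair the leftovers by position."""
    right_by_ref = dict(right)
    left_refs = {ref for ref, _ in left}
    # Rows whose ref exists on both sides (the match_ref case).
    matched = [(name, right_by_ref[ref])
               for ref, name in sorted(left) if ref in right_by_ref]
    # Remaining rows, enumerated in ref order (the alt_ref case).
    left_rest = [name for ref, name in sorted(left) if ref not in right_by_ref]
    right_rest = [name for ref, name in sorted(right) if ref not in left_refs]
    # zip_longest pads the shorter side with None, like NULL in the query.
    return matched + list(zip_longest(left_rest, right_rest))


result = pair_names(t1, t2)
for name1, name2 in result:
    print(name1, name2)
```

Unlike the left join in the query, zip_longest would also keep leftover rows from the second table if it had more unmatched rows than the first, which is closer to the full outer join the question describes.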
A look at your Question
Correct me if I am wrong, but you have two tables, each with a reference ID column and a name column, and you want to return a combined result set. You do not say distinct, though, so you might end up with extras.
...some kind of full outer join, on column ref, that will not
produce NULL value until there is not any record..
The priority of the JOIN is with the values that have same ref, and if row count of tables are
not equal, there are some NULL results
Actually, the result is not really deterministic. Anyways, this question is still too vague. However, I think you really just want rows matching columns to appear together and anything not....well, not. But return everything.
So, if you know which side is larger, try this:
WITH C AS (SELECT DENSE_RANK() OVER (PARTITION BY ref ORDER BY Name DESC) AS ROW_ID
, ref
, Name AS Name1
FROM T1)
SELECT C.Name1, B.Name2
FROM C
LEFT OUTER JOIN (SELECT DENSE_RANK() OVER (PARTITION BY ref ORDER BY Name DESC) AS ROW_ID
, ref
, Name AS Name2
FROM T2) B ON B.ROW_ID = C.ROW_ID AND B.ref = C.ref
Each of the ref columns have a distinct ID to attach with. It is done in consecutive order, so if there is still an issue, well, you can figure that logic out. But I'm sure this will help you tremendously get where you are wanting. :)
Thank you, everybody, and excuse me if I express myself badly at times; blame it on fatigue after hours of work.
I think I failed to explain what I meant, so I answered my own question. I know this is not the best answer and it is not optimized, but it works!
SELECT t.ref,
t.NAME AS n1,
t2.NAME AS n2
INTO #tbl1
FROM #T1 AS t
LEFT JOIN #T2 AS t2
ON t2.ref = t.ref
WHERE t2.ref IS NOT NULL
SELECT ROW_NUMBER() OVER(ORDER BY t.NAME) AS rn,
*
INTO #tbl2
FROM #T1 AS t
WHERE t.ref NOT IN (SELECT ref
FROM #tbl1)
SELECT ROW_NUMBER() OVER(ORDER BY t.NAME) AS rn,
*
INTO #tbl3
FROM #T2 AS t
WHERE t.ref NOT IN (SELECT ref
FROM #tbl1)
SELECT n1,
n2
FROM #tbl1
UNION
SELECT t1.NAME AS n1,
t2.NAME AS n2
FROM #tbl2 t1
FULL OUTER JOIN #tbl3 t2
ON t1.rn = t2.rn

MS access multiple query left join

I am creating a table in MS Access using multiple subqueries, such that the first subquery contains the whole set and the remaining ones are each filtered by some other condition. Here is my code:
SELECT a.id,
a.cnt AS Total_Count,
b.cnt AS Income_Count
INTO data
FROM (SELECT id,
Count(*) AS Cnt
FROM test
GROUP BY id) AS a
LEFT JOIN (SELECT id,
Count(*) AS Cnt
FROM test
WHERE inc > 25
GROUP BY id) AS b
ON a.id = b.id;
It works fine. But when I add another subquery, it stops working:
SELECT a.id,
a.cnt AS Total_Count,
b,
b.cnt AS Income_Count,
c.cnt AS EXP_Count
INTO data
FROM (SELECT id,
Count(*) AS Cnt
FROM test
GROUP BY id) AS a
LEFT JOIN (SELECT id,
Count(*) AS Cnt
FROM test
WHERE inc > 25
GROUP BY id) AS b
ON a.id = b.id
LEFT JOIN (SELECT id,
Count(*) AS Cnt
FROM test
WHERE exp > 25
GROUP BY id) AS c
ON a.id = c.id;
Can you please help me with this?
SELECT a.id,
a.cnt AS Total_Count,
b, <----------------- What is this? b is an alias for a table, not a field
b.cnt AS Income_Count,
c.cnt AS EXP_Count
INTO data
FROM (SELECT id,
Count(*) AS Cnt
FROM test
GROUP BY id) AS a
LEFT JOIN (SELECT id,
Count(*) AS Cnt
FROM test
WHERE inc > 25
GROUP BY id) AS b
ON a.id = b.id
LEFT JOIN (SELECT id,
Count(*) AS Cnt
FROM test
WHERE exp > 25
GROUP BY id) AS c
ON a.id = c.id;
Remove the 'b' or replace it with an actual field. Be careful: if you use b.id, you will run into issues from trying to create two columns with the same name 'id', so you will have to rename one of them. I ran this on SQL Server and it worked fine without the 'b'.

Can criteria be re-used in a T-SQL query?

I have a SQL query that looks something like this:
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a WHERE type_id = o.type_id AND id IN ('1, 2, 3 ... 1000')) AS count_a,
(SELECT COUNT(*) FROM b WHERE type_id = o.type_id AND id IN ('1, 2, 3 ... 1000')) AS count_b,
(SELECT COUNT(*) FROM c WHERE type_id = o.type_id AND id IN ('1, 2, 3 ... 1000')) AS count_c
FROM o
In the subqueries (count_a, count_b and count_c) the criteria specified in the IN clause are the same for each, but it's a REALLY long list of numbers (that aren't in fact sequential) and I'm concerned that:
a) I'm slowing the query down by making it too long
b) It's going to get too long and cause an error eventually
Is there a way to alias/reference that list of criteria (perhaps as a variable?) so that it can be re-used in each of the three places it appears in the query? Or am I worrying for nothing?
UPDATE
Given the suggestion of using a CTE, I have changed the query above to work like this for now:
WITH id_list AS (SELECT id FROM source WHERE id IN ('1, 2, 3 ... 1000'))
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a WHERE type_id = o.type_id AND id IN (SELECT id FROM id_list)) AS count_a,
(SELECT COUNT(*) FROM b WHERE type_id = o.type_id AND id IN (SELECT id FROM id_list)) AS count_b,
(SELECT COUNT(*) FROM c WHERE type_id = o.type_id AND id IN (SELECT id FROM id_list)) AS count_c
FROM o
This cuts the overall length of the query down to about a third of what it was, and although the DB appears to take a couple of milliseconds longer to execute the query, at least I won't run into an error caused by the query being too long.
QUESTION: Is there a quick way to break a comma separated list of numbers (1, 2, 3 ... 1000) into a result set that could be used as the CTE?
You could use a common-table-expression(CTE):
WITH Numbers AS
(
SELECT N
FROM (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 1000)
AS T(N)
)
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a WHERE type_id = o.type_id AND id IN (SELECT N FROM Numbers)) AS count_a,
(SELECT COUNT(*) FROM b WHERE type_id = o.type_id AND id IN (SELECT N FROM Numbers)) AS count_b,
(SELECT COUNT(*) FROM c WHERE type_id = o.type_id AND id IN (SELECT N FROM Numbers)) AS count_c
FROM o
If your ids are foreign keys into some other table, then you could make a table variable from your list, and then join into that repeatedly.
DECLARE @id_list TABLE (
id INT
)
INSERT INTO @id_list
SELECT id
FROM pk_location
WHERE id IN ('1, 2, 3 ... 1000')
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a INNER JOIN @id_list i ON a.id = i.id WHERE type_id = o.type_id) AS count_a,
(SELECT COUNT(*) FROM b INNER JOIN @id_list i ON b.id = i.id WHERE type_id = o.type_id) AS count_b,
(SELECT COUNT(*) FROM c INNER JOIN @id_list i ON c.id = i.id WHERE type_id = o.type_id) AS count_c
FROM o
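On the follow-up question about turning a comma-separated list into a result set: one option, if the query is issued from application code, is to split the string on the client and load the ids into a temporary table that each subquery joins to. The sketch below shows this with Python's built-in sqlite3 module rather than SQL Server; the table contents and the short raw_ids string are hypothetical stand-ins for the real data and the real list.

```python
import sqlite3

raw_ids = "1, 2, 3, 1000"  # hypothetical stand-in for the long list
ids = [int(part) for part in raw_ids.split(",") if part.strip()]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE o (name TEXT, type_id INTEGER);
CREATE TABLE a (id INTEGER, type_id INTEGER);
CREATE TABLE b (id INTEGER, type_id INTEGER);
CREATE TABLE c (id INTEGER, type_id INTEGER);
CREATE TEMP TABLE id_list (id INTEGER PRIMARY KEY);
-- Hypothetical sample rows, only to make the counts visible.
INSERT INTO o VALUES ('first', 10), ('second', 20);
INSERT INTO a VALUES (1, 10), (2, 10), (7, 10), (3, 20);
INSERT INTO b VALUES (1000, 20), (5, 20);
INSERT INTO c VALUES (9, 10);
""")
# Load the parsed ids once; every subquery below reuses the same table.
conn.executemany("INSERT INTO id_list (id) VALUES (?)", [(i,) for i in ids])

COUNT_SQL = """
SELECT
    o.name,
    o.type_id,
    (SELECT COUNT(*) FROM a JOIN id_list i ON a.id = i.id
      WHERE a.type_id = o.type_id) AS count_a,
    (SELECT COUNT(*) FROM b JOIN id_list i ON b.id = i.id
      WHERE b.type_id = o.type_id) AS count_b,
    (SELECT COUNT(*) FROM c JOIN id_list i ON c.id = i.id
      WHERE c.type_id = o.type_id) AS count_c
FROM o
ORDER BY o.name
"""

rows = conn.execute(COUNT_SQL).fetchall()
for row in rows:
    print(row)
```

Passing the ids as bound parameters keeps the SQL text short no matter how long the list grows, which addresses the concern about query length.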