Find exactly equal rows in 2 tables, both in terms of value and number - sql

I have two tables, each with two fields (provinceid, cityid).
I want to find the provinceid values that have exactly the same set of cityid values in both tables.
For example, given these tables:
table1:
provinceid  cityid
1           1
1           2
2           3
2           4
3           6
table2:
provinceid  cityid
1           1
1           5
2           3
2           4
3           6
3           7
I want a query that returns only provinceid = 2 with cityid 3 and 4.
I tried this query and it works, but I would like a better one:
select t1.provinceid, t1.cityid
from t1
left join t2 on t1.provinceid=t2.provinceid and t1.cityid=t2.cityid
where t2.provinceid is not null and t2.cityid is not null
and t1.provinceid not in (select t2.provinceid
from t2
left join t1 on t1.provinceid=t2.provinceid and t1.cityid=t2.cityid
where t1.provinceid is not null and t1.cityid is not null)
thank you

Try this:
select t1.provinceid ,t1.cityid
from table1 t1 join table2 t2
on t1.provinceid=t2.provinceid
and t1.cityid=t2.cityid
and t1.provinceid in (
select distinct(t1.provinceid)
from
(select provinceid, count(provinceid) as cnt from table1 group by provinceid) as t1
cross join
(select provinceid ,count(provinceid) as cnt from table2 group by provinceid) as t2
where t1.cnt = t2.cnt);
Output:
provinceid  cityid
1           1
2           3
2           4

The simplest method for an exact match is to use string aggregation. The exact syntax varies by database, but in Standard SQL this looks like:
select t1.provinceid, t2.provinceid
from (select provinceid,
listagg(cityid, ',') within group (order by cityid) as cities
from t1
group by provinceid
) t1 join
(select provinceid,
listagg(cityid, ',') within group (order by cityid) as cities
from t2
group by provinceid
) t2
on t1.cities = t2.cities;
If you want the provinceids to be the same as well, just add t1.provinceid = t2.provinceid to the on clause.
Or, if you want the provinceids to be the same, you can use full join instead:
select provinceid
from t1 full join
t2
using (provinceid, cityid)
group by provinceid
having count(*) = count(t1.cityid) and count(*) = count(t2.cityid);
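Not every engine supports FULL JOIN or LISTAGG. As a rough sketch of the same exact-match test, the following uses only UNION ALL and GROUP BY, run here through Python's built-in sqlite3 module. The helper name `exact_match_provinces` and the in-memory tables are assumptions made for illustration; the data is the sample from the question.

```python
import sqlite3

# Portable variant of the exact-match check: count how often each
# (provinceid, cityid) pair occurs in each table, then keep only the
# provinces where every pair has the same count on both sides.
QUERY = """
SELECT provinceid
FROM (
    SELECT provinceid, cityid, SUM(in1) AS n1, SUM(in2) AS n2
    FROM (
        SELECT provinceid, cityid, 1 AS in1, 0 AS in2 FROM table1
        UNION ALL
        SELECT provinceid, cityid, 0 AS in1, 1 AS in2 FROM table2
    ) AS tagged
    GROUP BY provinceid, cityid
) AS counted
GROUP BY provinceid
HAVING SUM(CASE WHEN n1 <> n2 THEN 1 ELSE 0 END) = 0
ORDER BY provinceid
"""


def exact_match_provinces(rows1, rows2):
    """Hypothetical helper: load both row lists and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (provinceid INTEGER, cityid INTEGER)")
    con.execute("CREATE TABLE table2 (provinceid INTEGER, cityid INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", rows1)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", rows2)
    result = [row[0] for row in con.execute(QUERY)]
    con.close()
    return result


table1 = [(1, 1), (1, 2), (2, 3), (2, 4), (3, 6)]
table2 = [(1, 1), (1, 5), (2, 3), (2, 4), (3, 6), (3, 7)]
print(exact_match_provinces(table1, table2))  # prints [2]
```

Because the counts are compared per pair, a city that appears twice in one table and once in the other also counts as a mismatch.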

Besides matching on provid and cityid, we are also looking for exactly matching sets of records. There are many ways to do this. I prefer to compare the aggregated list of cities for each provid as a string, in addition to the provid and cityid match clause; this removes the pairs that exist in both tables but belong to a province whose rows do not match exactly.
WITH table1 AS(
SELECT 1 AS PROVID, 1 AS CITYID FROM DUAL UNION ALL
SELECT 1 AS PROVID, 2 AS CITYID FROM DUAL UNION ALL
SELECT 2 AS PROVID, 3 AS CITYID FROM DUAL UNION ALL
SELECT 2 AS PROVID, 4 AS CITYID FROM DUAL UNION ALL
SELECT 3 AS PROVID, 6 AS CITYID FROM DUAL
),
table2 AS (
SELECT 1 AS PROVID, 1 AS CITYID FROM DUAL UNION ALL
SELECT 1 AS PROVID, 5 AS CITYID FROM DUAL UNION ALL
SELECT 2 AS PROVID, 3 AS CITYID FROM DUAL UNION ALL
SELECT 2 AS PROVID, 4 AS CITYID FROM DUAL UNION ALL
SELECT 3 AS PROVID, 6 AS CITYID FROM DUAL UNION ALL
SELECT 3 AS PROVID, 7 AS CITYID FROM DUAL
),
listed_table1 AS (
SELECT
a.provid,
listagg(cityid,',') within GROUP (ORDER BY cityid) list_city
FROM table1 a
GROUP BY a.provid
),
listed_table2 AS (
SELECT
a.provid,
listagg(cityid,',') within GROUP (ORDER BY cityid) list_city
FROM table2 a
GROUP BY a.provid
)
SELECT
t1.provid, t1.cityid
FROM
(SELECT x.*, x1.list_city FROM table1 x, listed_table1 x1 WHERE x.provid = x1.provid) t1,
(SELECT y.*, y1.list_city FROM table2 y, listed_table2 y1 WHERE y.provid = y1.provid) t2
WHERE t1.provid = t2.provid AND t1.cityid = t2.cityid AND t1.list_city = t2.list_city
;
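To make the list-comparison idea concrete outside the database, here is a small Python sketch of the same logic. The function names `city_lists` and `exact_rows` are made up for this example; the aggregation only imitates what LISTAGG ... WITHIN GROUP (ORDER BY cityid) does in the query above.

```python
from collections import defaultdict


def city_lists(rows):
    """Map each provid to a comma-separated, sorted list of its cityids."""
    grouped = defaultdict(list)
    for provid, cityid in rows:
        grouped[provid].append(cityid)
    return {
        provid: ",".join(str(c) for c in sorted(cities))
        for provid, cities in grouped.items()
    }


def exact_rows(table1, table2):
    """Rows of table1 that also exist in table2 and whose province has an
    identical city list in both tables."""
    lists1 = city_lists(table1)
    lists2 = city_lists(table2)
    rows2 = set(table2)
    return [
        (provid, cityid)
        for provid, cityid in table1
        if (provid, cityid) in rows2 and lists1[provid] == lists2.get(provid)
    ]


table1 = [(1, 1), (1, 2), (2, 3), (2, 4), (3, 6)]
table2 = [(1, 1), (1, 5), (2, 3), (2, 4), (3, 6), (3, 7)]
print(exact_rows(table1, table2))  # prints [(2, 3), (2, 4)]
```

Rows (1, 1) and (3, 6) exist in both tables but are dropped, because the full city lists of provinces 1 and 3 differ.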

You can use (union ...) except (inner join ...) to detect non-matches. Step by step:
with u12 as (
select PROVID, CITYID from table1
union
select PROVID, CITYID from table2
),
c12 as (
select t1.PROVID, t2.CITYID
from table1 t1
join table2 t2 on t1.PROVID=t2.PROVID and t1.CITYID=t2.CITYID
),
nonMatch as (
select distinct PROVID
from (
select PROVID, CITYID from u12
except
select PROVID, CITYID from c12
) t
)
select *
from table1 t
where not exists (
select 1
from nonMatch n
where n.PROVID = t.PROVID);
If the number of duplicates matters, count them first:
with t1 as (
select PROVID, CITYID, count(*) n
from table1
group by PROVID, CITYID
),
t2 as (
select PROVID, CITYID, count(*) n
from table2
group by PROVID, CITYID
),
u12 as (
select PROVID, CITYID, n from t1
union
select PROVID, CITYID, n from t2
),
c12 as (
select t1.PROVID, t1.CITYID, t1.n
from t1
join t2 on t1.PROVID = t2.PROVID and t1.CITYID = t2.CITYID and t1.n = t2.n
),
nonMatch as (
select distinct PROVID
from (
select PROVID, CITYID, n from u12
except
select PROVID, CITYID, n from c12
) t
)
select *
from table1 t
where not exists (
select 1
from nonMatch n
where n.PROVID = t.PROVID)
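A more compact way to express the same non-match detection is a symmetric difference built from two EXCEPT queries. The sketch below runs it through Python's sqlite3 module on the question's sample data; the helper name `matching_provinces` is an assumption for illustration, and like the first query above it uses set semantics, so duplicate rows are ignored.

```python
import sqlite3

# A province is a non-match if it owns a row that exists in only one table.
QUERY = """
SELECT DISTINCT provid
FROM table1
WHERE provid NOT IN (
    SELECT provid FROM (
        SELECT provid, cityid FROM table1
        EXCEPT
        SELECT provid, cityid FROM table2
    ) AS only_in_1
    UNION
    SELECT provid FROM (
        SELECT provid, cityid FROM table2
        EXCEPT
        SELECT provid, cityid FROM table1
    ) AS only_in_2
)
ORDER BY provid
"""


def matching_provinces(rows1, rows2):
    """Hypothetical helper: load both row lists and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (provid INTEGER, cityid INTEGER)")
    con.execute("CREATE TABLE table2 (provid INTEGER, cityid INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", rows1)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", rows2)
    result = [row[0] for row in con.execute(QUERY)]
    con.close()
    return result


table1 = [(1, 1), (1, 2), (2, 3), (2, 4), (3, 6)]
table2 = [(1, 1), (1, 5), (2, 3), (2, 4), (3, 6), (3, 7)]
print(matching_provinces(table1, table2))  # prints [2]
```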

Related

SQL query to get both common and non-common data from 2 tables

Hi, I'm looking for a query which will give me both common and non-common data in one query.
Table 2
ID  Assay
1   124
Result
required_missing  required_present
125               124
Based on the req_ind column from table 1: if req_ind is 1 and the same assay is present in table 2, I want to list it as above.
The required_missing column can have multiple values.
With the data given, this gives the requested result:
WITH table1 as (
select 1 as ID, 123 as Assay, 0 as req_ind from dual
union all
select 2,124,1 from dual
union all
select 3,125,1 from dual
),
table2 as (
select 1 as ID, 124 as Assay from dual
),
required_missing as (
select
row_number() over (order by table1.Assay) as R,
table1.Assay as required_missing
from table1
left join table2 on table2.Assay = table1.Assay
where table1.req_ind=1 and table2.id is null
),
requires_present as (
select
row_number() over (order by table1.Assay) as R,
table1.Assay as required_present
from table1
left join table2 on table2.Assay = table1.Assay
where table1.req_ind=1 and table2.id is not null
),
results as (
select row_number() over (order by (id)) as r
from table1
)
select rm.required_missing, rp.required_present
from results
left join required_missing rm on rm.R = results.R
left join requires_present rp on rp.R = results.R
where rm.R is not null or rp.R is not null;
output:
REQUIRED_MISSING  REQUIRED_PRESENT
125               124
If you want to have a comma separated list for missing and for present then you can use:
SELECT LISTAGG(CASE WHEN t2.assay IS NULL THEN t1.assay END, ',')
WITHIN GROUP (ORDER BY t1.assay) AS required_missing,
LISTAGG(t2.assay, ',')
WITHIN GROUP (ORDER BY t1.assay) AS required_present
FROM table1 t1
LEFT OUTER JOIN table2 t2
ON (t1.assay = t2.assay)
WHERE t1.req_ind = 1
Which, for the sample data:
CREATE TABLE table1 (id, assay, req_ind) AS
SELECT 1, 123, 0 FROM DUAL UNION ALL
SELECT 2, 124, 1 FROM DUAL UNION ALL
SELECT 3, 125, 1 FROM DUAL UNION ALL
SELECT 4, 126, 1 FROM DUAL UNION ALL
SELECT 5, 127, 1 FROM DUAL;
CREATE TABLE table2 (id, assay) AS
SELECT 1, 124 FROM DUAL UNION ALL
SELECT 2, 127 FROM DUAL;
Outputs:
REQUIRED_MISSING  REQUIRED_PRESENT
125,126           124,127
If you want the output in multiple rows then:
SELECT required_missing,
required_present
FROM (
SELECT NVL2(t2.assay, 'P', 'M') AS status,
ROW_NUMBER() OVER (
PARTITION BY NVL2(t2.assay, 'P', 'M')
ORDER BY t1.assay
) AS rn,
t1.assay
FROM table1 t1
LEFT OUTER JOIN table2 t2
ON (t1.assay = t2.assay)
WHERE t1.req_ind = 1
)
PIVOT (
MAX(assay)
FOR status IN (
'M' AS required_missing,
'P' AS required_present
)
)
Which outputs:
REQUIRED_MISSING  REQUIRED_PRESENT
125               124
126               127
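PIVOT and NVL2 are Oracle-specific. On engines without them, the multi-row layout can be approximated with ROW_NUMBER() and conditional aggregation. The following is only a sketch, run through Python's sqlite3 module (window functions need SQLite 3.25 or newer); the helper name `missing_and_present` is made up for the example, and the data is the sample from the answer above.

```python
import sqlite3

# Number the missing ('M') and present ('P') assays separately, then put
# the n-th missing and the n-th present assay side by side on one row.
QUERY = """
SELECT MAX(CASE WHEN status = 'M' THEN assay END) AS required_missing,
       MAX(CASE WHEN status = 'P' THEN assay END) AS required_present
FROM (
    SELECT status, assay,
           ROW_NUMBER() OVER (PARTITION BY status ORDER BY assay) AS rn
    FROM (
        SELECT t1.assay AS assay,
               CASE WHEN t2.assay IS NULL THEN 'M' ELSE 'P' END AS status
        FROM table1 t1
        LEFT JOIN table2 t2 ON t1.assay = t2.assay
        WHERE t1.req_ind = 1
    ) AS flagged
) AS numbered
GROUP BY rn
ORDER BY rn
"""


def missing_and_present(table1_rows, table2_rows):
    """Hypothetical helper: load the sample rows and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (id INTEGER, assay INTEGER, req_ind INTEGER)")
    con.execute("CREATE TABLE table2 (id INTEGER, assay INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?)", table1_rows)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", table2_rows)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


table1 = [(1, 123, 0), (2, 124, 1), (3, 125, 1), (4, 126, 1), (5, 127, 1)]
table2 = [(1, 124), (2, 127)]
print(missing_and_present(table1, table2))  # prints [(125, 124), (126, 127)]
```

If one side has more entries than the other, the shorter column is padded with NULL (None in Python).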

Complex join in sql with top 10 row

Table1:
Id Word Frequency
1 A 1
2 B 5
Table2:
Id Word SecondWord SecondFrequency
1 A A1 1
2 A A2 5
3 A A3 10
4 A A4 9
5 A A5 20
6 B B1 5
7 B B2 8
8 B B3 50
9 B B4 40
10 B B5 68
Required output
Top 3 record from “Table2” with Order by SecondFrequency Desc
Ex.
Word Frequency SecondWord SecondFrequency
A 1 A5 20
A 1 A3 10
A 1 A4 9
B 5 B5 68
B 5 B3 50
B 5 B4 40
How can I get the desired output?
Use the ROW_NUMBER() function, ordered by SecondFrequency, to get the required result:
CREATE TABLE #Table1(Id TINYINT, Word VARCHAR(1),Frequency TINYINT)
CREATE TABLE #Table2(Id TINYINT, Word VARCHAR(1),SecondWord
VARCHAR(2),SecondFrequency TINYINT)
INSERT INTO #Table1(Id, Word ,Frequency)
SELECT 1,'A',1 UNION ALL
SELECT 2,'B',5
INSERT INTO #Table2(Id, Word ,SecondWord ,SecondFrequency)
SELECT 1,'A','A1',1 UNION ALL
SELECT 2,'A','A2',5 UNION ALL
SELECT 3,'A','A3',10 UNION ALL
SELECT 4,'A','A4',9 UNION ALL
SELECT 5,'A','A5',20 UNION ALL
SELECT 6,'B','B1',5 UNION ALL
SELECT 7,'B','B2',8 UNION ALL
SELECT 8,'B','B3',50 UNION ALL
SELECT 9,'B','B4',40 UNION ALL
SELECT 10,'B','B5',68
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY #Table1.Word ORDER BY
SecondFrequency DESC ) RNo ,#Table1.Word ,#Table1.Frequency,
SecondWord ,SecondFrequency
FROM #Table1
JOIN #Table2 ON #Table1.Word = #Table2.Word
) A
WHERE RNo BETWEEN 1 AND 3
You can use ROW_NUMBER(). It gives each row with the same Word a number based on its SecondFrequency; the numbering restarts whenever the Word changes.
;with cte as
(
select *, ROW_NUMBER() OVER (PARTITION BY Word ORDER BY SecondFrequency DESC) AS RowNumber from table2
)
select A.Word, B.Frequency, A.SecondWord, A.SecondFrequency
from cte A left join table1 B
on A.Word = B.Word
where A.RowNumber < 4
An inner join with ROW_NUMBER() will help in this case:
CREATE TABLE #Table1
(
Id INT
,Word VARCHAR(10)
,Frequency INT
)
INSERT INTO #Table1 SELECT 1,'A',1
UNION SELECT 2,'B',5
CREATE TABLE #Table2
(
Id INT
,Word VARCHAR(10)
,SecondWord VARCHAR(10)
,SecondFrequency INT
)
INSERT INTO #Table2 SELECT
1,'A','A1',1 UNION ALL SELECT
2,'A','A2',5 UNION ALL SELECT
3,'A','A3',10 UNION ALL SELECT
4,'A','A4',9 UNION ALL SELECT
5,'A','A5',20 UNION ALL SELECT
6,'B','B1',5 UNION ALL SELECT
7,'B','B2',8 UNION ALL SELECT
8,'B','B3',50 UNION ALL SELECT
9,'B','B4',40 UNION ALL SELECT
10,'B','B5',68
SELECT * FROM #Table1
SELECT * FROM #Table2
SELECT X.Word,X.Frequency,X.SecondWord,X.SecondFrequency
FROM
(SELECT T1.Word,T1.Frequency,T2.SecondWord,T2.SecondFrequency,ROW_NUMBER() OVER(PARTITION BY T1.WORD ORDER BY T2.SecondFrequency desc) as RN
FROM #Table1 T1
JOIN #Table2 T2
ON T1.Word = T2.Word
) AS X
WHERE X.RN<=3
Get the top 3 rows per Word from Table_2, then join Table_1.
The syntax is: ROW_NUMBER() OVER(PARTITION BY COL1 ORDER BY COL2) AS num
COL1 is the column to group by, COL2 is the column to sort by, and num is the resulting row number, which is used to limit the results.
SELECT t2.Word,
t1.Frequency,
t2.SecondWord,
t2.SecondFrequency
FROM
(SELECT *
FROM
(SELECT Word,
SecondWord,
SecondFrequency,
ROW_NUMBER() over(PARTITION BY Word
ORDER BY SecondFrequency DESC) AS num
FROM Table_2) T
WHERE T.num <= 3 ) t2
JOIN Table_1 AS t1 ON t2.Word = t1.Word
ORDER BY t2.Word, t2.SecondFrequency DESC;
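For readers who want to try the PARTITION BY / ORDER BY pattern without a SQL Server instance, here is a minimal sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The helper name `top_n_per_word` is an assumption for illustration; the tables and data are the ones from the question.

```python
import sqlite3

# Number the Table2 rows within each Word by descending SecondFrequency,
# then keep only the first n rows of every Word.
QUERY = """
SELECT Word, Frequency, SecondWord, SecondFrequency
FROM (
    SELECT t1.Word AS Word,
           t1.Frequency AS Frequency,
           t2.SecondWord AS SecondWord,
           t2.SecondFrequency AS SecondFrequency,
           ROW_NUMBER() OVER (
               PARTITION BY t1.Word
               ORDER BY t2.SecondFrequency DESC
           ) AS rn
    FROM Table1 t1
    JOIN Table2 t2 ON t1.Word = t2.Word
) AS ranked
WHERE rn <= ?
ORDER BY Word, rn
"""

TABLE1 = [(1, "A", 1), (2, "B", 5)]
TABLE2 = [
    (1, "A", "A1", 1), (2, "A", "A2", 5), (3, "A", "A3", 10),
    (4, "A", "A4", 9), (5, "A", "A5", 20), (6, "B", "B1", 5),
    (7, "B", "B2", 8), (8, "B", "B3", 50), (9, "B", "B4", 40),
    (10, "B", "B5", 68),
]


def top_n_per_word(table1_rows, table2_rows, n):
    """Hypothetical helper: load the sample rows and run QUERY with limit n."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Table1 (Id INTEGER, Word TEXT, Frequency INTEGER)")
    con.execute(
        "CREATE TABLE Table2 (Id INTEGER, Word TEXT, SecondWord TEXT, "
        "SecondFrequency INTEGER)"
    )
    con.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", table1_rows)
    con.executemany("INSERT INTO Table2 VALUES (?, ?, ?, ?)", table2_rows)
    rows = con.execute(QUERY, (n,)).fetchall()
    con.close()
    return rows


for row in top_n_per_word(TABLE1, TABLE2, 3):
    print(row)
```

Note that ROW_NUMBER() breaks ties arbitrarily; if two rows of a Word share the same SecondFrequency at the cut-off, only one of them is kept.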

SQL query to return data only if ALL necessary columns are present and not NULL

ID | Type | total
1 Purchase 12
1 Return 2
1 Exchange 5
2 Purchase null
2 Return 5
2 Exchange 1
3 Purchase 34
3 Return 4
3 Exchange 2
4 Purchase 12
4 Exchange 2
Above is sample data. What I want to return is:
ID | Type | total
1 Purchase 12
1 Return 2
1 Exchange 5
3 Purchase 34
3 Return 4
3 Exchange 2
So if a field is null in total or the values of Purchase, Return and Exchange are not all present for that ID, ignore that ID completely. How can I go about doing this?
You can use exists. I think you intend:
select t.*
from t
where exists (select 1
from t t2
where t2.id = t.id and t2.type = 'Purchase' and t2.total is not null
) and
exists (select 1
from t t2
where t2.id = t.id and t2.type = 'Exchange' and t2.total is not null
) and
exists (select 1
from t t2
where t2.id = t.id and t2.type = 'Return' and t2.total is not null
);
There are ways to "simplify" this:
select t.*
from t
where 3 = (select count(distinct t2.type)
from t t2
where t2.id = t.id and
t2.type in ('Purchase', 'Exchange', 'Return') and
t2.total is not null
);
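The count-based check can also be written with GROUP BY and HAVING instead of a correlated subquery. The sketch below is one possible form, run through Python's sqlite3 module on the question's sample data; the helper name `complete_ids_rows` is invented for the example, and the table name t follows the queries above.

```python
import sqlite3

# An id qualifies only if all three types occur with a non-null total.
QUERY = """
SELECT id, type, total
FROM t
WHERE id IN (
    SELECT id
    FROM t
    GROUP BY id
    HAVING COUNT(DISTINCT CASE
               WHEN type IN ('Purchase', 'Return', 'Exchange')
                    AND total IS NOT NULL
               THEN type END) = 3
)
ORDER BY id, type
"""

SAMPLE = [
    (1, "Purchase", 12), (1, "Return", 2), (1, "Exchange", 5),
    (2, "Purchase", None), (2, "Return", 5), (2, "Exchange", 1),
    (3, "Purchase", 34), (3, "Return", 4), (3, "Exchange", 2),
    (4, "Purchase", 12), (4, "Exchange", 2),
]


def complete_ids_rows(rows):
    """Hypothetical helper: load the rows into table t and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, type TEXT, total INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for row in complete_ids_rows(SAMPLE):
    print(row)
```

ID 2 is dropped because its Purchase total is NULL, and ID 4 because it has no Return row.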
I would write this as a join, without subqueries:
SELECT pur.id, pur.total AS Purchase, exc.total AS Exchange, ret.total AS Return
FROM MyTable as pur
INNER JOIN MyTable AS exc ON exc.id=pur.id AND exc.type='Exchange'
INNER JOIN MyTable AS ret ON ret.id=pur.id AND ret.type='Return'
WHERE pur.type='Purchase'
The inner join means that if any of the three rows with different values are not found for a given id, then no row is included in the result.
Analytic functions are a good way to solve this kind of problem. The base table is read just once, and no joins (explicit or implicit, as in EXISTS conditions or correlated subqueries) are needed.
In the solution below, we count distinct values of 'Purchase', 'Exchange' and 'Return' for each id while ignoring other values (assuming that is indeed the requirement), and separately count total nulls in the total column for each id. Then it becomes a trivial matter to select just the "desired" rows in an outer query.
with
test_data ( id, type, total ) as (
select 1, 'Purchase', 12 from dual union all
select 1, 'Return' , 2 from dual union all
select 1, 'Exchange', 5 from dual union all
select 2, 'Purchase', null from dual union all
select 2, 'Return' , 5 from dual union all
select 2, 'Exchange', 1 from dual union all
select 3, 'Purchase', 34 from dual union all
select 3, 'Return' , 4 from dual union all
select 3, 'Exchange', 2 from dual union all
select 4, 'Purchase', 12 from dual union all
select 4, 'Exchange', 2 from dual
)
-- end of test data; actual solution (SQL query) begins below this line
select id, type, total
from ( select id, type, total,
count( distinct case when type in ('Purchase', 'Return', 'Exchange')
then type end
) over (partition by id) as ct_type,
count( case when total is null then 1 end
) over (partition by id) as ct_total
from test_data
)
where ct_type = 3 and ct_total = 0
;
Output:
ID TYPE TOTAL
-- -------- -----
1 Exchange 5
1 Purchase 12
1 Return 2
3 Exchange 2
3 Purchase 34
3 Return 4
This should also work fine even if new values are added to the type column:
select * from t where
ID not in(select ID from t where
t.total is null or t.[Type] is null)

join 2 tables with SQL

I have to join 2 tables with SQL in a special way:
TABLE1 has the fields GROUP and MEMBER, TABLE2 has the fields GROUP and MASTER.
I have to build a new TABLE3 with the fields GROUP and ID: copy all rows of TABLE1 into TABLE3, and for every GROUP in TABLE1 that also appears in TABLE2, copy that GROUP and its MASTER into TABLE3 as well.
Example:
table1:
group member
1 a
1 b
1 c
2 x
3 y
table2:
group master
3 n
3 z
1 k
9 v
2 m
7 o
8 p
Expected result, table3:
group id
1 a from table1
1 b from table1
1 c from table1
1 k from table2
2 x from table1
2 m from table2
3 y from table1
3 z from table2
3 n from table2
I hope everything's clear.
So what is the SQL query?
Thanks, Hein
The first part (copy members) should be easy:
INSERT INTO table3 (group, id) SELECT group, member FROM table1;
Then you just copy the masters whose groups are already present in table1:
INSERT INTO table3 (group, id) SELECT group, master FROM table2 WHERE group IN (SELECT DISTINCT group FROM table1);
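As a quick way to check the two-step copy, here is a sketch using Python's sqlite3 module with the sample data from the question. GROUP is a reserved word in SQL, so the sketch assumes the column is named grp instead; that name and the helper `build_table3` are illustrative choices, not part of the original schema.

```python
import sqlite3

TABLE1 = [(1, "a"), (1, "b"), (1, "c"), (2, "x"), (3, "y")]
TABLE2 = [(3, "n"), (3, "z"), (1, "k"), (9, "v"), (2, "m"), (7, "o"), (8, "p")]


def build_table3(members, masters):
    """Hypothetical helper: fill table3 in two steps and return its rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (grp INTEGER, member TEXT)")
    con.execute("CREATE TABLE table2 (grp INTEGER, master TEXT)")
    con.execute("CREATE TABLE table3 (grp INTEGER, id TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", members)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", masters)

    # Step 1: copy every member.
    con.execute("INSERT INTO table3 (grp, id) SELECT grp, member FROM table1")
    # Step 2: copy only the masters whose group already exists in table1.
    con.execute(
        "INSERT INTO table3 (grp, id) "
        "SELECT grp, master FROM table2 "
        "WHERE grp IN (SELECT DISTINCT grp FROM table1)"
    )

    rows = con.execute("SELECT grp, id FROM table3 ORDER BY grp, id").fetchall()
    con.close()
    return rows


for row in build_table3(TABLE1, TABLE2):
    print(row)
```

Groups 7, 8 and 9 exist only in table2, so their masters are not copied.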
Try this out. Of course you need to INSERT the whole selection to your new table named Table3.
WITH TABLE1(GRP,MMBR) AS
(SELECT 1, 'a' FROM DUAL UNION ALL
SELECT 1, 'b' FROM DUAL UNION ALL
SELECT 1, 'c' FROM DUAL UNION ALL
SELECT 2, 'x' FROM DUAL UNION ALL
SELECT 3, 'y' FROM DUAL),
TABLE2(GRP,MSTR) AS
(SELECT 3, 'n' FROM DUAL UNION ALL
SELECT 3, 'z' FROM DUAL UNION ALL
SELECT 1, 'k' FROM DUAL UNION ALL
SELECT 9, 'v' FROM DUAL UNION ALL
SELECT 2, 'm' FROM DUAL UNION ALL
SELECT 7, 'o' FROM DUAL UNION ALL
SELECT 8, 'p' FROM DUAL)
SELECT * FROM (
SELECT GRP, MMBR ID FROM TABLE1
UNION --UNION ALL if you need duplicates
SELECT GRP, MSTR ID FROM TABLE2
WHERE TABLE2.GRP IN (SELECT GRP FROM TABLE1)
)
ORDER BY GRP, ID
You can do it using UNION ALL and 2 simple SELECTs in an INSERT as follows:
INSERT INTO table3(group,id)
SELECT group,member FROM table1
UNION ALL
SELECT group,master FROM table2;
SELECT * FROM table3;
And if you don't want duplicate values, try this using UNION instead of UNION ALL:
INSERT INTO table3(group,id)
SELECT group,member FROM table1
UNION
SELECT group,master FROM table2;
SELECT * FROM table3;

Max rows by group

Current SQL:
select t1.*
from table t1
where t1.id in ('2', '3', '4')
Current results:
id | seq
---+----
3 | 5
2 | 7
2 | 5
3 | 7
4 | 3
Attempt to select maxes:
select t1.*
from table t1
where t1.id in ('2', '3', '4')
and t1.seq = (select max(t2.seq)
from table2 t2
where t2.id = t1.id)
This obviously does not work since I'm using an in list. How can I adjust my SQL to get these expected results:
id | seq
---+----
2 | 7
3 | 7
4 | 3
Group By is your friend:
SELECT
id,
MAX(seq) seq
FROM TABLE
GROUP BY id
EDIT: Response to comment. To get the rest of the data from the table matching the max seq and id just join back to the table:
SELECT t1.*
FROM TABLE t1
INNER JOIN (
SELECT
id,
MAX(seq) as seq
FROM TABLE
GROUP BY id
) as t2
on t1.id = t2.id
and t1.seq = t2.seq
EDIT: Gordon and Jean-Francois are correct: you can also use the ROW_NUMBER() analytic function to get the same result. You need to check the performance difference for your application (I did not check). Here is an example of that:
SELECT *
FROM (
SELECT ROW_NUMBER() OVER (
PARTITION BY id
ORDER BY seq DESC) as row_num
,*
FROM TABLE
) as TMP
WHERE row_num = 1
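The join-back variant can be tried out with Python's sqlite3 module. This is only a sketch: TABLE is a reserved word, so the table is assumed to be called seq_table here, and the helper name `max_rows` is likewise invented for the example. The data is the question's current result set.

```python
import sqlite3

# Find MAX(seq) per id, then join back to fetch the full matching rows.
QUERY = """
SELECT t1.id, t1.seq
FROM seq_table t1
JOIN (
    SELECT id, MAX(seq) AS seq
    FROM seq_table
    GROUP BY id
) AS t2
  ON t1.id = t2.id
 AND t1.seq = t2.seq
WHERE t1.id IN (2, 3, 4)
ORDER BY t1.id
"""

SAMPLE = [(3, 5), (2, 7), (2, 5), (3, 7), (4, 3)]


def max_rows(rows):
    """Hypothetical helper: load the rows into seq_table and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE seq_table (id INTEGER, seq INTEGER)")
    con.executemany("INSERT INTO seq_table VALUES (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


print(max_rows(SAMPLE))  # prints [(2, 7), (3, 7), (4, 3)]
```

Unlike the ROW_NUMBER() form, the join-back returns every row that ties for the maximum seq of an id.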
This SQL query will give you the max seq for each individual ID.
SELECT t1.*
FROM t1
WHERE t1.id in ('2', '3', '4')
AND NOT EXISTS (
SELECT *
FROM t1 t2
WHERE t2.id = t1.id
AND t2.seq > t1.seq
);
select *
from table
where (id,seq) in
(
select id,max(seq)
from table
group by id
having id in ('2','3','4')
);
That is if id and/or seq are completely part of the PK of that table.
Here's another example, using the first/last method I mentioned earlier in the comments:
with sd as (select 3 id, 5 seq, 1 dummy from dual union all
select 2 id, 7 seq, 2 dummy from dual union all
select 2 id, 5 seq, 3 dummy from dual union all
select 3 id, 7 seq, 4 dummy from dual union all
select 3 id, 7 seq, 5 dummy from dual union all
select 4 id, 3 seq, 6 dummy from dual)
select id,
max(seq) max_seq,
max(dummy) keep (dense_rank first order by seq desc) max_rows_dummy
from sd
group by id;
ID MAX_SEQ MAX_ROWS_DUMMY
---------- ---------- --------------
2 7 2
3 7 5
4 3 6
The keep (dense_rank first order by ...) bit is requesting to keep the values associated with the rank of 1 in the order list of rows. The max(...) bit is there in case more than one row has a rank of 1; it's just a way of breaking ties.
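To illustrate what KEEP (DENSE_RANK FIRST ...) computes, here is the same logic spelled out in plain Python on the sample rows above. The function name `max_seq_with_dummy` is made up for this sketch; it is an imitation of the semantics, not Oracle's implementation.

```python
def max_seq_with_dummy(rows):
    """For each id, return (max seq, max dummy among rows having that seq)."""
    best = {}
    for id_, seq, dummy in rows:
        if id_ not in best or seq > best[id_][0]:
            # New highest seq for this id: restart with this row's dummy.
            best[id_] = (seq, dummy)
        elif seq == best[id_][0]:
            # Tie on the highest seq: max(...) breaks the tie.
            best[id_] = (seq, max(dummy, best[id_][1]))
    return {id_: best[id_] for id_ in sorted(best)}


sd = [(3, 5, 1), (2, 7, 2), (2, 5, 3), (3, 7, 4), (3, 7, 5), (4, 3, 6)]
print(max_seq_with_dummy(sd))  # prints {2: (7, 2), 3: (7, 5), 4: (3, 6)}
```

For id 3 two rows share seq 7, so the larger dummy (5) wins, which matches the MAX_ROWS_DUMMY column in the output above.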