Selecting both matching Rows from 2 tables - sql

I am linking two tables and I want matching rows from both tables displayed as separate rows in the output table.
Example:
Table 1
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
Table 2
2 WWW
4 XXX
5 YYY
6 ZZZ
7 UUU
Output:
2 BBB
2 WWW
4 DDD
4 XXX
5 EEE
5 YYY

There are many ways to do this (combining two sets into one), but the first that comes to mind is to use a UNION statement:
SELECT Column1, Column2
FROM Table1
WHERE Column1 IN (SELECT Column1 FROM Table2)
UNION ALL -- union all returns all rows including duplicates
-- union without all returns all but no duplicates
SELECT Column1, Column2
FROM Table2
WHERE Column1 IN (SELECT Column1 FROM Table1)
ORDER BY Column1, Column2
In the example above I used the IN operator with a subquery to determine which rows should be returned in each set. Another, potentially better-performing option is to use joins to limit the members of each set, like this:
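-- both tables have Column1 and Column2, so the selected columns must be qualified
SELECT Table1.Column1, Table1.Column2
FROM Table1
INNER JOIN Table2 ON Table1.Column1 = Table2.Column1
UNION ALL
SELECT Table2.Column1, Table2.Column2
FROM Table2
INNER JOIN Table1 ON Table1.Column1 = Table2.Column1
ORDER BY Column1, Column2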
The general concept with the UNION statement is that you form two similar sets (similar in that they have the same columns with the same data types) and then use the UNION operator to merge them into one, optionally including duplicate rows by adding ALL to the operator.
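To check both variants against the sample data from the question, here is a minimal sketch that runs them in an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite, the column types, and the explicit AS aliases in the join variant are assumptions made for this illustration; the question does not name a database engine.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Column1 INTEGER, Column2 TEXT);
    CREATE TABLE Table2 (Column1 INTEGER, Column2 TEXT);
    INSERT INTO Table1 VALUES (1,'AAA'),(2,'BBB'),(3,'CCC'),(4,'DDD'),(5,'EEE');
    INSERT INTO Table2 VALUES (2,'WWW'),(4,'XXX'),(5,'YYY'),(6,'ZZZ'),(7,'UUU');
""")

# Variant 1: restrict each set with IN, then stack the sets with UNION ALL.
in_query = """
    SELECT Column1, Column2 FROM Table1
    WHERE Column1 IN (SELECT Column1 FROM Table2)
    UNION ALL
    SELECT Column1, Column2 FROM Table2
    WHERE Column1 IN (SELECT Column1 FROM Table1)
    ORDER BY Column1, Column2
"""

# Variant 2: restrict each set with a join. The aliases give the ORDER BY
# unambiguous output column names (an assumption made for SQLite).
join_query = """
    SELECT Table1.Column1 AS Column1, Table1.Column2 AS Column2
    FROM Table1 INNER JOIN Table2 ON Table1.Column1 = Table2.Column1
    UNION ALL
    SELECT Table2.Column1 AS Column1, Table2.Column2 AS Column2
    FROM Table2 INNER JOIN Table1 ON Table1.Column1 = Table2.Column1
    ORDER BY Column1, Column2
"""

in_rows = conn.execute(in_query).fetchall()
join_rows = conn.execute(join_query).fetchall()
for row in in_rows:
    print(*row)
```

With unique key values, as in the sample, both variants return the same rows. If a key repeated in one table, the join variant would repeat the matching rows from the other table, while the IN variant would not.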

Related

Putting one SQL table on top of another using oracle SQL

This is probably a very simple question but I cannot find the solution. I have two tables with identical column names and I wish to put one on top of the other. I have tried UNION but this appears not to work and I get the error 'ORA-01790: expression must have same datatype as corresponding expression'
I am using Oracle SQL Developer to access the data.
table1 =
column1 column2 column3
1111111 2222222 3333333
aaaaaaa bbbbbbb ccccccc
table2 =
column1 column2 column3
9999999 8888888 7777777
zzzzzzz yyyyyyy xxxxxxx
desired output
column1 column2 column3
1111111 2222222 3333333
aaaaaaa bbbbbbb ccccccc
9999999 8888888 7777777
zzzzzzz yyyyyyy xxxxxxx
I have tried the following script to get it - any assistance would be appreciated.
select * from table1
union
select * from table2
The data type problem might be fixed by explicitly listing out all columns in the select clause. In addition, you should introduce a computed column which maintains the order of the two halves of the union in the output.
SELECT column1, column2, column3
FROM
(
SELECT column1, column2, column3, 1 AS src
FROM table1
UNION ALL
SELECT column1, column2, column3, 2
FROM table2
) t
ORDER BY src;
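The ordering effect of the src column can be tried out with a small sketch in Python's sqlite3 module. SQLite and the TEXT column types are assumptions for this illustration. SQLite does not enforce matching column types across a UNION, so this does not reproduce ORA-01790; it only shows that the rows of table1 come out ahead of the rows of table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 TEXT, column2 TEXT, column3 TEXT);
    CREATE TABLE table2 (column1 TEXT, column2 TEXT, column3 TEXT);
    INSERT INTO table1 VALUES ('1111111','2222222','3333333'),
                              ('aaaaaaa','bbbbbbb','ccccccc');
    INSERT INTO table2 VALUES ('9999999','8888888','7777777'),
                              ('zzzzzzz','yyyyyyy','xxxxxxx');
""")

# src tags each row with the half of the union it came from;
# ordering by it keeps table1's rows ahead of table2's rows.
stacked = conn.execute("""
    SELECT column1, column2, column3
    FROM (
        SELECT column1, column2, column3, 1 AS src FROM table1
        UNION ALL
        SELECT column1, column2, column3, 2 FROM table2
    ) t
    ORDER BY src
""").fetchall()

for row in stacked:
    print(*row)
```

The order of rows within each half is not fixed by ORDER BY src alone; a second sort key would be needed if that matters.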

How to get distinct count over multiple columns in SQL?

I have a table that looks like this. And I want to get the distinct count across the three columns.
ID  Column1  Column2  Column3
1   A        B        C
2   A        A        B
3   A        A
The desired output I'm looking for is:
ID  Column1  Column2  Column3  unique_count
1   A        B        C        3
2   A        A        B        2
3   A        A                 1
You want to use cross apply for this one.
select *
from t cross apply
(select count(distinct cnt) as unique_count
from (values(Column1),(Column2),(Column3)) t(cnt)) t2
ID  Column1  Column2  Column3  unique_count
1   A        B        C        3
2   A        A        B        2
3   A        A                 1
In standard SQL, you have to unpivot first, do the count(distinct) with a group by on the unpivoted result, and then pivot again.
In recent Oracle versions, you could write your own Polymorphic Table Function to do it, passing the table and the list of columns to count for DISTINCT values.
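A rough sketch of the unpivot-then-count idea, written with UNION ALL as the unpivot step and run in SQLite through Python's sqlite3 module. Several things here are assumptions for illustration: the table name t, the TEXT column types, the empty Column3 cell being NULL, and joining the counts back to the table instead of pivoting again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (ID INTEGER, Column1 TEXT, Column2 TEXT, Column3 TEXT);
    -- assumption: the empty Column3 cell of row 3 is NULL
    INSERT INTO t VALUES (1,'A','B','C'), (2,'A','A','B'), (3,'A','A',NULL);
""")

counts = conn.execute("""
    SELECT t.ID, t.Column1, t.Column2, t.Column3, u.unique_count
    FROM t
    JOIN (
        -- unpivot: one row per (ID, value), then count distinct values per ID
        SELECT ID, COUNT(DISTINCT val) AS unique_count
        FROM (
            SELECT ID, Column1 AS val FROM t
            UNION ALL
            SELECT ID, Column2 FROM t
            UNION ALL
            SELECT ID, Column3 FROM t
        ) AS unpivoted
        GROUP BY ID
    ) AS u ON u.ID = t.ID
    ORDER BY t.ID
""").fetchall()

for row in counts:
    print(row)
```

COUNT(DISTINCT ...) skips NULLs, which is why row 3 counts as 1 under the NULL assumption.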
Try whether this works:
SELECT Count(*)
FROM (SELECT DISTINCT Column1 FROM TABLENAME);

Query to get specific value between two table

I have two tables with different primary keys; let's call them table1 and table2.
The tables may have the same number of columns.
Table1:
ID NOM CODE
1 AAA 661YYYDD
2 BBB YYYD661
3 CCC YD661
4 DDD P5500Z
Table 2:
ID KEYCODE
1 661
2 55
I want to be able to select, by KEYCODE, all records in Table1 whose CODE contains 661 or 55. For example, when I select by 661 I should get only the first 3 rows from Table1.
This works as well:
SELECT *
FROM TABLE1
JOIN TABLE2
ON TABLE1.CODE LIKE '%'||TABLE2.KEYCODE||'%'
WHERE TABLE2.KEYCODE = '661'
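The substring join can be checked on the sample rows with a short sketch in Python's sqlite3 module. SQLite, the TEXT type for KEYCODE, and the helper name rows_for_keycode are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (ID INTEGER PRIMARY KEY, NOM TEXT, CODE TEXT);
    CREATE TABLE TABLE2 (ID INTEGER PRIMARY KEY, KEYCODE TEXT);
    INSERT INTO TABLE1 VALUES (1,'AAA','661YYYDD'), (2,'BBB','YYYD661'),
                              (3,'CCC','YD661'),    (4,'DDD','P5500Z');
    INSERT INTO TABLE2 VALUES (1,'661'), (2,'55');
""")

def rows_for_keycode(keycode):
    """Return the TABLE1 rows whose CODE contains the given KEYCODE (hypothetical helper)."""
    return conn.execute("""
        SELECT TABLE1.ID, TABLE1.NOM, TABLE1.CODE
        FROM TABLE1
        JOIN TABLE2 ON TABLE1.CODE LIKE '%' || TABLE2.KEYCODE || '%'
        WHERE TABLE2.KEYCODE = ?
        ORDER BY TABLE1.ID
    """, (keycode,)).fetchall()

print(rows_for_keycode("661"))
print(rows_for_keycode("55"))
```

Passing the key code as a bound parameter keeps the query text fixed. A leading-wildcard LIKE generally cannot use an ordinary index on CODE, so this pattern may be slow on large tables.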

Finding rows with same values in two columns

I have a table mytable with 3 columns in a Postgres database, and I want only the records whose value in the 2nd or 3rd column is duplicated in another row.
SQL> select * from mytable ;
column1 column2 column3
A 1 50 ---- required output, i.e. 50 is duplicated in row B, column2
B 50 3 ---- required output, i.e. 50 is duplicated in row A, column3
C 2 10 ---- values are not duplicated in other rows
D 30 70 ---- required output, i.e. 30 is duplicated in row E, column3
E 8 30 ---- required output, i.e. 30 is duplicated in row D, column2
F 40 25 ---- values are not duplicated in other rows
I want the following output with count(*):
column1 column2 column3
A 1 50
B 50 3
D 30 70
E 8 30
Here is an example of a self join to handle this:
select distinct m.*
from mytable m
inner join mytable m2
on (
m.column2 in (m2.column2, m2.column3)
or m.column3 in (m2.column2, m2.column3)
)
and m.column1 <> m2.column1
I would use exists:
select t.*
from mytable t
where exists (select 1
from mytable t2
where t2.column1 <> t.column1 and -- without this, every row matches itself
(t.column2 in (t2.column2, t2.column3) or
t.column3 in (t2.column2, t2.column3))
);
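A sketch of the EXISTS approach on the sample rows, run in SQLite through Python's sqlite3 module for convenience (the question is about Postgres, so the engine and the column types are assumptions). The comparison against t2.column1 is included so that a row cannot match itself; without it every row would qualify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (column1 TEXT, column2 INTEGER, column3 INTEGER);
    INSERT INTO mytable VALUES ('A',1,50), ('B',50,3), ('C',2,10),
                               ('D',30,70), ('E',8,30), ('F',40,25);
""")

dupes = conn.execute("""
    SELECT t.column1, t.column2, t.column3
    FROM mytable t
    WHERE EXISTS (
        SELECT 1
        FROM mytable t2
        WHERE t2.column1 <> t.column1          -- look only at other rows
          AND (t.column2 IN (t2.column2, t2.column3)
               OR t.column3 IN (t2.column2, t2.column3))
    )
    ORDER BY t.column1
""").fetchall()

for row in dupes:
    print(*row)
```

This assumes column1 identifies a row uniquely, as it does in the sample data.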

Get ID pairs between 2 tables with matching child records

I have 2 tables with the same structure.
FIELD 1 INT
FIELD 2 VARCHAR(32) -- is a MD5 Hash
The query has to get matching FIELD 1 pairs for records that have the exact same combination of FIELD 2 values in both TABLE 1 and TABLE 2.
These tables are pretty large (1 million records between the two) but are reduced down to an ID and a hash.
Example data:
TABLE 1
1 A
1 B
2 A
2 D
2 E
3 G
3 H
4 E
4 D
4 C
5 E
5 D
TABLE 2
8 A
8 B
9 E
9 D
9 C
10 F
11 G
11 H
12 B
12 D
13 A
13 B
14 E
14 A
The results of the query should be
8 1
9 4
11 3
13 1
I have tried creating a concatenated string of FIELD 2 using a correlated sub-query and FOR XML PATH string trick I read on here but that is very slow.
You can also try the following query:
SELECT t_2.Field_1, t_1.Field_1 --1
FROM table_1 t_1, table_2 t_2 --2
WHERE t_1.Field_2 = t_2.Field_2 --3
GROUP BY t_1.Field_1, t_2.Field_1 --4
HAVING COUNT(*) = (SELECT COUNT(*) --5
FROM Table_1 t_1_1 --6
WHERE t_1_1.Field_1 = t_1.Field_1) --7
AND COUNT(*) = (SELECT COUNT(*) --8
FROM Table_2 t_2_1 --9
WHERE t_2_1.Field_1 =t_2.Field_1) --10
Edit
The requested result is every combination of Field_1 values from the two tables whose sets of Field_2 values are exactly the same. The query above is one way to get it.
Here is how it works:
Lines 1 to 3 join the two tables on their Field_2 values.
Line 4 groups the joined rows by Field_1 from table_1 and Field_1 from table_2.
At this point you have every pair of Field_1 values (one from each table) that shares at least one Field_2 value.
What remains is to keep only the pairs whose Field_2 values are exactly the same, which can be done with a condition on the row counts.
My assumption here is that neither table contains the same Field_1 and Field_2 combination more than once, meaning rows such as the following do not appear in either table:
1 b
1 b
Under that assumption, the number of joined rows for a pair must equal both the number of rows table_1 holds for its Field_1 value and the number of rows table_2 holds for its Field_1 value.
The HAVING clause (lines 5 to 10) checks exactly this with its conditions on COUNT(*).
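The count-based check can be tried on the example data with a short sketch in Python's sqlite3 module. SQLite, the column types, the explicit JOIN syntax, and the added ORDER BY are assumptions for this illustration; the logic follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (Field_1 INTEGER, Field_2 TEXT);
    CREATE TABLE table_2 (Field_1 INTEGER, Field_2 TEXT);
    INSERT INTO table_1 VALUES (1,'A'),(1,'B'),(2,'A'),(2,'D'),(2,'E'),(3,'G'),
                               (3,'H'),(4,'E'),(4,'D'),(4,'C'),(5,'E'),(5,'D');
    INSERT INTO table_2 VALUES (8,'A'),(8,'B'),(9,'E'),(9,'D'),(9,'C'),(10,'F'),
                               (11,'G'),(11,'H'),(12,'B'),(12,'D'),(13,'A'),
                               (13,'B'),(14,'E'),(14,'A');
""")

pairs = conn.execute("""
    SELECT t_2.Field_1, t_1.Field_1
    FROM table_1 t_1
    JOIN table_2 t_2 ON t_1.Field_2 = t_2.Field_2
    GROUP BY t_1.Field_1, t_2.Field_1
    -- shared values must equal the total number of values on each side
    HAVING COUNT(*) = (SELECT COUNT(*) FROM table_1 a WHERE a.Field_1 = t_1.Field_1)
       AND COUNT(*) = (SELECT COUNT(*) FROM table_2 b WHERE b.Field_1 = t_2.Field_1)
    ORDER BY t_2.Field_1, t_1.Field_1
""").fetchall()

for row in pairs:
    print(*row)
```

As in the explanation, this relies on neither table repeating a Field_1 and Field_2 combination.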
Let me try to explain this version of the query:
select t1.field1 as t1field1, t2.field1 as t2field1
from (select t1.*,
count(*) over (partition by field1) as NumField2
from table1 t1
) t1 full outer join
(select t2.*,
count(*) over (partition by field1) as NumField2
from table2 t2
) t2
on t1.field2 = t2.field2
where t1.NumField2 = t2.NumField2
group by t1.Field1, t2.Field1
having count(t1.field2) = max(t1.NumField2) and
count(t2.field2) = max(t2.NumField2)
The idea is to compare the following counts for each pair of field1 values.
The number of field2 values on each.
The number of field2 values that they share.
All of these have to be equal.
Each subquery counts the number of values of field2 on each field1 value. For the first rows of your data, this produces:
1 A 2
1 B 2
2 A 3
2 D 3
2 E 3
. . .
And for the second table
8 A 2
8 B 2
9 E 3
9 D 3
9 C 3
Next, the full outer join is applied, requiring a match on both the count and the field2 value. This multiplies the data, producing rows such as:
1 A 2 8 A 2
1 B 2 8 B 2
2 A 3 NULL NULL NULL
2 D 3 9 D 3
2 E 3 9 E 3
NULL NULL NULL 9 C 3
And so on for all the possible combinations. Note that the NULLs appear due to the full outer join.
Note that when you have a pair, such as 1 and 8 that match, there are no rows with NULL values. When you have a pair with the same counts but they don't match, then you have NULL values. When you have a pair with different counts, they are filtered out by the where clause.
The filtering aggregation step applies these rules to get pairs that meet the first condition but not the second.
The having essentially removes any pair that has NULL values. When you count() a column, NULL values are not included. In that case, the count() on the column is fewer than the number of values expected (NumField2).
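The three-count comparison (values on each side, and values shared) can also be written out directly in plain Python, which may help in checking what the SQL is expected to return. This is an illustrative sketch only, not the window-function query itself; the helper name values_by_id is hypothetical, and the nested loop over all pairs would be too slow for tables of the size mentioned in the question.

```python
from collections import defaultdict

table1 = [(1, "A"), (1, "B"), (2, "A"), (2, "D"), (2, "E"), (3, "G"),
          (3, "H"), (4, "E"), (4, "D"), (4, "C"), (5, "E"), (5, "D")]
table2 = [(8, "A"), (8, "B"), (9, "E"), (9, "D"), (9, "C"), (10, "F"),
          (11, "G"), (11, "H"), (12, "B"), (12, "D"), (13, "A"),
          (13, "B"), (14, "E"), (14, "A")]

def values_by_id(rows):
    """Group the field2 values under their field1 value (hypothetical helper)."""
    grouped = defaultdict(set)
    for field1, field2 in rows:
        grouped[field1].add(field2)
    return grouped

sets1 = values_by_id(table1)
sets2 = values_by_id(table2)

matching_pairs = []
for id2, vals2 in sorted(sets2.items()):
    for id1, vals1 in sorted(sets1.items()):
        shared = len(vals1 & vals2)
        # count on each side and the shared count must all be equal
        if len(vals1) == len(vals2) == shared:
            matching_pairs.append((id2, id1))

print(matching_pairs)
```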