Pairwise intersection counts of multiple columns in SQL

Say I have a SQL db with multiple tables, and each of these tables has a column code. What I want to produce is a table showing the number of codes shared between each pair of code columns. I.e. a pairwise intersection count plot:
        table1  table2  table3  table4
table1  10      2       3       5
table2  2       10      4       1
table3  3       4       10      2
table4  5       1       2       10
I can get each value individually using e.g.
WITH abc AS
(
SELECT code FROM table1
INTERSECT
SELECT code FROM table2
)
SELECT COUNT(*) AS table1Xtable2 FROM abc
But is there a query that will generate the entire output as desired as a table?

The following gets all combinations among the tables:
select t1, t2, t3, t4, count(*)
from (select code, max(t1) as t1, max(t2) as t2, max(t3) as t3, max(t4) as t4
from ((select code, 1 as t1, 0 as t2, 0 as t3, 0 as t4
from table1
) union all
(select code, 0 as t1, 1 as t2, 0 as t3, 0 as t4
from table2
) union all
(select code, 0 as t1, 0 as t2, 1 as t3, 0 as t4
from table3
) union all
(select code, 0 as t1, 0 as t2, 0 as t3, 1 as t4
from table4
)
) t
group by code
) t
group by t1, t2, t3, t4;
For your particular problem, you can use:
with t as (
select code, 'table1' as tablename from table1 union all
select code, 'table2' as tablename from table2 union all
select code, 'table3' as tablename from table3 union all
select code, 'table4' as tablename from table4
)
select t1.tablename,
sum(case when t2.tablename = 'table1' then 1 else 0 end) as t1,
sum(case when t2.tablename = 'table2' then 1 else 0 end) as t2,
sum(case when t2.tablename = 'table3' then 1 else 0 end) as t3,
sum(case when t2.tablename = 'table4' then 1 else 0 end) as t4
from t t1 join
t t2
on t1.code = t2.code
group by t1.tablename
Note that the above assumes that code is unique in the tables. If it is duplicated, you can replace union all with union.
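To check the self-join idea on a small case, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The sample codes, the output column names (in_table1 ... in_table4) and the aliases a and b are made up for illustration; union is used instead of union all so the duplicated code in table1 is only counted once.

```python
import sqlite3

# Hypothetical sample data; table1 deliberately repeats the code 'a'.
SAMPLE = {
    "table1": ["a", "a", "b", "c"],
    "table2": ["b", "c", "d"],
    "table3": ["c", "d", "e"],
    "table4": ["a", "e"],
}

# Tag every code with its source table, then self-join on code and
# count how often each pair of table names meets.
QUERY = """
with t as (
    select code, 'table1' as tablename from table1 union
    select code, 'table2' as tablename from table2 union
    select code, 'table3' as tablename from table3 union
    select code, 'table4' as tablename from table4
)
select a.tablename,
       sum(case when b.tablename = 'table1' then 1 else 0 end) as in_table1,
       sum(case when b.tablename = 'table2' then 1 else 0 end) as in_table2,
       sum(case when b.tablename = 'table3' then 1 else 0 end) as in_table3,
       sum(case when b.tablename = 'table4' then 1 else 0 end) as in_table4
from t a join t b on a.code = b.code
group by a.tablename
order by a.tablename
"""


def pairwise_counts():
    conn = sqlite3.connect(":memory:")
    for name, codes in SAMPLE.items():
        conn.execute(f"create table {name} (code text)")
        conn.executemany(f"insert into {name} values (?)", [(c,) for c in codes])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


matrix = pairwise_counts()
for row in matrix:
    print(row)
```

The diagonal holds the number of distinct codes per table, and the matrix is symmetric, which is a quick sanity check on real data.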

Related

SQL join two tables that have the same columns, with an overlapping `id` column, but merge based on if table1.col1 >= table2.col1

I want to join two tables that have the same columns, with an overlapping id column, but merge based on if table1.col1 >= table2.col1. This is in SQL.
If table1.col1>=table2.col1, use the columns from table1.
If table1.col1< table2.col1, then use columns from table2.
If the id does not exist in table1 but exists in table2, use the columns from table2
If the id does not exist in table2 but exists in table1, use the columns from table1
For example:
Table1:
id  col1  col2  col3
A   3     5     4
B   1     2     3
C   8     9     7
Table2:
id  col1  col2  col3
A   2     5     6
B   5     7     8
D   2     3     4
I want the result to be:
id  col1  col2  col3
A   3     5     4
B   5     7     8
C   8     9     7
D   2     3     4
I have tried union, full outer join, and CASE statements, but am stuck
I think individual case expressions for each column might be best:
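select id,
-- t1.col1 is null when the id exists only in t2 (this assumes col1 itself is never null)
(case when t1.col1 is null or t1.col1 < t2.col1 then t2.col1 else t1.col1 end) as col1,
(case when t1.col1 is null or t1.col1 < t2.col1 then t2.col2 else t1.col2 end) as col2,
(case when t1.col1 is null or t1.col1 < t2.col1 then t2.col3 else t1.col3 end) as col3
from t1 full join
t2
using (id);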
If that is cumbersome, another approach uses not exists:
select t1.*
from t1
where not exists (select 1
from t2
where t2.id = t1.id and t2.col1 > t1.col1
)
union all
select t2.*
from t2
where not exists (select 1
from t1
where t2.id = t1.id and t1.col1 >= t2.col1
);
Another solution:
SELECT DISTINCT ON (id) *
FROM (
SELECT *
FROM table1
UNION ALL
SELECT *
FROM table2
) AS aux
ORDER BY id, col1 DESC;
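I tested this in PostgreSQL. Note that DISTINCT ON is PostgreSQL-specific, and when both tables have the same col1 for an id, this ordering does not say which of the two rows is kept.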
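For anyone who wants to try the not exists variant without a full database server, here is a small sketch using Python's sqlite3 module with the sample rows from the question. The table names t1 and t2 follow the answer; the final order by 1 is only added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1 (id text, col1 int, col2 int, col3 int);
create table t2 (id text, col1 int, col2 int, col3 int);
insert into t1 values ('A', 3, 5, 4), ('B', 1, 2, 3), ('C', 8, 9, 7);
insert into t2 values ('A', 2, 5, 6), ('B', 5, 7, 8), ('D', 2, 3, 4);
""")

# Keep a t1 row unless t2 has a strictly larger col1 for the same id;
# keep a t2 row unless t1 has an equal or larger col1 for the same id.
merged = conn.execute("""
select t1.*
from t1
where not exists (select 1 from t2
                  where t2.id = t1.id and t2.col1 > t1.col1)
union all
select t2.*
from t2
where not exists (select 1 from t1
                  where t1.id = t2.id and t1.col1 >= t2.col1)
order by 1
""").fetchall()
conn.close()

for row in merged:
    print(row)
```

Because the first branch uses > and the second uses >=, a tie on col1 goes to t1, which matches the rule in the question.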

Aggregate functions as column results from multiple tables

I have the following table structures:
Table1
--------------
Table1Id
Field1
Field2
Table2
------------
Table2Id
Table1Id
Field1
Field2
Table3
-----------
Table3Id
Table1Id
Field1
Field2
I need to be able to select all fields in Table1, count of records in Table2, and count of records in Table3 Where count of records in Table2 > count of records in Table3
Here is an example of expected output with the given data:
Table1 Data
-------------
1 Record1Field1 Record1Field2
2 Record2Field1 Record2Field2
3 Record3Field1 Record3Field2
4 Record4Field1 Record4Field2
Table2 Data
------------
1 1 Record1Field1 Record1Field2
2 1 Record2Field1 Record2Field2
3 2 Record3Field1 Record3Field2
4 2 Record4Field1 Record4Field2
5 2 Record5Field1 Record5Field2
6 4 Record6Field1 Record6Field2
7 4 Record6Field1 Record6Field2
8 4 Record6Field1 Record6Field2
Table3 Data
------------
1 2 Record1Field1 Record1Field2
2 2 Record2Field1 Record2Field2
3 3 Record3Field1 Record3Field2
4 3 Record4Field1 Record4Field2
5 3 Record5Field1 Record5Field2
6 4 Record6Field1 Record6Field2
Desired Results
Table1Id Field1 Field2 Table2Count Table3Count
1 Record1Field1 Record1Field2 2 0
2 Record2Field1 Record2Field2 3 2
4 Record4Field1 Record4Field2 3 1
Notice record 3 in Table 1 is not shown because the record count in Table2 is less than the record count in Table3. I was able to make this work using a very ugly query similar to the one below but feel there is a much better way to do this using joins.
SELECT
t1.Table1Id,
t1.Field1,
t1.Field2,
(Select Count(Table2Id) From Table2 t2 Where t2.Table1Id = t1.Table1Id) as Table2Count,
(Select Count(Table3Id) From Table3 t3 Where t3.Table1Id = t1.Table1Id) as Table3Count
From
Table1 t1
Where
(Select Count(Table2Id) From Table2 t2 Where t2.Table1Id = t1.Table1Id) > (Select Count(Table3Id) From Table3 t3 Where t3.Table1Id = t1.Table1Id)
Hard to test it without working examples but something along these lines should be a good starting point.
SELECT
t1.Table1Id,
t1.Field1,
t1.Field2,
COUNT(DISTINCT t2.Table2Id) AS Table2Count,
COUNT(DISTINCT t3.Table3Id) AS Table3Count
From Table1 t1
LEFT OUTER JOIN Table2 t2 ON t1.Table1Id = t2.Table1Id
LEFT OUTER JOIN Table3 t3 ON t1.Table1Id = t3.Table1Id
GROUP BY t1.Table1Id, t1.Field1, t1.Field2
HAVING COUNT(DISTINCT t2.Table2Id) > COUNT(DISTINCT t3.Table3Id)
You could get all the values from t1 plus the counts from t2 and t3 for your comparison by joining to a couple of grouped subqueries. Use left joins so that a Table1 row with no matching rows still gets a count of 0:
SELECT
t1.Table1Id
,t1.Field1
,t1.Field2
, coalesce(tt2.count_t2, 0) as count_t2
, coalesce(tt3.count_t3, 0) as count_t3
from table1 t1
left join (
select Table1Id, count(*) count_t2
From Table2
group by Table1Id
) tt2 on tt2.Table1Id = t1.Table1Id
left join (
select Table1Id, count(*) count_t3
From Table3
group by Table1Id
) tt3 on tt3.Table1Id = t1.Table1Id
where coalesce(tt2.count_t2, 0) > coalesce(tt3.count_t3, 0)
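As a quick check of the grouped-subquery idea against the sample data in the question, here is a minimal sketch run in SQLite via Python's sqlite3 module. It assumes left joins plus coalesce so that missing matches count as 0; the Field1/Field2 columns of Table2 and Table3 are left out because they do not affect the counts, and the aliases c2, c3 and n are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Table1 (Table1Id int, Field1 text, Field2 text);
create table Table2 (Table2Id int, Table1Id int);
create table Table3 (Table3Id int, Table1Id int);
insert into Table1 values
    (1, 'Record1Field1', 'Record1Field2'),
    (2, 'Record2Field1', 'Record2Field2'),
    (3, 'Record3Field1', 'Record3Field2'),
    (4, 'Record4Field1', 'Record4Field2');
insert into Table2 values (1,1),(2,1),(3,2),(4,2),(5,2),(6,4),(7,4),(8,4);
insert into Table3 values (1,2),(2,2),(3,3),(4,3),(5,3),(6,4);
""")

# Pre-aggregate each child table once, then left join the counts so a
# parent row with no children still appears with a count of 0.
rows = conn.execute("""
select t1.Table1Id, t1.Field1, t1.Field2,
       coalesce(c2.n, 0) as Table2Count,
       coalesce(c3.n, 0) as Table3Count
from Table1 t1
left join (select Table1Id, count(*) as n from Table2 group by Table1Id) c2
       on c2.Table1Id = t1.Table1Id
left join (select Table1Id, count(*) as n from Table3 group by Table1Id) c3
       on c3.Table1Id = t1.Table1Id
where coalesce(c2.n, 0) > coalesce(c3.n, 0)
order by t1.Table1Id
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

Aggregating before joining also avoids the row multiplication you get when both child tables are joined directly to Table1, which is why the other answer needs COUNT(DISTINCT ...).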

SYBASE: Join two tables where the value of one table's column is the column name of the other table

Table1
PRICE ID_1 ID_2 ID_3
500 1 2 3
750 2 3 4
Table2
ID VALUE
ID_1 1
ID_2 2
ID_3 3
I have two tables and want to join these tables like
Select * from table1 T1 Join Table2 T2 on
T1.(T2.ID) = T2.Value
In short I want to convert one table column value to other table column name at the time of joining.
EDITED
Result should be like this:
PRICE ID_1 ID_2 ID_3
500 1 2 3
You need to convert the rows into columns from the second table first, then join the two tables:
Select *
from table1 T1
join
(
SELECT
MAX(case when id = 'ID_1' THEN Value ELSE 0 END) AS ID_1,
MAX(case when id = 'ID_2' THEN Value ELSE 0 END) AS ID_2,
MAX(case when id = 'ID_3' THEN Value ELSE 0 END) AS ID_3
from Table2
) as T2 on T1.ID_1 = T2.ID_1
and T1.ID_2 = T2.ID_2
and T1.ID_3 = T2.ID_3
Or do it the other way and convert the Table1 columns into rows:
SELECT *
FROM
(
SELECT 'ID_1' AS ID, ID_1 AS Value from table1
UNION ALL
SELECT 'ID_2' AS ID, ID_2 AS Value from table1
UNION ALL
SELECT 'ID_3' AS ID, ID_3 AS Value from table1
) AS t1
INNER JOIN Table2 as T2 on T1.ID = T2.ID
and T1.Value = T2.Value;
One method is:
Select *
from table1 T1 Join
Table2 T2
on t1.id_1 = T2.Value and t2.id = 'ID_1' or
t1.id_2 = T2.Value and t2.id = 'ID_2' or
t1.id_3 = T2.Value and t2.id = 'ID_3';
This is not efficient, but it should accomplish the logic you want.
EDIT:
Based on your edit, you appear to want:
select t1.*
from table1 t1
where exists (select 1 from table2 t2 where t2.value = t1.id_1 and t2.id = 'ID_1') and
exists (select 1 from table2 t2 where t2.value = t1.id_2 and t2.id = 'ID_2') and
exists (select 1 from table2 t2 where t2.value = t1.id_3 and t2.id = 'ID_3') ;
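Here is a small sketch of the exists approach with the sample rows from the question. It runs in SQLite through Python's sqlite3 module rather than Sybase, so treat it as a check of the logic only; each subquery aliases table2 as t2 so the t2.value and t2.id references resolve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (price int, id_1 int, id_2 int, id_3 int);
create table table2 (id text, value int);
insert into table1 values (500, 1, 2, 3), (750, 2, 3, 4);
insert into table2 values ('ID_1', 1), ('ID_2', 2), ('ID_3', 3);
""")

# A table1 row qualifies only if every one of its ID_n columns matches
# the value stored in table2 under the row whose id is that column name.
matches = conn.execute("""
select t1.*
from table1 t1
where exists (select 1 from table2 t2
              where t2.value = t1.id_1 and t2.id = 'ID_1')
  and exists (select 1 from table2 t2
              where t2.value = t1.id_2 and t2.id = 'ID_2')
  and exists (select 1 from table2 t2
              where t2.value = t1.id_3 and t2.id = 'ID_3')
""").fetchall()
conn.close()

print(matches)
```

Only the 500 row survives, since the 750 row already fails on ID_1.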

SQL query with one to many relation

I have following table
Table1
id name col1 col2 col3 col4
-----------------------------------
1 test 1.1 1.2 1.3 1.4
2 test2 2.1 2.2 2.3 2.4
Table2
id fk_table1 amt type(fk_table3)
-----------------------------------
1 1 2 1
2 1 3 1
3 1 9 2
4 2 1 1
and I want to query such that I have get below result
id | name | total_type1_amt |total_type2_amt | col1 col2 col3 col4
-----------------------------------------------------------------------
1 test 5 (2+3) 9 1.1 1.2 1.3 1.4
2 test2 1 0 2.1 2.2 2.3 2.4
Basically in result I want group by table1.id with added columns for total_typeX_amt, there will be millions of rows in table1 and table2 so basically looking for optimized way to do it.
SELECT t1.id,
t1.name,
t2.total_type1_amt,
t2.total_type2_amt
FROM table1 t1
INNER JOIN
(
SELECT fk_table1,
SUM(CASE WHEN type = 1 THEN amt END) AS total_type1_amt,
SUM(CASE WHEN type = 2 THEN amt END) AS total_type2_amt
FROM table2
GROUP BY fk_table1
) t2
ON t1.id = t2.fk_table1
If you need this to run fast, you can try creating a view using the subquery (which I called t2 above), with an index on the fk_table1 column. Assuming that table1 also has an index on id, then the join should run reasonably fast.
It's not 100% your desired result, but you could try something like
select fk_table1, type, sum(amt)
from table1
inner join table2 on table1.id = table2.fk_table1
group by fk_table1, type
which should lead to something like
fk_table1 | type | sum
1 1 5
1 2 9
2 1 1
Try this to get the total for total_type1_amt:
select table1.id, table1.name,
(select sum(table2.amt) from table2 where table2.fk_table1 = table1.id and table2.type = 1) as total_type1_amt
from table1
SELECT
T1.id,
T1.name,
SUM(CASE T2.type WHEN 1 THEN T2.amt ELSE 0 END) AS total_type1_amt,
SUM(CASE T2.type WHEN 2 THEN T2.amt ELSE 0 END) AS total_type2_amt
FROM #tbl1 T1
LEFT JOIN #tbl2 T2 ON T1.id=T2.fk_table1
GROUP BY T1.id,T1.name
You can try like this
;WITH cte
AS (SELECT
fk_table1, SUM([1]) total_type1_amt, COALESCE(SUM([2]), 0) total_type2_amt
FROM #table2 PIVOT (MAX(amt) FOR type IN ([1], [2])) p
GROUP BY fk_table1)
SELECT
t.id, t.name, c.total_type1_amt, c.total_type2_amt
FROM #table1 t
LEFT JOIN cte c
ON t.id = c.fk_table1
There are at least 2 ways:
SELECT t1.id,
t1.name,
COALESCE(SUM(CASE WHEN [type] = 1 THEN amt END),0) AS total_type1_amt,
COALESCE(SUM(CASE WHEN [type] = 2 THEN amt END),0) AS total_type2_amt,
col1,
col2,
col3,
col4
FROM table1 t1
LEFT JOIN table2 t2
ON t1.id = t2.fk_table1
GROUP BY t1.id, t1.name, col1, col2, col3, col4
Another:
SELECT *
FROM (
SELECT t1.id,
t1.name,
t2.[type],
SUM(t2.amt) as sum
FROM table1 t1
LEFT JOIN table2 t2
ON t1.id = t2.fk_table1
GROUP BY t1.id, t1.name, t2.[type]
) as t
PIVOT (
MAX(sum) FOR type IN ([1],[2])
) as pvt
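To see the conditional-aggregation variant produce the layout asked for in the question, here is a minimal sketch using Python's sqlite3 module and the question's sample rows. The answers above are written for SQL Server, so this is only a check of the logic in SQLite; PIVOT is not available there, which is why only the SUM(CASE ...) form is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (id int, name text, col1 real, col2 real, col3 real, col4 real);
create table table2 (id int, fk_table1 int, amt int, type int);
insert into table1 values (1, 'test', 1.1, 1.2, 1.3, 1.4),
                          (2, 'test2', 2.1, 2.2, 2.3, 2.4);
insert into table2 values (1, 1, 2, 1), (2, 1, 3, 1), (3, 1, 9, 2), (4, 2, 1, 1);
""")

# One SUM per type; CASE yields NULL for other types, SUM skips NULLs,
# and COALESCE turns an all-NULL group into 0.
totals = conn.execute("""
select t1.id, t1.name,
       coalesce(sum(case when t2.type = 1 then t2.amt end), 0) as total_type1_amt,
       coalesce(sum(case when t2.type = 2 then t2.amt end), 0) as total_type2_amt,
       t1.col1, t1.col2, t1.col3, t1.col4
from table1 t1
left join table2 t2 on t1.id = t2.fk_table1
group by t1.id, t1.name, t1.col1, t1.col2, t1.col3, t1.col4
order by t1.id
""").fetchall()
conn.close()

for row in totals:
    print(row)
```

The left join keeps table1 rows that have no table2 rows at all; they would come out with 0 in both total columns.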

SQL - Multiple Table Joins

Hi maybe someone can help me here ...
I have a slight problem with an SQL Statement. ( On MS - SQL Server 2008)
So i have 6 Tables looking like this
ID / Company / Month / ClosedTimeStamp / Different Information
Now I need (preferably in one statement :P) the count of rows from each table, grouped by Company and Month.
One more thing: not every table has data for a given Company and Month, so count(*) can be 0 for some tables. At the moment my query for a single table looks something like this:
SELECT COUNT(*) as c, Month, Company
FROM Table1 WHERE ClosedTimeStamp IS NULL
GROUP BY Company, Month
ORDER BY Company
I can do this for all the tables and just pick out the results for each company ... Well if someone has any Idea i really would appreciate it :)
Sorry forgot something ... the result should look like this:
Company / Month / CountTable1 / CountTable2 / CountTable3 / .....
Test 02 1 0 50
If it's not possible in one statement well then i have to make it work another way. :)
Thanks
Lim
UNION ALL table rows and then do the count
SELECT COUNT(*) as c, Month, Company
FROM
(
SELECT Month,Company FROM Table1 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company FROM Table2 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company FROM Table3 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company FROM Table4 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company FROM Table5 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company FROM Table6 WHERE ClosedTimeStamp IS NULL
) AS t
GROUP BY Company, Month
ORDER BY Company
If you want the total for each table,company in one row
SELECT SUM(t1) t1,SUM(t2) t2,SUM(t3) t3,SUM(t4) t4,SUM(t5) t5,SUM(t6) t6, Month, Company
FROM
(
SELECT Month,Company, 1 t1,0 t2, 0 t3, 0 t4, 0 t5, 0 t6 FROM Table1 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company, 0 t1,1 t2, 0 t3, 0 t4, 0 t5, 0 t6 FROM Table2 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company, 0 t1,0 t2, 1 t3, 0 t4, 0 t5, 0 t6 FROM Table3 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company, 0 t1,0 t2, 0 t3, 1 t4, 0 t5, 0 t6 FROM Table4 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company, 0 t1,0 t2, 0 t3, 0 t4, 1 t5, 0 t6 FROM Table5 WHERE ClosedTimeStamp IS NULL
UNION ALL
SELECT Month,Company, 0 t1,0 t2, 0 t3, 0 t4, 0 t5, 1 t6 FROM Table6 WHERE ClosedTimeStamp IS NULL
) AS t
GROUP BY Company, Month
ORDER BY Company
If your DB was normalized the query would be much simpler.
Because your Company and Month values are spread across 6 tables, we need to union those tables to get the distinct set of all company+month pairs:
select company, month from table1
union
select company, month from table2
union
select company, month from table3
union
select company, month from table4
union
select company, month from table5
union
select company, month from table6
Note, that we need union, not union all, because we don't want the same company+month pair repeated.
Then, just use this dataset to query the quantities for each table:
select t.company, t.month,
(select count(*) from table1
where company = t.company
and month = t.month
and ClosedTimeStamp is null) as qt1,
(select count(*) from table2
where company = t.company
and month = t.month
and ClosedTimeStamp is null) as qt2,
(select count(*) from table3
where company = t.company
and month = t.month
and ClosedTimeStamp is null) as qt3,
(select count(*) from table4
where company = t.company
and month = t.month
and ClosedTimeStamp is null) as qt4,
(select count(*) from table5
where company = t.company
and month = t.month
and ClosedTimeStamp is null) as qt5,
(select count(*) from table6
where company = t.company
and month = t.month
and ClosedTimeStamp is null) as qt6
from (
select company, month from table1
union
select company, month from table2
union
select company, month from table3
union
select company, month from table4
union
select company, month from table5
union
select company, month from table6
) t
order by t.company
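Here is a small sketch of the flag-column UNION ALL approach, cut down to three tables and run in SQLite through Python's sqlite3 module. The rows are invented for illustration (the question gives no sample data), and Table2 is deliberately left empty to show that a table with no matching rows yields 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("Table1", "Table2", "Table3"):
    conn.execute(
        f"create table {name} (Company text, Month text, ClosedTimeStamp text)"
    )

# Hypothetical rows: one open and one closed row in Table1, nothing in
# Table2, three open rows in Table3.
conn.executemany(
    "insert into Table1 values (?, ?, ?)",
    [("Test", "02", None), ("Test", "02", "2012-01-31")],
)
conn.executemany(
    "insert into Table3 values (?, ?, ?)",
    [("Test", "02", None), ("Test", "02", None), ("Other", "03", None)],
)

# Each branch marks its own table with a 1; summing the flags per
# Company/Month gives one count column per table.
counts = conn.execute("""
select Company, Month,
       sum(t1) as CountTable1,
       sum(t2) as CountTable2,
       sum(t3) as CountTable3
from (
    select Company, Month, 1 as t1, 0 as t2, 0 as t3
    from Table1 where ClosedTimeStamp is null
    union all
    select Company, Month, 0, 1, 0
    from Table2 where ClosedTimeStamp is null
    union all
    select Company, Month, 0, 0, 1
    from Table3 where ClosedTimeStamp is null
) as t
group by Company, Month
order by Company, Month
""").fetchall()
conn.close()

for row in counts:
    print(row)
```

One difference from the correlated-subquery version: a Company/Month pair whose rows are all closed in every table does not appear here at all, whereas the union-based version lists it with zeros.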