Get count of foreign key from multiple tables - sql

I have 3 tables, with Table B & C referencing Table A via Foreign Key. I want to write a query in PostgreSQL to get all ids from A and also their total occurrences from B & C.
a | b | c
-----------------------------------
id | txt | id | a_id | id | a_id
---+---- | ---+----- | ---+------
1 | a | 1 | 1 | 1 | 3
2 | b | 2 | 1 | 2 | 4
3 | c | 3 | 3 | 3 | 4
4 | d | 4 | 4 | 4 | 4
Output desired (just the id from A & total count in B & C) :
id | Count
---+-------
1 | 2 -- twice in B
2 | 0 -- occurs nowhere
3 | 2 -- once in B & once in C
4 | 4 -- once in B & thrice in C
SQL so far SQL Fiddle :
SELECT a_id, COUNT(a_id)
FROM
( SELECT a_id FROM b
UNION ALL
SELECT a_id FROM c
) AS union_table
GROUP BY a_id
The query I wrote fetches from B & C and counts the occurrences. But if the key doesn't occur in B or C, it doesn't show up in the output (e.g. id=2 in output). How can I start my selection from table A & join/union B & C to get the desired output

If the query involves large parts of b and / or c it is more efficient to aggregate first and join later.
I expect these two variants to be considerably faster:
SELECT a.id,
, COALESCE(b.ct, 0) + COALESCE(c.ct, 0) AS bc_ct
FROM a
LEFT JOIN (SELECT a_id, count(*) AS ct FROM b GROUP BY 1) b USING (a_id)
LEFT JOIN (SELECT a_id, count(*) AS ct FROM c GROUP BY 1) c USING (a_id);
You need to account for the possibility that some a_id are not present at all in a and / or b. count() never returns NULL, but that's cold comfort in the face of LEFT JOIN, which leaves you with NULL values for missing rows nonetheless. You must prepare for NULL. Use COALESCE().
Or UNION ALL a_id from both tables, aggregate, then JOIN:
SELECT a.id
, COALESCE(ct.bc_ct, 0) AS bc_ct
FROM a
LEFT JOIN (
SELECT a_id, count(*) AS bc_ct
FROM (
SELECT a_id FROM b
UNION ALL
SELECT a_id FROM c
) bc
GROUP BY 1
) ct USING (a_id);
Probably slower. But still faster than solutions presented so far. And you could do without COALESCE() and still not loose any rows. You might get occasional NULL values for bc_ct, in this case.

Another option:
SELECT
a.id,
(SELECT COUNT(*) FROM b WHERE b.a_id = a.id) +
(SELECT COUNT(*) FROM c WHERE c.a_id = a.id)
FROM
a

Use left join with a subquery:
SELECT a.id, COUNT(x.id)
FROM a
LEFT JOIN (
SELECT id, a_id FROM b
UNION ALL
SELECT id, a_id FROM c
) x ON (a.id = x.a_id)
GROUP BY a.id;

Related

JOIN - Returns 5 instead of 3 rows for repeating values in ID column

I have the following two tables:
Table A:
id|valA
--+----
1 | a
2 | b
2 | c
Table B:
id|valA
--+----
1 | A
2 | B
2 | C
What I expect to get using join is:
id|valA|valB
--+----+----
1 | a | A
2 | b | B
2 | c | C
However, I get the following
id|valA|valB
--+----+----
1 | a | A
2 | b | B
2 | c | B
2 | c | C
2 | b | C
I am using the following query:
select id, valA, valB from A
inner join B
on A.id = B.id;
How do I use the join to get the expected results?
What am I doing wrong?
Both tables have a duplicate ID of 2, so you get a result with all the possible combinations of the two rows in each table with that ID (bB, bC, cB and cC). If you can rely on the values, you can add them to the join condition:
SELECT a.id, valA, valB
FROM a
JOIN b ON a.id = a.id and valA = LOWER(valB)
If you can't make this assumption, you could add a rank pseudocolumn to the values with the same ID and use in in the join condition too, although it may be a bit clunky:
SELECT idA, valA, valB
FROM (SELECT id AS idA, valA, RANK() OVER (PARTITION BY id ORDER BY valA) AS rankA
FROM a) a
JOIN (SELECT id AS idB, valB, RANK() OVER (PARTITION BY id ORDER BY valB) AS rankB
FROM b) b ON idA = idB AND rankA = rankB

How can I get full results in SQL query using 3 tables, where 1 of them keeps relation of 2 another?

I need help writing a query to display results I want.
"Table 3 - relations" keeps all relations between table 1 and 2.Often, relation between table 1 and 2 will not exist in table 3 so I want to see missing relation in the results for all Table 1 rows - see expected Results below.
I can't modify these tables - I have only SELECT privilege.
Data and expected result below:
Table 1 - a:
a_id, a_name
e.g.:
1 A
2 B
Table 2 - b:
b_id, b_name
e.g.:
1 X
2 Y
Table 3 - relation:
asset1_id (it's always id from Table 1), asset2_id (it's always id from Table 2), relation_type
e.g.:
1 1 covers
1 2 covers
Expected result:
Table1_name, Table2_name, Table3_relation_type (including NULL for b_name and relation_type when such relation does not exist in Table 3 - relation)
e.g.
A X covers
A Y covers
B NULL NULL
I can't get the 3rd expected line with NULLs.
I think that this query will produce those results.
select a.name as a_name,b.name as b_name, r.relation_type from relation r
join a on a.id=r.asset1_id
join b on b.id=r.asset2_id
union
select a.name as a_name,b.name as b_name,r.relation_type from relation r
full outer join a on a.id=r.asset1_id
full outer join b on b.id=r.asset2_id
where a.id is null or b.id is null
With your data sample you could try this one.
It should work both hive or impala.
SELECT t1.name ,t2.name ,r.relation_type
FROM relation r
FULL OUTER JOIN table1 t1 ON(t1.id = r.id1)
FULL OUTER JOIN table2 t2 ON(t2.id = r.id2);
+------+------+---------------+
| name | name | relation_type |
+------+------+---------------+
| A | X | covers |
| A | Y | covers |
| B | NULL | NULL |
+------+------+---------------+
WITH
cte_A AS (
SELECT id as a_id, name as a_name
FROM a
),
cte_C AS (
SELECT c.asset_id1 as a_id, b.name, c.relation
FROM c
LEFT JOIN b ON c.id=b.asset_id2
)
SELECT cte_A.a_name, cte_C.name as c_name, cte_C.relation
FROM cte_A
LEFT JOIN cte_C ON cte_A.a_id=cte_C.a_id

SQL query - select uncommon values from 2 tables

Today I was asked following question in my interview for a QA and because of incorrect query, I did not get selected. From then on, my mind is itching to get the correct answer for the following scenario:
I was given following 2 tables:
Tabel A | |Table B
--------- ----------
**ID** **ID**
-------- -----------
0 | | 5 |
1 | | 6 |
2 | | 7 |
3 | | 8 |
4 | | 9 |
5 | | 10|
6 | -----
----
And following output was expected using an SQL query:
**ID**
--------
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 7 |
| 8 |
| 9 |
| 10 |
--------
Thanks everyone, I really like this forum and from now on will be active here to learn more and more about SQL. I would like to make it my strong point rather a weak so as not to get kicked out of other interviews. I know there is a long way to go. However beside all of your responses, I came to draft the following query and would like to know from the experts here of their opinion about my query (and the reason why they think of what they think):
BTW the query has worked on MSSQLSRV-2008 (using Union or Union All, didn't matter to the result I got):
select ID from A where ID not in (5,6)
union
select ID from B where ID not in (5,6)
Is this really an efficient query?
If you want values in only one of two tables, I would use a full outer join and condition:
select coalesce(a.id, b.id)
from tableA a full outer join
tableB b
on a.id = b.id
where a.id is null or b.id is null;
Of course, if the job at a company that uses MS Access or MySQL, then this isn't the right answer, because these systems don't support full outer join. You can also do this in more complicated ways using union all and aggregation or even with other methods.
EDIT:
Here is another method:
select id
from (select a.id, 1 as isa, 0 as isb from tablea union all
select b.id, 0, 1 from tableb
) ab
group by id
having sum(isa) = 0 or sum(isb) = 0;
And another:
select id
from tablea
where a.id not in (select id from tableb)
union all
select id
from tableb
where b.id not in (select id from tablea);
As I think about this, it is a pretty good interview question (even though I've just given three reasonable answers).
Edit: See Gordon answer above for a better request, this is very inneficient way of doing what you want.
I think this should do the trick :
(SELECT * FROM A WHERE NOT id IN (SELECT A.id FROM A, B WHERE A.id = B.id))
UNION
(SELECT * FROM B WHERE NOT id IN (SELECT A.id FROM A, B WHERE A.id = B.id))
You could avoid the duplication of SELECT A.id ...by using a temporary table.
without full outer joins...
Select id
from (Select id from tableA
Union all
Select id from tableB) Z
group by id
Having count(*) = 1
or using Except and Intersect .....
(Select id from tableA Except Select id from tableB)
Union
(Select id from tableB Except Select id from tableA)
or ....
(Select id from tableA union Select id from tableB)
Except
(Select id from tableA intersect Select id from tableB)

SQL Join to Get Row with Maximum Value from Right table

I am having problem with sql join (oracle/ms sql)
I have two tables
A
ID | B_ID
---|------
1 | 1
1 | 4
2 | 3
2 | 2
----------
B
B_ID | B_VA| B_VB
-------|--------|-------
1 | 1 | a
2 | 2 | b
3 | 5 | c
4 | 2 | d
-----------------------
From these two tables I need A.ID, B.B_ID, B.B_VA (MAX), B.B_VB (with max B.B_VA)
So result table would be like
ID | B_ID | B_VA| B_VB
-------|--------|--------|-------
1 | 4 | 2 | d
2 | 3 | 5 | c
I tried some joins without success. Can anyone help me with query to get the result I want.
Thank you
Your logic as described doesn't quite correspond to the data. For instance, b_va is numeric, but the column in the output is a string.
Perhaps you want this. The data in a to be aggregated to get the maximum b_id value. Then each column to be joined to get the corresponding b_vb column. That, at least, conforms to your desired output:
select a.id, a.b_id, b1.b_vb as b_va, b2.b_vb
from (select id, max(b_id) as b_id
from a
group by id
) a join
b b1
on a.id = b1.b_id join
b b2
on a.b_id = b2.b_id;
EDIT:
For the corrected data, I think this is what you want:
select a.id, a.b_id, max(b1.b_va) as b_va, b2.b_vb
from (select id, max(b_id) as b_id
from a
group by id
) a join
b b1
on a.id = b1.b_id join
b b2
on a.b_id = b2.b_id
group by a.id, a.b_id, b2.b_vb;
Try this
SELECT X.ID, Y.B_ID, X.B_VA, Y.B_VB
FROM (SELECT A.ID, MAX(B_VA) AS B_VA
FROM A INNER JOIN B ON A.B_ID = B.B_ID
GROUP BY A.ID) AS X INNER JOIN
A AS Z ON X.ID = Z.ID INNER JOIN
B AS Y ON Z.B_ID=Y.B_ID AND X.B_VA=Y.B_VA

Trying to select multiple columns where one is unique

I am trying to select several columns from a table where one of the columns is unique. The select statement looks something like this:
select a, distinct b, c, d
from mytable
The table looks something like this:
| a | b | c | d | e |...
|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5
| 1 | 2 | 3 | 4 | 6
| 2 | 5 | 7 | 1 | 9
| 7 | 3 | 8 | 6 | 4
| 7 | 3 | 8 | 6 | 7
So the query should return something like this:
| a | b | c | d |
|---|---|---|---|
| 1 | 2 | 3 | 4
| 2 | 5 | 7 | 1
| 7 | 3 | 8 | 6
I just want to remove all of the rows where b is duplicated.
EDIT: There seems to be some confusion about which row I want to be selected in the case of duplicate b values. I don't care because the a, c, and d should (but are not guaranteed to) be the same.
Try this
SELECT * FROM (SELECT ROW_NUMBER() OVER (PARTITION BY b ORDER BY a) NO
,* FROM TableName) AS T1 WHERE NO = 1
I think you are nearly there with DISTINCT try:
SELECT DISTINCT a, b, c, d
FROM myTable
You haven't said how to pick a row for each b value, but this will pick one for each.
Select
a,
b,
c,
d,
e
From (
Select
a,
b,
c,
d,
e,
row_number() over (partition by b order by b) rn
From
mytable
) x
Where
x.rn = 1
If you don't care what values you get for B, C, D, and E, as long as they're appropriate for that key, you can group by A:
SELECT A, MIN(B), MIN(C), MIN(D), MIN(E)
FROM MyTable
GROUP BY A
Note that MAX() would be just as valid. Some RDBMSs support a FIRST() aggregate, or similar, for exactly these circumstances where you don't care which value you get (from a certain population).
This will return what you're looking for but I think your example is flawed because you've no determinism over which value from the e column is returned.
Create Table A1 (a int, b int, c int, d int, e int)
INSERT INTO A1 (a,b,c,d,e) VALUES (1,2,3,4,5)
INSERT INTO A1 (a,b,c,d,e) VALUES (1,2,3,4,6)
INSERT INTO A1 (a,b,c,d,e) VALUES (2,5,7,1,9)
INSERT INTO A1 (a,b,c,d,e) VALUES (7,3,8,6,4)
INSERT INTO A1 (a,b,c,d,e) VALUES (7,3,8,6,7)
SELECT * FROM A1
SELECT a,b,c,d
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY b ORDER BY a) RowNum ,*
FROM A1
) As InnerQuery WHERE RowNum = 1
You cannot put DISTINCT on a single column. You should put it right after the SELECT:
SELECT DISTINCT a, b, c, d
FROM mytable
It return the result you need for your sample table. However if you require to remove duplicates only from a single column (which is not possible) you probably misunderstood something. Give us more descriptions and sample, and we try to guide you to the right direction.