Having problem with joining condition while joining 3 tables - sql

I have the following table structure:
Table A:
SSN a_id b_id Date Sent
123 1 2 12/11/2020 1
Table B:
SSN a_id b_id Date Open
123 1 2 13/11/2020 1
123 1 2 14/11/2020 1
Table C:
SSN a_id b_id Date Clicks
123 1 2 13/11/2020 1
123 1 2 14/11/2020 1
123 1 2 14/11/2020 1
123 1 2 14/11/2020 1
123 1 2 15/11/2020 1
I am using:
select *
from Table A
left join Table B on A.SSN = B.SSN and A.a_id = B.a_id and A.b_id = B.b_id
left join Table C on A.SSN = C.SSN and A.a_id = C.a_id and A.b_id = C.b_id
I want the following output:
Table Ans
SSN a_id b_id Date Sent Open Clicks
123 1 2 12/11/2020 1 0 0
123 1 2 12/11/2020 0 1 1
123 1 2 12/11/2020 0 1 1
123 1 2 12/11/2020 0 0 1
123 1 2 12/11/2020 0 0 1
123 1 2 12/11/2020 0 0 1
The order of the 1s and 0s in each column doesn't matter, but the number of 1s in each column must match the row count of the corresponding original table. How can I achieve this?

I assume this is the result set you actually want, or something close to it:
with a as (select * from (values (123,1,2,'2020-11-12',1)) a(ssn,a_id,b_id,"date",sent))
,b as(
select * from (values (123,1,2,'2020-11-13',1)
,(123,1,2,'2020-11-14',1)
) a(ssn,a_id,b_id,"date",open))
,c as
(select * from (values (123,1,2,'2020-11-13',1)
,(123,1,2,'2020-11-14',1)
,(123,1,2,'2020-11-14',1)
,(123,1,2,'2020-11-14',1)
,(123,1,2,'2020-11-15',1)
) a(ssn,a_id,b_id,"date",clicks)
)
select ssn, a_id, b_id,"date", sum(sent) as sent, sum(open) as open, sum(clicks) as clicks
from (
select ssn, a_id, b_id,"date",sent,0 as open,0 as clicks,"date" as hidendate from a
union all
select a.ssn,a.a_id,a.b_id,a."date",0 as sent,open,0 as clicks,b."date" as hidendate from a,b where a.ssn = b.ssn and a.a_id = b.a_id and a.b_id = b.b_id
union all
select a.ssn,a.a_id,a.b_id,a."date",0 as sent,0 as open,clicks,c."date" as hidendate from a,c where a.ssn = c.ssn and a.a_id = c.a_id and a.b_id = c.b_id
) q1
group by ssn,a_id,b_id,"date",hidendate
order by "date", hidendate

I think you want a full join:
select *
from a full join
b
using (ssn, a_id, b_id, date) full join
c
using (ssn, a_id, b_id, date);
This returns NULLs where you expect 0s.
If you want 0s, use coalesce():
select ssn, a_id, b_id, date,
coalesce(a.sent, 0) as sent,
coalesce(b.open, 0) as open,
coalesce(c.clicks, 0) as clicks
from a full join
b
using (ssn, a_id, b_id, date) full join
c
using (ssn, a_id, b_id, date);
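To see why the original pair of left joins cannot preserve the per-table counts, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite and the column name opens (used instead of open) are assumptions made only for this illustration. The sketch counts the rows produced by the two left joins, then stacks the three tables with UNION ALL, which is a swapped-in alternative and not either answer's query: it yields 8 rows rather than the 6 shown in the question, but the column totals match the source tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (ssn INT, a_id INT, b_id INT, "date" TEXT, sent INT);
CREATE TABLE b (ssn INT, a_id INT, b_id INT, "date" TEXT, opens INT);
CREATE TABLE c (ssn INT, a_id INT, b_id INT, "date" TEXT, clicks INT);
INSERT INTO a VALUES (123, 1, 2, '2020-11-12', 1);
INSERT INTO b VALUES (123, 1, 2, '2020-11-13', 1), (123, 1, 2, '2020-11-14', 1);
INSERT INTO c VALUES (123, 1, 2, '2020-11-13', 1), (123, 1, 2, '2020-11-14', 1),
                     (123, 1, 2, '2020-11-14', 1), (123, 1, 2, '2020-11-14', 1),
                     (123, 1, 2, '2020-11-15', 1);
""")

# The question's join: every B row pairs with every C row for the same key,
# so the result has 1 * 2 * 5 rows.
joined_rows = conn.execute("""
SELECT COUNT(*)
FROM a
LEFT JOIN b ON a.ssn = b.ssn AND a.a_id = b.a_id AND a.b_id = b.b_id
LEFT JOIN c ON a.ssn = c.ssn AND a.a_id = c.a_id AND a.b_id = c.b_id
""").fetchone()[0]

# Stacking the tables keeps one output row per source row.
stacked = conn.execute("""
SELECT ssn, a_id, b_id, sent, 0 AS opens, 0 AS clicks FROM a
UNION ALL
SELECT ssn, a_id, b_id, 0, opens, 0 FROM b
UNION ALL
SELECT ssn, a_id, b_id, 0, 0, clicks FROM c
""").fetchall()

# Column totals for sent, opens, clicks.
totals = tuple(sum(row[i] for row in stacked) for i in (3, 4, 5))
print(joined_rows, len(stacked), totals)
```

If exactly 6 rows are required, the rows of B and C would additionally have to be paired up, for example by a row number per key.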

Related

Count elements in table a that have two exactly matching elements in table b

I have two tables and need to get a count of all entries in table A that have two specific matches in table B. Table B has table A's ID as a foreign key.
Table a
ID Name
1 Foo
2 Bar
3 John
4 Jane
Table b
aID Value
1 12
1 12
2 8
3 8
3 12
4 12
4 8
I now need a count of all names in table A that have both value 8 AND 12 in table B at least once.
SELECT COUNT(*) FROM a join b on a.id = b.aId where b.value = 8 and b.value = 12
gets me 0 results. The correct result should be 2 (John and Jane).
edit:
Obviously, @Larnu is correct and 8 will never be 12.
Also, I should have clarified that there can be two or more of a single value in table B for any table A id, but none of the other (e.g. 8 twice but no 12). I updated the table to reflect that.
You can use EXISTS with GROUP BY and HAVING:
SELECT COUNT(*) FROM a
WHERE EXISTS(SELECT b.aID FROM b
WHERE a.ID = b.aID
AND b.value IN (8, 12)
GROUP BY b.aID
HAVING COUNT(DISTINCT b.value) = 2)
The IN filter restricts the check to the values 8 and 12, and COUNT(DISTINCT ...) ensures both are present even when one of them appears more than once.
A subquery that counts how many times 8 and 12 appear for each ID will do the trick:
select count(id) from
(select id, sum(case when b.Value = 8 then 1 else 0 end) as ct8,
sum(case when b.Value = 12 then 1 else 0 end) as ct12
from a inner join b on a.id = b.aID
group by a.id) as t
where ct8 >= 1 and ct12 >=1
Joining is not the answer here. You need a WHERE clause that includes a correlated subquery that checks your condition using COUNT() or EXISTS(). One of the following should do.
SELECT COUNT(*) FROM A
WHERE (SELECT COUNT(DISTINCT B.VALUE) FROM B WHERE B.aID = A.ID AND B.VALUE IN (8, 12)) = 2
SELECT COUNT(*) FROM A
WHERE EXISTS(SELECT * FROM B WHERE B.aID = A.ID AND B.VALUE = 8)
AND EXISTS(SELECT * FROM B WHERE B.aID = A.ID AND B.VALUE = 12)
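As a quick check of the approaches above against the question's sample data, the following sketch loads tables a and b into an in-memory SQLite database via Python's sqlite3 module (SQLite is an assumption here; the question does not name its database). It uses a grouped join with COUNT(DISTINCT ...), so that ID 1, which has the value 12 twice but no 8, is not counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE b (aID INTEGER REFERENCES a(id), value INTEGER);
INSERT INTO a VALUES (1, 'Foo'), (2, 'Bar'), (3, 'John'), (4, 'Jane');
INSERT INTO b VALUES (1, 12), (1, 12), (2, 8), (3, 8), (3, 12), (4, 12), (4, 8);
""")

# Keep only the rows with value 8 or 12, then require that both
# distinct values are present for the same a.id.
matching = conn.execute("""
SELECT a.name
FROM a
JOIN b ON b.aID = a.id
WHERE b.value IN (8, 12)
GROUP BY a.id, a.name
HAVING COUNT(DISTINCT b.value) = 2
ORDER BY a.id
""").fetchall()

names = [row[0] for row in matching]
print(len(names), names)
```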

SQL get closest value by date

I can't wrap my head around the following problem.
I have a table with historical data TableA:
uniq_id item_id item_clust date
11111 1 a 2020-02-12
11112 1 a 2020-01-13
11113 1 b 2020-02-01
11114 2 b 2020-01-01
I also have a table with historical data for clusters TableB:
item_id item_clust item_pos date
1 a 1 2020-01-01
1 a 2 2020-02-01
1 a 3 2020-03-01
1 b 1 2020-01-10
For every row in TableA, I would like to get the latest position for its item_id + item_clust as of that row's date, based on the dates in TableB.
If no matching row is found, item_pos should be 0.
Desired result:
uniq_id item_id item_clust date item_pos
11111 1 a 2020-02-12 2
11112 1 a 2020-01-13 1
11113 1 b 2020-02-01 1
11114 2 b 2020-01-01 0
So, for item 1 in cluster a on 2020-02-12 the latest position is at 2020-02-01 = 2.
This looks like a left join:
select a.*, coalesce(b.item_pos, 0) as item_pos
from a left join
(select distinct on (b.item_id, b.item_clust) b.*
from b
order by b.item_id, b.item_clust, b.date desc
) b
using (item_id, item_clust);
Or a lateral join:
select a.*, coalesce(b.item_pos, 0) as item_pos
from a left join lateral
(select b.*
from b
where b.item_id = a.item_id and
b.item_clust = a.item_clust
order by b.date desc
limit 1
) b
on true; -- always do the left join even when there are no matches
EDIT:
If you want the most recent position "as of" the date in A, then use the lateral join:
select a.*, coalesce(b.item_pos, 0) as item_pos
from a left join lateral
(select b.*
from b
where b.item_id = a.item_id and
b.item_clust = a.item_clust and
b.date <= a.date
order by b.date desc
limit 1
) b
on true; -- always do the left join even when there are no matches
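The "as of" lookup can be tried against the question's sample data with the sketch below. It runs on an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for the illustration. SQLite has no LATERAL joins, so the sketch swaps in a correlated scalar subquery with ORDER BY ... LIMIT 1, which expresses the same per-row lookup. Dates are stored as ISO-8601 text, so string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (uniq_id INTEGER, item_id INTEGER, item_clust TEXT, date TEXT);
CREATE TABLE TableB (item_id INTEGER, item_clust TEXT, item_pos INTEGER, date TEXT);
INSERT INTO TableA VALUES
  (11111, 1, 'a', '2020-02-12'), (11112, 1, 'a', '2020-01-13'),
  (11113, 1, 'b', '2020-02-01'), (11114, 2, 'b', '2020-01-01');
INSERT INTO TableB VALUES
  (1, 'a', 1, '2020-01-01'), (1, 'a', 2, '2020-02-01'),
  (1, 'a', 3, '2020-03-01'), (1, 'b', 1, '2020-01-10');
""")

# For each TableA row, pick the newest TableB position dated on or before
# that row's date; COALESCE supplies 0 when no such row exists.
rows = conn.execute("""
SELECT a.uniq_id, a.item_id, a.item_clust, a.date,
       COALESCE((SELECT b.item_pos
                 FROM TableB b
                 WHERE b.item_id = a.item_id
                   AND b.item_clust = a.item_clust
                   AND b.date <= a.date
                 ORDER BY b.date DESC
                 LIMIT 1), 0) AS item_pos
FROM TableA a
ORDER BY a.uniq_id
""").fetchall()

for row in rows:
    print(row)
```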

Select count of rows in two other tables

I have 3 tables. The main one in which I want to retrieve some information and two others for row count only.
I used a query like this:
SELECT A.*,
COUNT(B.id) AS b_count
FROM A
LEFT JOIN B on B.a_id = A.id
WHERE A.id > 50 AND B.ID < 100
GROUP BY A.id
from Gerry Shaw's comment here. It works perfectly but only for one table.
Now I need to add the row count for the third (C) table. I tried
SELECT A.*,
COUNT(B.id) AS b_count,
COUNT(C.id) AS c_count
FROM A
LEFT JOIN B on B.a_id = A.id
LEFT JOIN C on C.a_id = A.id
GROUP BY A.id
but, because of the two left joins, my b_count and c_count are wrong and equal to each other. In fact, both come out as real_b_count * real_c_count. Any idea how I could fix this without adding a lot of complexity or subqueries?
Data sample as requested:
Table A (primary key: id)
id | data1 | data2
---+-------+------
1  | 0,45  | 0,79
2  | -2,24 | -0,25
3  | 1,69  | 1,23
Table B (primary key: (a_id, fruit))
a_id | fruit
-----+-------
1    | apple
1    | banana
2    | apple
Table C (primary key: (a_id, color))
a_id | color
-----+-------
2    | blue
2    | purple
3    | blue
Expected result:
id | data1 | data2 | b_count | c_count
---+-------+-------+---------+--------
1  | 0,45  | 0,79  | 2       | 0
2  | -2,24 | -0,25 | 1       | 2
3  | 1,69  | 1,23  | 0       | 1
There are two possible solutions. One uses correlated subqueries in the SELECT list:
SELECT A.*,
(
SELECT COUNT(B.id) FROM B WHERE B.a_id = A.id AND B.ID < 100
) AS b_count,
(
SELECT COUNT(C.id) FROM C WHERE C.a_id = A.id
) AS c_count
FROM A
WHERE A.id > 50
The second joins two aggregated queries together:
SELECT t1.*, t2.c_count
FROM
(
SELECT A.*,
COUNT(B.id) AS b_count
FROM A
LEFT JOIN B on B.a_id = A.id AND B.ID < 100 -- filter B in the ON clause so A rows without matches are kept
WHERE A.id > 50
GROUP BY A.id
) t1
JOIN
(
SELECT A.*,
COUNT(C.id) AS c_count
FROM A
LEFT JOIN C on C.a_id = A.id
WHERE A.id > 50
GROUP BY A.id
) t2 ON t1.id = t2.id
I prefer the second syntax since it makes the GROUP BY intent explicit to the optimizer; however, the query plans are usually the same.
If tables B and C also have their own key fields, you can use COUNT(DISTINCT ...) on the primary key rather than the foreign key. That gets around the row-multiplication problem you see when joining to several tables. If you can post the table structures, we can advise further.
Try something like this:
SELECT A.*,
(SELECT COUNT(B.id) FROM B WHERE B.a_id = A.id) AS b_count,
(SELECT COUNT(C.id) FROM C WHERE C.a_id = A.id) AS c_count
FROM A
This is the easiest way I can think of:
Create table #a (id int, data1 float, data2 float)
Create table #b (id int, fruit varchar(50))
Create table #c (id int, color varchar(50))
Insert into #a
SELECT 1, 0.45, 0.79
UNION ALL SELECT 2, -2.24, -0.25
UNION ALL SELECT 3, 1.69, 1.23
Insert into #b
SELECT 1, 'apple'
UNION ALL SELECT 1, 'banana'
UNION ALL SELECT 2, 'orange'
Insert into #c
SELECT 2, 'blue'
UNION ALL SELECT 2, 'purple'
UNION ALL SELECT 3, 'orange'
SELECT #a.*,
(SELECT COUNT(#b.id) FROM #b where #b.id = #a.id) AS b_count,
(SELECT COUNT(#c.id) FROM #c where #c.id = #a.id) AS c_count
FROM #a
ORDER BY #a.id
Result:
id data1 data2 b_count c_count
1 0,45 0,79 2 0
2 -2,24 -0,25 1 2
3 1,69 1,23 0 1
If each row in tables B and C is uniquely identified (here by fruit and color within an a_id), you can try this:
SELECT A.*,
COUNT(distinct B.fruit) AS b_count,
COUNT(distinct C.color) AS c_count
FROM A
LEFT JOIN B on B.a_id = A.id
LEFT JOIN C on C.a_id = A.id
GROUP BY A.id
See SQLFiddle MySQL demo.
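The row multiplication described in the question, and one way around it, can be reproduced with the sketch below. It loads the sample data into an in-memory SQLite database via Python's sqlite3 module; SQLite and the derived-table aliases bc and cc are assumptions for this illustration. Because the sample tables B and C have no id column, the sketch counts fruit and color instead. The fix shown aggregates each child table on its own before joining, so each join adds at most one row per A.id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, data1 REAL, data2 REAL);
CREATE TABLE B (a_id INTEGER, fruit TEXT, PRIMARY KEY (a_id, fruit));
CREATE TABLE C (a_id INTEGER, color TEXT, PRIMARY KEY (a_id, color));
INSERT INTO A VALUES (1, 0.45, 0.79), (2, -2.24, -0.25), (3, 1.69, 1.23);
INSERT INTO B VALUES (1, 'apple'), (1, 'banana'), (2, 'apple');
INSERT INTO C VALUES (2, 'blue'), (2, 'purple'), (3, 'blue');
""")

# Naive version: both joins fan out before the counts are taken.
naive = conn.execute("""
SELECT A.id, COUNT(B.fruit) AS b_count, COUNT(C.color) AS c_count
FROM A
LEFT JOIN B ON B.a_id = A.id
LEFT JOIN C ON C.a_id = A.id
GROUP BY A.id
ORDER BY A.id
""").fetchall()

# Pre-aggregated version: count each child table separately, then join.
fixed = conn.execute("""
SELECT A.id, COALESCE(bc.n, 0) AS b_count, COALESCE(cc.n, 0) AS c_count
FROM A
LEFT JOIN (SELECT a_id, COUNT(*) AS n FROM B GROUP BY a_id) bc ON bc.a_id = A.id
LEFT JOIN (SELECT a_id, COUNT(*) AS n FROM C GROUP BY a_id) cc ON cc.a_id = A.id
ORDER BY A.id
""").fetchall()

print(naive)
print(fixed)
```

With this data only id 2 is affected by the fan-out, since it is the only row with matches in both B and C.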

Oracle SQL with Join and count

I would like to join two tables, A and B, with a count function.
Table A contains the following:
SQL> select a.book_id, count(a.book_id)
from
a
group by a.book_id ;
BOOK_ID COUNT(A.BOOK_ID)
--------- ----------------
1 2
2 2
3 2
4 2
5 2
6 3
and table B contains the following:
SQL> select b.book_id, count(b.book_id)
from
b
group by b.book_id ;
BOOK_ID COUNT(B.BOOK_ID)
--------- ----------------
6 2
So I would like to have a query which gives me the following result:
BOOK_ID COUNT(A.BOOK_ID) COUNT(B.BOOK_ID)
--------- ---------------- ----------------
1 2 0
2 2 0
3 2 0
4 2 0
5 2 0
6 3 2
I tried this :
SQL> select b.book_id, count(b.book_id),a.book_id, count(a.book_id)
from
b , a
where
b.book_id(+) = a.book_id
group by b.book_id, a.book_id ;
but the results were like this :
  BOOK_ID COUNT(B.BOOK_ID)   BOOK_ID COUNT(A.BOOK_ID)
--------- ---------------- --------- ----------------
                         0         1                2
                         0         2                2
                         0         3                2
                         0         4                2
                         0         5                2
        6                6         6                6
Something like this perhaps:
select a.book_id as id_a,
(select count(1) from a a2 where a2.book_id = a.book_id) as count_a,
b.book_id as id_b,
(select count(1) from b b2 where b2.book_id = a.book_id) as count_b
from a
left join b on b.book_id = a.book_id
group by a.book_id, b.book_id;
Another way to do it:
WITH total_list AS (
SELECT a.Book_id, 'a' AS a_cnt, NULL AS b_cnt FROM a
UNION ALL
SELECT b.Book_id, NULL, 'b' FROM b)
SELECT Book_id, COUNT(a_cnt) AS a_total, COUNT(b_cnt) AS b_total
FROM total_list
GROUP BY Book_id
ORDER BY Book_id
This uses a CTE to union both of your book tables, tagging each row with a flag that shows whether it came from table a or table b. The outer query then counts the flags per Book_id; since COUNT ignores NULLs, each column counts only the rows from its own table.
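The flag-and-count approach can be checked with the following sketch, which runs on an in-memory SQLite database through Python's sqlite3 module rather than on Oracle (an assumption for this illustration). The individual rows are also assumed: the question shows only the per-book counts, so the tables are filled with rows that reproduce those counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (book_id INTEGER);
CREATE TABLE b (book_id INTEGER);
""")
# Books 1 to 5 appear twice in a, book 6 three times; b holds book 6 twice.
a_rows = [(i,) for i in (1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6)]
conn.executemany("INSERT INTO a VALUES (?)", a_rows)
conn.executemany("INSERT INTO b VALUES (?)", [(6,), (6,)])

# Each branch fills only its own flag column; COUNT skips the NULLs.
result = conn.execute("""
WITH total_list AS (
    SELECT book_id, 'a' AS a_cnt, NULL AS b_cnt FROM a
    UNION ALL
    SELECT book_id, NULL, 'b' FROM b
)
SELECT book_id, COUNT(a_cnt) AS a_total, COUNT(b_cnt) AS b_total
FROM total_list
GROUP BY book_id
ORDER BY book_id
""").fetchall()

for row in result:
    print(row)
```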

Postgresql query with left join and having

Postgresql 9.1: I have a query that must return the values of a second table only if the aggregate function SUM of two columns is greater than zero.
This is the data:
Table a
id
---
1
2
3
Table b
id fk(table a)
---------------
1 1
2 null
3 3
Table c
id fk(table b) amount price
-----------------------------------
1 1 1 10 --positive
2 1 1 -10 --negative
3 3 2 5
As you can see, table b references some ids from table a, and table c can have one or more references to table b. Rows from table c should be retrieved only if sum(amount * price) > 0 for their table b id.
I wrote this query:
SELECT
a.id, b.id, SUM(c.amount * c.price) amount
FROM
tablea a
LEFT JOIN
tableb b ON b.fk = a.id
LEFT JOIN
tablec c ON c.fk = b.id
GROUP BY
a.id, b.id
HAVING
SUM(c.amount * c.price) > 0
But this query does not return all rows from table a; it returns only one row, and I need the other two as well. I understand this happens because of the HAVING clause, but I don't know how to rewrite it.
Expected result
a b sum
------------------
1 null null -- the sum is 1 * 10 + 1 * (-10) = 0 (rows 1 and 2), so it is not retrieved.
2 null null -- no foreign key in second table
3 3 10 -- the sum of 2 * 5 (row 3) > 0 so it's ok.
Try this:
SELECT A.ID, B.ID, C.ResultSum
FROM TableA A
LEFT JOIN TableB B ON (B.FK = A.ID)
LEFT JOIN (
SELECT FK, SUM(Amount * Price) AS ResultSum
FROM TableC
GROUP BY FK
) C ON (C.FK = B.ID) AND (ResultSum > 0)
See demo here.
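The answer's query can be run against the question's data with the sketch below, using an in-memory SQLite database via Python's sqlite3 module instead of PostgreSQL 9.1 (an assumption for this illustration; the column name fk stands in for the question's foreign-key columns). Moving the sum > 0 test into the join condition, instead of HAVING, is what keeps every row of table a in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablea (id INTEGER PRIMARY KEY);
CREATE TABLE tableb (id INTEGER PRIMARY KEY, fk INTEGER);
CREATE TABLE tablec (id INTEGER PRIMARY KEY, fk INTEGER, amount INTEGER, price INTEGER);
INSERT INTO tablea VALUES (1), (2), (3);
INSERT INTO tableb VALUES (1, 1), (2, NULL), (3, 3);
INSERT INTO tablec VALUES (1, 1, 1, 10), (2, 1, 1, -10), (3, 3, 2, 5);
""")

# Aggregate table c per fk first, then join only the positive sums.
# Rows of a (and b) without a qualifying sum survive with NULL.
rows = conn.execute("""
SELECT A.id, B.id, C.ResultSum
FROM tablea A
LEFT JOIN tableb B ON B.fk = A.id
LEFT JOIN (
    SELECT fk, SUM(amount * price) AS ResultSum
    FROM tablec
    GROUP BY fk
) C ON C.fk = B.id AND C.ResultSum > 0
ORDER BY A.id
""").fetchall()

for row in rows:
    print(row)
```

Note that for a.id = 1 this returns b.id = 1 with a NULL sum, whereas the question's expected result shows NULL for b as well.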