SQL aggregation query, grouping by entries in junction table

I have TableA in a many-to-many relationship with TableC via TableB. That is,
TableA: id | val
TableB: fkeyA | fkeyC
TableC: id | data
I wish to select sum(val) from TableA, grouping by the relationship(s) to TableC. Every entry in TableA has at least one relationship with TableC. For example,
TableA
1 | 25
2 | 30
3 | 50
TableB
1 | 1
1 | 2
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
should output
75
30
since rows 1 and 3 in TableA have the same relationships to TableC, but row 2 in TableA has a different relationship to TableC.
How can I write a SQL query for this?

SELECT
sum(tableA.val) as sumVal,
tableC.data
FROM
tableA
inner join tableB ON tableA.id = tableB.fkeyA
INNER JOIN tableC ON tableB.fkeyC = tableC.id
GROUP by tableC.data
edit
Ah ha - I now see what you're getting at. Let me try again:
SELECT
sum(val) as sumVal,
tableCGroup
FROM
(
SELECT
tableA.val,
(
SELECT cast(tableB.fkeyC as varchar) + ','
FROM tableB WHERE tableB.fKeyA = tableA.id
ORDER BY tableB.fkeyC
FOR XML PATH('')
) as tableCGroup
FROM
tableA
) tmp
GROUP BY
tableCGroup

Hm, in MySQL it could be written like this:
SELECT
SUM(val) AS sumVal
FROM
( SELECT
fkeyA
, GROUP_CONCAT(fkeyC ORDER BY fkeyC) AS grpC
FROM
TableB
GROUP BY
fkeyA
) AS g
JOIN
TableA a
ON a.id = g.fkeyA
GROUP BY
grpC
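The key-list idea above can be tried on the sample data from the question. This is only a sketch, assuming SQLite through Python's sqlite3 module rather than MySQL: SQLite's GROUP_CONCAT is fed from an ordered subquery here instead of using MySQL's ORDER BY inside the aggregate, and that ordering trick is an assumption about SQLite's behaviour, not something from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, val INTEGER);
    CREATE TABLE TableB (fkeyA INTEGER, fkeyC INTEGER);
    INSERT INTO TableA VALUES (1, 25), (2, 30), (3, 50);
    INSERT INTO TableB VALUES (1,1),(1,2),(2,1),(2,2),(2,3),(3,1),(3,2);
""")

# Build one key list per TableA row, then sum val per distinct key list.
query = """
    SELECT g.grpC, SUM(a.val) AS sumVal
    FROM (
        SELECT fkeyA, GROUP_CONCAT(fkeyC) AS grpC
        FROM (SELECT fkeyA, fkeyC FROM TableB ORDER BY fkeyA, fkeyC)
        GROUP BY fkeyA
    ) AS g
    JOIN TableA AS a ON a.id = g.fkeyA
    GROUP BY g.grpC
"""
sums = {key_list: total for key_list, total in conn.execute(query)}
print(sums)
```

Rows 1 and 3 share a key list, so their values are summed together, while row 2 stays in its own group.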

SELECT sum(a.val)
FROM tablea a
INNER JOIN tableb b ON (b.fKeyA = a.id)
GROUP BY b.fKeyC

It seems a key_list is needed in order to allow the group by:
75 -> key list = "1 2"
30 -> key list = "1 2 3"
Because GROUP_CONCAT doesn't exist in T-SQL:
WITH CTE ( Id, key_list )
AS ( SELECT TableA.id, CAST( '' AS VARCHAR(8000) )
FROM TableA
GROUP BY TableA.id
UNION ALL
SELECT TableA.id, CAST( key_list + ' ' + str(TableB.id) AS VARCHAR(8000) )
FROM CTE c
INNER JOIN TableA A
ON c.Id = A.id
INNER join TableB B
ON B.Id = A.id
WHERE A.id > c.id --avoid infinite loop
)
Select
sum( val )
from
TableA inner join
CTE on (tableA.id = CTE.id)
group by
CTE.key_list

Related

How to select rows by max value from another column in Oracle

I have two datasets in Oracle Table1 and Table2.
When I run this:
SELECT A.ID, B.NUM_X
FROM TABLE1 A
LEFT JOIN TABLE2 B ON A.ID=B.ID
WHERE B.BOOK = 1
It returns this.
ID NUM_X
1 10
1 5
1 9
2 2
2 1
3 20
3 11
What I want are the DISTINCT ID where NUM_X is the MAX value, something like this:
ID NUM_x
1 10
2 2
3 20
You can use aggregation:
SELECT A.ID, MAX(B.NUM_X)
FROM TABLE1 A LEFT JOIN
TABLE2 B
ON A.ID = B.ID
WHERE B.BOOK = 1
GROUP BY A.ID;
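The aggregation above can be checked against the numbers from the question. A minimal sketch, assuming SQLite via Python's sqlite3 instead of Oracle, and assuming every TABLE2 row has BOOK = 1 (the question does not show that column's values):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (ID INTEGER PRIMARY KEY);
    CREATE TABLE TABLE2 (ID INTEGER, NUM_X INTEGER, BOOK INTEGER);
    INSERT INTO TABLE1 VALUES (1), (2), (3);
    -- BOOK values are assumed; only ID and NUM_X come from the question
    INSERT INTO TABLE2 VALUES
        (1, 10, 1), (1, 5, 1), (1, 9, 1),
        (2, 2, 1), (2, 1, 1),
        (3, 20, 1), (3, 11, 1);
""")

# One row per ID, carrying the largest NUM_X for that ID.
rows = conn.execute("""
    SELECT A.ID, MAX(B.NUM_X)
    FROM TABLE1 A
    LEFT JOIN TABLE2 B ON A.ID = B.ID
    WHERE B.BOOK = 1
    GROUP BY A.ID
    ORDER BY A.ID
""").fetchall()
print(rows)  # [(1, 10), (2, 2), (3, 20)]
```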
If you wanted additional columns, I would recommend window functions:
SELECT A.ID, B.NUM_X
FROM TABLE1 A JOIN
(SELECT B.*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NUM_X DESC) as seqnum
FROM TABLE2 B
WHERE B.BOOK = 1 -- filter before numbering, so seqnum = 1 is the top BOOK = 1 row
) B
ON A.ID = B.ID AND B.seqnum = 1;

How to select rows from another table based on a table containing min and max values

TableA:
Userid sessionid domain_value tag
---------------------------------
1 20 amex bank
1 40 visa bank
2 10 citibank bank
2 20 amex bank
2 30 amex bank
TableB:
Userid sessionid(min) sessionid(max)
------------------------------------
1 20 40
2 10 30
3
4
5
How to retrieve all the rows from TableA based on values in TableB?
select *
from TableA a
inner join TableB b on a.userid = b.userid
where a.sessionid between (select b.[sessionid(min)] from TableB b)
and (select b.[sessionid(max)] from TableB b)
Assuming TableB has these column names:
Userid, sessionid_min, sessionid_max
try a plain BETWEEN on the min and max columns:
select *
from TableA a
inner join TableB b on a.userid = b.userid
where a.sessionid between b.sessionid_min
and b.sessionid_max
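Here is a small runnable sketch of that join, assuming SQLite via Python's sqlite3 and the renamed columns sessionid_min / sessionid_max. The row with sessionid 99 is a hypothetical addition (not in the question) so that the filter visibly excludes something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Userid INTEGER, sessionid INTEGER, domain_value TEXT, tag TEXT);
    CREATE TABLE TableB (Userid INTEGER, sessionid_min INTEGER, sessionid_max INTEGER);
    INSERT INTO TableA VALUES
        (1, 20, 'amex', 'bank'), (1, 40, 'visa', 'bank'),
        (2, 10, 'citibank', 'bank'), (2, 20, 'amex', 'bank'), (2, 30, 'amex', 'bank'),
        (2, 99, 'visa', 'bank');  -- hypothetical row outside user 2's range
    INSERT INTO TableB VALUES (1, 20, 40), (2, 10, 30), (3, NULL, NULL);
""")

# Keep only the TableA rows whose sessionid lies in that user's [min, max] range.
matches = conn.execute("""
    SELECT a.Userid, a.sessionid, a.domain_value
    FROM TableA a
    INNER JOIN TableB b ON a.Userid = b.Userid
    WHERE a.sessionid BETWEEN b.sessionid_min AND b.sessionid_max
    ORDER BY a.Userid, a.sessionid
""").fetchall()
print(matches)
```

The five rows from the question come back, and the hypothetical out-of-range row does not.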
How to retrieve all the rows from TableA based on values in TableB?
In order to retrieve rows from TableA, select from TableA. If you want to filter based on values in TableB, place an appropriate WHERE clause in the query. There is no need to join here.
select *
from tablea a
where exists
(
select null
from tableb b
where a.sessionid between b.sessionid_min and b.sessionid_max
)
order by userid, sessionid;
(You can achieve the same with a join, but the intention would not be as clear from reading the query.)
You may try this:
select t.Userid,
(select min(sessionid) from TableA as tb where tb.Userid = t.Userid) as Min,
(select max(sessionid) from TableA as tb where tb.Userid = t.Userid) as Max
from TableA as t group by t.Userid

Tips for Creating Summary Count of Value in other Tables

I have multiple tables with a status column in each. I want to display a summary of the counts of each status per table. Something like this:
=============================================
Status | Table A | Table B | Table C |
Status A | 3 | 8 | 2 |
Status B | 5 | 7 | 4 |
==============================================
I need help getting started as I'm not sure how to approach this issue. I can do simple COUNT functions like:
SELECT status, count(status) from TABLE_A group by status
But I'm not sure how to populate the data in the form I want or how to, if possible, use the table names as the column headers. I'd appreciate a point in the right direction. Thanks!
Maybe try doing left joins after you have calculated the counts for each table separately. Something like:
select t1.status,
count(t1.status) as [TableA],
t2.TableB,
t3.TableC
from TableA t1
left join (
select status,
count(status) as [TableB] from TableB
group by status
) t2 on t1.status = t2.status
left join (
select status,
count(status) as [TableC] from TableC
group by status
) t3 on t1.status = t3.status
group by t1.status, t2.TableB, t3.TableC
I would use union all and aggregation:
select status, sum(a) as a, sum(b) as b, sum(c) as c
from ((select status, count(*) as a, 0 as b, 0 as c
from tablea
group by status
) union all
(select status, 0, count(*), 0
from tableb
group by status
) union all
(select status, 0, 0, count(*)
from tablec
group by status
)
) abc
group by status;
This ensures that all rows appear, even when one or more tables are missing some values of status.
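A runnable sketch of the union all approach, assuming SQLite via Python's sqlite3, with the third branch reading from tablec. The three tables and their contents are hypothetical sample data, chosen so that each status is missing from one of the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical sample data: 'B' is missing from tableb, 'A' from tablec
    CREATE TABLE tablea (status TEXT);
    CREATE TABLE tableb (status TEXT);
    CREATE TABLE tablec (status TEXT);
    INSERT INTO tablea VALUES ('A'), ('A'), ('B');
    INSERT INTO tableb VALUES ('A');
    INSERT INTO tablec VALUES ('B'), ('B');
""")

# Each branch fills one count column and pads the others with 0;
# the outer SUM collapses the branches into one row per status.
summary = conn.execute("""
    SELECT status, SUM(a) AS a, SUM(b) AS b, SUM(c) AS c
    FROM (
        SELECT status, COUNT(*) AS a, 0 AS b, 0 AS c FROM tablea GROUP BY status
        UNION ALL
        SELECT status, 0, COUNT(*), 0 FROM tableb GROUP BY status
        UNION ALL
        SELECT status, 0, 0, COUNT(*) FROM tablec GROUP BY status
    ) AS abc
    GROUP BY status
    ORDER BY status
""").fetchall()
print(summary)  # [('A', 2, 1, 0), ('B', 1, 0, 2)]
```

A status that is absent from a table shows up with a count of 0 rather than disappearing.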
You could also use left joins:
select t.status, a.cnt A, b.cnt B, c.cnt C
from (
select status
from tableA
union
select status
from tableB
union
select status
from tableC
) t
left join (
select status, count(*) cnt
from tableA
group by status
) a on t.status = a.status
left join (
select status, count(*) cnt
from tableB
group by status
) b on t.status = b.status
left join (
select status, count(*) cnt
from tableC
group by status
) c on t.status = c.status

sql joining table, not providing perfect result

Let me explain what I need. I have two tables named A and B; B is a sub-table of A.
Here is the schema:
------------------------
Table B:
itemId version qty AId
44 1 1 200
44 1 2 201
44 2 2 200
------------------------
Table A:
id tId
200 100
201 100
------------------------
And here is what I need: the sum of qty over all latest-version rows that have the same tId.
Here is my query:
select sum(qty) as sum from B
left join A on A.id=B.AId
where itemId=44 and tId=100 and
version=(select max(version) from B where itemId=44 and tId=100)
The result is wrong when one item gets a version 2: the version 1 rows are then ignored.
Thanks.
EDIT:
What exactly I need is:
itemId version qty AId
44 2 2 200
44 1 2 201
And the result of Sum(qty) must be 4, because these rows have the same tId and each has the max version within its AId.
Use window function.
select itemid, version, qty, aid
from (
select *, max(version) over (partition by AId) as latestVersion
from B
) as B
where version = latestVersion
to sum up
select tId, SUM(qty) AS qty_sum
from (
select *, max(version) over (partition by AId) as latestVersion
from B
) as B
join A on B.AId = A.id
where version = latestVersion
group by tId
Working solution
select b.* from B as b
inner join
(select AID,itemId,Max(version) as mVersion from B
group by AID,itemID) d
on b.AID = d.AID and b.itemID = d.itemID and b.Version = d.mVersion
inner join A as a
on B.AID = a.id
where b.itemID = 44 --apply if you need
result
itemid version qty aid
44 2 2 200
44 1 2 201
This will give you the sum of the quantities:
select itemID,sum(qty) from (
select b.* from B as b
inner join
(select AID,itemId,Max(version) as mVersion from B
group by AID,itemID) d
on b.AID = d.AID and b.itemID = d.itemID and b.Version = d.mVersion
inner join A as a
on B.AID = a.id
where b.itemID = 44 --apply if you need
) e group by itemID
result
itemid sum
44 4
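The join against the per-AId maximum version can be run on the data from the question. A sketch, assuming SQLite via Python's sqlite3; the aliases cur and latest are illustrative names, not from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, tId INTEGER);
    CREATE TABLE B (itemId INTEGER, version INTEGER, qty INTEGER, AId INTEGER);
    INSERT INTO A VALUES (200, 100), (201, 100);
    INSERT INTO B VALUES (44, 1, 1, 200), (44, 1, 2, 201), (44, 2, 2, 200);
""")

# 'latest' holds the highest version per (AId, itemId); joining back to B
# keeps only those rows, and the outer SUM adds up their qty.
totals = conn.execute("""
    SELECT cur.itemId, SUM(cur.qty)
    FROM B AS cur
    JOIN (
        SELECT AId, itemId, MAX(version) AS mVersion
        FROM B
        GROUP BY AId, itemId
    ) AS latest
      ON cur.AId = latest.AId
     AND cur.itemId = latest.itemId
     AND cur.version = latest.mVersion
    JOIN A ON A.id = cur.AId
    WHERE cur.itemId = 44
    GROUP BY cur.itemId
""").fetchall()
print(totals)  # [(44, 4)]
```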
Try this one:
DECLARE @TA TABLE (id int, tid int)
DECLARE @TB TABLE (itemid int, version int, qty int, AID int)
INSERT INTO @TA
SELECT 200, 100
UNION ALL
SELECT 201, 100
INSERT INTO @TB
SELECT 44,1,1,201
UNION ALL
SELECT 44,1,2,200
UNION ALL
SELECT 44,2,3,200
UNION ALL
SELECT 44,2,5,201
DECLARE @tid int
SET @tid = 100
SELECT XB.* FROM @TB XB INNER JOIN
(SELECT Version, Max(AID) Aid FROM @TA A INNER JOIN @TB B ON A.id = B.AID AND tid = @tid GROUP BY Version) X
ON X.version = XB.version AND XB.AID = X.Aid
I think this query will help you solve your problem:
SELECT itemId, version, qty, AId FROM (
SELECT itemId, version, qty, AId, tId FROM b
LEFT JOIN a ON (b.aid = a.id)
) temp
WHERE version = (SELECT MAX(version) FROM b WHERE b.aid = temp.aid)
and temp.tId = 100 and temp.itemId = 44
SELECT B.*
FROM B
INNER JOIN
(SELECT Aid,MAX(version) AS version FROM B WHERE itemId=44 GROUP BY AId) AS B1
ON B.Aid=B1.Aid
AND B.version=B1.version
INNER JOIN
(SELECT * FROM A WHERE tId=100) AS A
ON A.id=B.Aid
Order BY B.aid
For Sum of qty
SELECT SUM(B.qty)
FROM B
INNER JOIN
(SELECT Aid,MAX(version) AS version FROM B WHERE itemId=44 GROUP BY AId) AS B1
ON B.Aid=B1.Aid AND B.version=B1.version
INNER JOIN
(SELECT * FROM A WHERE tId=100) AS A
ON A.id=B.Aid
GROUP BY A.tid
Output
itemid version qty aid
44 2 2 200
44 1 2 201
Demo
http://sqlfiddle.com/#!17/092dd/5
The most efficient solution to greatest-n-per-group problems in Postgres is typically the (proprietary) distinct on () operator.
So to get the latest version for each a.id, you can use:
select distinct on (a.id) b.*
from a
join b on a.id = b.aid
order by a.id, b.version desc;
The above returns:
itemid | version | qty | aid
-------+---------+-----+----
44 | 2 | 2 | 200
44 | 1 | 2 | 201
You can then sum over the result:
select sum(qty)
from (
select distinct on (a.id) b.qty
from a
join b on a.id = b.aid
order by a.id, b.version desc
) t;
Note that normally an order by in a derived table is useless, but in this case it's needed because otherwise distinct on () wouldn't work.
Online example: http://rextester.com/DRHK19268

Select count of rows in two other tables

I have 3 tables. The main one in which I want to retrieve some information and two others for row count only.
I used a query like this:
SELECT A.*,
COUNT(B.id) AS b_count
FROM A
LEFT JOIN B on B.a_id = A.id
WHERE A.id > 50 AND B.ID < 100
GROUP BY A.id
from Gerry Shaw's comment here. It works perfectly but only for one table.
Now I need to add the row count for the third (C) table. I tried
SELECT A.*,
COUNT(B.id) AS b_count,
COUNT(C.id) AS c_count
FROM A
LEFT JOIN B on B.a_id = A.id
LEFT JOIN C on C.a_id = A.id
GROUP BY A.id
but, because of the two left joins, my b_count and my c_count are wrong and equal to each other. In fact both are equal to real_b_count*real_c_count. Any idea how I could fix this without adding a lot of complexity/subqueries?
Data sample as requested:
Table A (primary key : id)
id | data1 | data2
------+-------+-------
1 | 0,45 | 0,79
----------------------
2 | -2,24 | -0,25
----------------------
3 | 1,69 | 1,23
Table B (primary key : (a_id,fruit))
a_id | fruit
------+-------
1 | apple
------+-------
1 | banana
--------------
2 | apple
Table C (primary key : (a_id,color))
a_id | color
------+-------
2 | blue
------+-------
2 | purple
--------------
3 | blue
expected result:
id | data1 | data2 | b_count | c_count
------+-------+-------+---------+--------
1 | 0,45 | 0,79 | 2 | 0
----------------------+---------+--------
2 | -2,24 | -0,25 | 1 | 2
----------------------+---------+--------
3 | 1,69 | 1,23 | 0 | 1
There are two possible solutions. One uses subqueries in the SELECT list:
SELECT A.*,
(
SELECT COUNT(B.id) FROM B WHERE B.a_id = A.id AND B.ID < 100
) AS b_count,
(
SELECT COUNT(C.id) FROM C WHERE C.a_id = A.id
) AS c_count
FROM A
WHERE A.id > 50
The second joins two aggregate queries together:
SELECT t1.*, t2.c_count
FROM
(
SELECT A.*,
COUNT(B.id) AS b_count
FROM A
LEFT JOIN B on B.a_id = A.id
WHERE A.id > 50 AND B.ID < 100
GROUP BY A.id
) t1
JOIN
(
SELECT A.*,
COUNT(C.id) AS c_count
FROM A
LEFT JOIN C on C.a_id = A.id
WHERE A.id > 50
GROUP BY A.id
) t2 ON t1.id = t2.id
I prefer the second syntax since it clearly shows the optimizer that you are interested in GROUP BY, however, the query plans are usually the same.
If tables B and C also have their own key fields, then you can use COUNT(DISTINCT ...) on the primary key rather than the foreign key. That gets around the row multiplication you see when joining several tables. If you can post the table structures, then we can advise further.
Try something like this
SELECT A.*,
(SELECT COUNT(B.id) FROM B WHERE B.a_id = A.id) AS b_count,
(SELECT COUNT(C.id) FROM C WHERE C.a_id = A.id) AS c_count
FROM A
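To see why the double left join miscounts and the correlated subqueries do not, here is a sketch on the sample data from the question. It assumes SQLite via Python's sqlite3, writes the decimals with a point instead of a comma, and uses COUNT(*) because the posted B and C tables have no id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, data1 REAL, data2 REAL);
    CREATE TABLE B (a_id INTEGER, fruit TEXT, PRIMARY KEY (a_id, fruit));
    CREATE TABLE C (a_id INTEGER, color TEXT, PRIMARY KEY (a_id, color));
    INSERT INTO A VALUES (1, 0.45, 0.79), (2, -2.24, -0.25), (3, 1.69, 1.23);
    INSERT INTO B VALUES (1, 'apple'), (1, 'banana'), (2, 'apple');
    INSERT INTO C VALUES (2, 'blue'), (2, 'purple'), (3, 'blue');
""")

# Two left joins: each B row is paired with each C row of the same A.id,
# so the counts are inflated whenever both sides have matches.
naive = conn.execute("""
    SELECT A.id, COUNT(B.a_id), COUNT(C.a_id)
    FROM A
    LEFT JOIN B ON B.a_id = A.id
    LEFT JOIN C ON C.a_id = A.id
    GROUP BY A.id
    ORDER BY A.id
""").fetchall()

# Correlated subqueries: each count is computed independently per A row.
fixed = conn.execute("""
    SELECT A.id,
           (SELECT COUNT(*) FROM B WHERE B.a_id = A.id) AS b_count,
           (SELECT COUNT(*) FROM C WHERE C.a_id = A.id) AS c_count
    FROM A
    ORDER BY A.id
""").fetchall()

print(naive)  # [(1, 2, 0), (2, 2, 2), (3, 0, 1)]
print(fixed)  # [(1, 2, 0), (2, 1, 2), (3, 0, 1)]
```

Only id 2 has rows in both B and C, and that is exactly where the joined version reports 2 instead of 1 for b_count.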
This is the easiest way I can think of:
Create table #a (id int, data1 float, data2 float)
Create table #b (id int, fruit varchar(50))
Create table #c (id int, color varchar(50))
Insert into #a
SELECT 1, 0.45, 0.79
UNION ALL SELECT 2, -2.24, -0.25
UNION ALL SELECT 3, 1.69, 1.23
Insert into #b
SELECT 1, 'apple'
UNION ALL SELECT 1, 'banana'
UNION ALL SELECT 2, 'orange'
Insert into #c
SELECT 2, 'blue'
UNION ALL SELECT 2, 'purple'
UNION ALL SELECT 3, 'orange'
SELECT #a.*,
(SELECT COUNT(#b.id) FROM #b where #b.id = #a.id) AS b_count,
(SELECT COUNT(#c.id) FROM #c where #c.id = #a.id) AS c_count
FROM #a
ORDER BY #a.id
Result:
id data1 data2 b_count c_count
1 0,45 0,79 2 0
2 -2,24 -0,25 1 2
3 1,69 1,23 0 1
If table b and c have unique id, you can try this:
SELECT A.*,
COUNT(distinct B.fruit) AS b_count,
COUNT(distinct C.color) AS c_count
FROM A
LEFT JOIN B on B.a_id = A.id
LEFT JOIN C on C.a_id = A.id
GROUP BY A.id
See SQLFiddle MySQL demo.