Count/sum rows in multiple related tables - sql

I have a complex select that - when simplified - looks like this:
select m.ID,
(select sum(AMOUNT) from A where M_ID = m.ID) sumA,
(select sum(AMOUNT) from B where M_ID = m.ID) sumB,
.....
from M;
The tables A,B,... have a foreign key M_ID pointing into table M.
The problem is that this select is very slow. I'd like to rewrite it using table joins, but I don't know how, because
select m.ID,
sum(a.AMOUNT),
sum(b.AMOUNT),
.....
from M
join A on a.M_ID = m.ID
join B on b.M_ID = m.ID
....
group by m.ID;
gives incorrect (much higher) sum results, as each row in A or B can be counted multiple times.
Is there a way to write that select efficiently, e.g. using analytic functions or some other technique?
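The row multiplication described above can be reproduced on a tiny data set. The following is only a sketch: it uses Python's built-in sqlite3 module instead of Oracle, and the table contents are made up for illustration (one M row, two A rows, three B rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE M (ID INTEGER PRIMARY KEY);
    CREATE TABLE A (ID INTEGER PRIMARY KEY, M_ID INTEGER, AMOUNT INTEGER);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, M_ID INTEGER, AMOUNT INTEGER);
    INSERT INTO M VALUES (1);
    INSERT INTO A (M_ID, AMOUNT) VALUES (1, 10), (1, 20);        -- true sum: 30
    INSERT INTO B (M_ID, AMOUNT) VALUES (1, 1), (1, 2), (1, 3);  -- true sum: 6
""")

# Correlated subqueries: each sum is computed independently of the other table.
subquery_row = conn.execute("""
    SELECT m.ID,
           (SELECT SUM(AMOUNT) FROM A WHERE M_ID = m.ID),
           (SELECT SUM(AMOUNT) FROM B WHERE M_ID = m.ID)
    FROM M m
""").fetchone()

# Plain joins: 2 A rows x 3 B rows give 6 joined rows for m.ID = 1,
# so every A amount is summed 3 times and every B amount 2 times.
joined_row = conn.execute("""
    SELECT m.ID, SUM(a.AMOUNT), SUM(b.AMOUNT)
    FROM M m
    JOIN A a ON a.M_ID = m.ID
    JOIN B b ON b.M_ID = m.ID
    GROUP BY m.ID
""").fetchone()

print(subquery_row)  # (1, 30, 6)
print(joined_row)    # (1, 90, 12)
```

The inflated values are exactly the true sums multiplied by the row count of the other table.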
Edit:
The explain plan for the original (not simplified) select looks like this:
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
|* 2 | FILTER | |
|* 3 | TABLE ACCESS BY INDEX ROWID| WORKITEM |
|* 4 | INDEX SKIP SCAN | WORKITEM_U01 |
|* 5 | FILTER | |
|* 6 | TABLE ACCESS FULL | RPRODUCT_INVENTORY_MASTER |
.....
| 31 | SORT AGGREGATE | |
|* 32 | FILTER | |
|* 33 | TABLE ACCESS BY INDEX ROWID| WORKITEM |
|* 34 | INDEX SKIP SCAN | WORKITEM_U01 |
|* 35 | FILTER | |
|* 36 | TABLE ACCESS FULL | RPRODUCT_INVENTORY_MASTER |
| 37 | SORT GROUP BY | |
| 38 | TABLE ACCESS FULL | RPRODUCT |
That's why I want to optimize it. Moreover, the AWR report shows that this select has 50000 gets/exec.
Edit2,3:
The whole select looks like this:
SELECT rprd.ID,
rprd.NAME,
(select sum(AMOUNT) from WORKITEM
where ACTION='REMOVE'
and trunc(CREATED_DATE) = to_date(:1,'DDMMYYYY')
and PAYEE_ID in
(select rim.RPRODUCT_ID from RPRODUCT_INVENTORY_MASTER rim
where rprd.ID = rim.RPRODUCT_ID
and rim.INVENTORY_DATE = to_date(:2,'DDMMYYYY'))),
.....
(select sum(AMOUNT) from WORKITEM
where ACTION='COLLECT'
and trunc(CREATED_DATE) < to_date(:11,'DDMMYYYY')
and PAYEE_ID in
(select rim.RPRODUCT_ID from RPRODUCT_INVENTORY_MASTER rim
where rprd.ID = rim.RPRODUCT_ID
and rim.INVENTORY_DATE < to_date(:12,'DDMMYYYY')))
FROM RPRODUCT rprd
GROUP BY rprd.ID, rprd.NAME
ORDER BY rprd.ID
;
I didn't write it :-), I'm about to re-write it. Note, there are differences in comparison operators, in ACTION values, in dates to compare INVENTORY_DATE to.
Edit4:
I tried to rewrite the query like this (and the exec plan looks better), but have run into the "row multiplicity" issues described above:
with RPRODUCT_INVENTORY_MASTER# as (
select RPRODUCT_ID, min(INVENTORY_DATE) INVENTORY_DATE
from RPRODUCT_INVENTORY_MASTER
group by RPRODUCT_ID),
WORKITEM# as (
select AMOUNT, PAYEE_ID, ACTION, trunc(CREATED_DATE) CREATED_DATE
from WORKITEM
where ACTION in ('REMOVE','ADD','COLLECT')
)
select rprd.ID,
rprd.NAME,
-- sum(wip2.AMOUNT), -- this is singular because of '=' in inventory_date comparison
sum(abs(wip4.AMOUNT)),
.....
sum(wip12.AMOUNT)
from RPRODUCT rprd
left join RPRODUCT_INVENTORY_MASTER# rim4 on rim4.RPRODUCT_ID = rprd.ID
and rim4.INVENTORY_DATE <= to_date(:4 ,'DDMMYYYY')
left join WORKITEM# wip4 on wip4.PAYEE_ID = rim4.RPRODUCT_ID
and wip4.ACTION='REMOVE'
and wip4.CREATED_DATE = to_date(:3 ,'DDMMYYYY')
.....
left join RPRODUCT_INVENTORY_MASTER# rim12 on rim12.RPRODUCT_ID = rprd.ID
and rim12.INVENTORY_DATE < to_date(:12 ,'DDMMYYYY')
left join WORKITEM# wip12 on wip12.PAYEE_ID = rim12.RPRODUCT_ID
and wip12.ACTION='COLLECT'
and wip12.CREATED_DATE < to_date(:11 ,'DDMMYYYY')
group by rprd.ID, rprd.NAME
order by rprd.ID
;
RPRODUCT_INVENTORY_MASTER# always gives at most one row for each rprd.ID. WORKITEM# can have any number of rows for each RPRODUCT_ID = rprd.ID.

Yes, this is a typical problem. I like your original query for its clarity. However, if you are running into performance issues, you have to consider other options.
Here is one option. Since the A and B rows get multiplied by each other, you can simply divide each sum by the number of rows of the other table. Admittedly, this looks rather strange.
select m.ID,
sum(a.AMOUNT) / count(distinct b.id),
sum(b.AMOUNT) / count(distinct a.id),
.....
from M
join A on a.M_ID = m.ID
join B on b.M_ID = m.ID
....
group by m.ID;
The other option, which I would prefer, is to aggregate first, so that there is at most one A row and one B row per m.ID in the first place:
select m.ID,
a_agg.SUM_AMOUNT,
b_agg.SUM_AMOUNT,
.....
from M
join (select M_ID, sum(AMOUNT) as SUM_AMOUNT from A group by M_ID) a_agg
on a_agg.M_ID = m.ID
join (select M_ID, sum(AMOUNT) as SUM_AMOUNT from B group by M_ID) b_agg
on b_agg.M_ID = m.ID;
EDIT: In case an M_ID might not have any A or any B, you would have to replace the joins with LEFT JOIN in both queries. Then in the first query select:
nvl(sum(a.AMOUNT), 0) / greatest(count(distinct b.id), 1),
nvl(sum(b.AMOUNT), 0) / greatest(count(distinct a.id), 1),
And in the second query:
nvl(a_agg.SUM_AMOUNT, 0),
nvl(b_agg.SUM_AMOUNT, 0),
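Both LEFT JOIN variants can be checked against each other on sample data. This is a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module, so COALESCE stands in for Oracle's nvl and the two-argument scalar MAX stands in for greatest; the sample rows are invented so that M 2 has no B rows and M 3 has neither A nor B rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE M (ID INTEGER PRIMARY KEY);
    CREATE TABLE A (ID INTEGER PRIMARY KEY, M_ID INTEGER, AMOUNT INTEGER);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, M_ID INTEGER, AMOUNT INTEGER);
    INSERT INTO M VALUES (1), (2), (3);
    INSERT INTO A (M_ID, AMOUNT) VALUES (1, 10), (1, 20), (2, 5);
    INSERT INTO B (M_ID, AMOUNT) VALUES (1, 1), (1, 2), (1, 3);
""")

# Variant 1: join everything, then divide each sum by the other table's row count.
divided = conn.execute("""
    SELECT m.ID,
           COALESCE(SUM(a.AMOUNT), 0) / MAX(COUNT(DISTINCT b.ID), 1),
           COALESCE(SUM(b.AMOUNT), 0) / MAX(COUNT(DISTINCT a.ID), 1)
    FROM M m
    LEFT JOIN A a ON a.M_ID = m.ID
    LEFT JOIN B b ON b.M_ID = m.ID
    GROUP BY m.ID
    ORDER BY m.ID
""").fetchall()

# Variant 2: aggregate each child table first, then join one row per M_ID.
pre_aggregated = conn.execute("""
    SELECT m.ID,
           COALESCE(a_agg.SUM_AMOUNT, 0),
           COALESCE(b_agg.SUM_AMOUNT, 0)
    FROM M m
    LEFT JOIN (SELECT M_ID, SUM(AMOUNT) AS SUM_AMOUNT FROM A GROUP BY M_ID) a_agg
           ON a_agg.M_ID = m.ID
    LEFT JOIN (SELECT M_ID, SUM(AMOUNT) AS SUM_AMOUNT FROM B GROUP BY M_ID) b_agg
           ON b_agg.M_ID = m.ID
    ORDER BY m.ID
""").fetchall()

print(divided)         # [(1, 30, 6), (2, 5, 0), (3, 0, 0)]
print(pre_aggregated)  # [(1, 30, 6), (2, 5, 0), (3, 0, 0)]
```

The pre-aggregated form never builds the multiplied intermediate rows, which is one reason to prefer it when the child tables are large.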
EDIT: Here is your query modified. The trick is to join with distinct rims.
SELECT
rprd.ID,
rprd.NAME,
nvl(same_date.SUM_AMOUNT, 0),
.....
nvl(earlier_date.SUM_AMOUNT, 0)
FROM RPRODUCT rprd
LEFT JOIN
(
select rim.RPRODUCT_ID, sum(w.AMOUNT) as SUM_AMOUNT
from
(
select distinct RPRODUCT_ID
from RPRODUCT_INVENTORY_MASTER
where INVENTORY_DATE = to_date(:2,'DDMMYYYY')
) rim
left join WORKITEM w
on w.PAYEE_ID = rim.RPRODUCT_ID
and w.ACTION = 'REMOVE'
and trunc(w.CREATED_DATE) = to_date(:1,'DDMMYYYY')
group by rim.RPRODUCT_ID
) same_date on same_date.RPRODUCT_ID = rprd.ID
LEFT JOIN
(
select rim.RPRODUCT_ID, sum(w.AMOUNT) as SUM_AMOUNT
from
(
select distinct RPRODUCT_ID
from RPRODUCT_INVENTORY_MASTER
where INVENTORY_DATE < to_date(:12,'DDMMYYYY')
) rim
left join WORKITEM w
on w.PAYEE_ID = rim.RPRODUCT_ID
and w.ACTION = 'COLLECT'
and trunc(w.CREATED_DATE) < to_date(:11,'DDMMYYYY')
group by rim.RPRODUCT_ID
) earlier_date on earlier_date.RPRODUCT_ID = rprd.ID
ORDER BY rprd.ID
;

This should work:
select m.ID,
a.aamount,
b.bamount
from M
inner join
(
select M_ID,sum(AMOUNT) as aamount
from A group by M_ID
) a
on a.M_ID = m.ID
inner join
(
select M_ID,sum(AMOUNT) as bamount
from B group by M_ID
) b
on b.M_ID = m.ID;

This should work regardless of the number of m_id rows in the A, B, C, ... tables:
select
M.id,
sum(decode(u.src, 'A', u.sumx, 0)) sum_a,
sum(decode(u.src, 'B', u.sumx, 0)) sum_b,
sum(decode(u.src, 'C', u.sumx, 0)) sum_c,
...
from M,
(select 'A' src, m_id, sum(amount) sumx from A group by m_id
union all
select 'B', m_id, sum(amount) from B group by m_id
union all
select 'C', m_id, sum(amount) from C group by m_id
...
) u
where
M.id=u.m_id
group by
M.id;
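A runnable sketch of the same UNION ALL pattern, assuming SQLite via Python's sqlite3 module: CASE replaces Oracle's decode, an explicit JOIN replaces the comma join, and the three small tables are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE M (id INTEGER PRIMARY KEY);
    CREATE TABLE A (m_id INTEGER, amount INTEGER);
    CREATE TABLE B (m_id INTEGER, amount INTEGER);
    CREATE TABLE C (m_id INTEGER, amount INTEGER);
    INSERT INTO M VALUES (1), (2);
    INSERT INTO A VALUES (1, 10), (1, 20);
    INSERT INTO B VALUES (1, 1), (2, 2);
    INSERT INTO C VALUES (2, 7), (2, 8);
""")

# Each branch of the UNION ALL yields at most one row per (src, m_id),
# so joining the stacked result to M cannot multiply rows.
rows = conn.execute("""
    SELECT M.id,
           SUM(CASE WHEN u.src = 'A' THEN u.sumx ELSE 0 END) AS sum_a,
           SUM(CASE WHEN u.src = 'B' THEN u.sumx ELSE 0 END) AS sum_b,
           SUM(CASE WHEN u.src = 'C' THEN u.sumx ELSE 0 END) AS sum_c
    FROM M
    JOIN (SELECT 'A' AS src, m_id, SUM(amount) AS sumx FROM A GROUP BY m_id
          UNION ALL
          SELECT 'B', m_id, SUM(amount) FROM B GROUP BY m_id
          UNION ALL
          SELECT 'C', m_id, SUM(amount) FROM C GROUP BY m_id) u
      ON M.id = u.m_id
    GROUP BY M.id
    ORDER BY M.id
""").fetchall()

print(rows)  # [(1, 30, 1, 0), (2, 0, 2, 15)]
```

Because this is an inner join, an M row with no rows in any of the child tables would be missing from the result; an outer join from M would be needed to keep it.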

Related

SQL - Distinct count between two tables

I'm having a mind lapse on what I believe is a relatively easy script. Hopefully I'm overthinking the logic.
What I'm trying to do is perform two counts on a distinct column which is right joined.
What I want is:
count(a.book_id) as count_of_books
count(b.book_ref_number) as count_of_losses
Expected Output
--------------------------------------------------------
| Book | count_of_books | count of losses|
--------------------------------------------------------
|Hunger Games | 76 | 31 |
--------------------------------------------------------
|Hop on Pop | 27 | 6 |
--------------------------------------------------------
|Pout Pout Fish | 138 | 43 |
--------------------------------------------------------
I have tried a couple different scripts. Here are the two scripts I've tried.
select
(select count(*) from Inventory_Table x ) Count1,
(select count(*) from Loss_table b ) Count2
from Inventory_Table x
right join Loss_table b on b.book_ref_number = x.book_id
where rownum < 20
select
a.book_name,
count(distinct a.book_id),
count(b.book_ref_number)
from Inventory_Table a
right join Loss_table b on trim(b.book_ref_number) = trim(a.book_id)
Results I get
--------------------------------------------------------
| Book | count_of_books | count of losses|
--------------------------------------------------------
|Moby Dick | 4376 | 2574 |
--------------------------------------------------------
I'm looking for guidance in my neglectful mistake. Thank you in advance
The condition rownum < 20 doesn't make sense here: it restricts the result to fewer than 20 rows.
Try this:
select * from (
select
a.mrch_Nr,
count(distinct a.fdr_trac_nr),
count(b.auth_id)
from DATASTORE_FD.DEB_CRD_AUTH_LOG_REC a
right join jordab26.ft b on trim(b.auth_id) = trim(a.fdr_trac_nr)
where a.auth_log_dt between '20200101' and '20200408'
group by a.mrch_nr
)
where rownum < 20
Try this; I'm not sure about rownum < 20. Also, make sure you add the correct GROUP BY clause.
select sum(case when book_id is null then 0 else 1 end ) count_of_books,
sum(case when book_ref_number is null then 0 else 1 end ) count_of_losses
from Inventory_Table x
right join Loss_table b on b.book_ref_number = x.book_id
where rownum < 20
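When counting non-NULL values with CASE, the form of the expression matters: a simple CASE that compares against NULL (case x when null ...) never matches, because x = NULL is not true, while the searched form (case when x is null ...) does. A small sketch on SQLite via Python's sqlite3 module, with a hypothetical one-column table t:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (book_id INTEGER);
    INSERT INTO t VALUES (1), (NULL), (3);
""")

simple_case, searched_case, plain_count = conn.execute("""
    SELECT SUM(CASE book_id WHEN NULL THEN 0 ELSE 1 END),    -- NULL never matches
           SUM(CASE WHEN book_id IS NULL THEN 0 ELSE 1 END), -- skips the NULL row
           COUNT(book_id)                                    -- COUNT ignores NULLs
    FROM t
""").fetchone()

print(simple_case, searched_case, plain_count)  # 3 2 2
```

COUNT(column) already ignores NULLs, so it gives the same result as the searched CASE with less code.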
Is this what you want?
select book_name,
count(distinct a.book_id) + sum(case when a.book_id is null then 1 else 0 end),
count(distinct b.id) as lossid
from Inventory_Table a
left join Loss_table b
on a.book_id = b.book_ref_number
group by book_name
SELECT book_name,COUNT(book_id),COUNT(book_ref_number) FROM Inventory_Table right join Loss_table on book_ref_number = book_id GROUP BY book_name
But if you need all the books in Inventory and only matching books from Loss_table then it should be left join:
SELECT book_name,COUNT(book_id),COUNT(book_ref_number) FROM Inventory_Table left join Loss_table on book_ref_number = book_id GROUP BY book_name
SELECT book_name,COUNT(book_id),COUNT(book_ref_number)
FROM Inventory_Table
right join Loss_table on book_ref_number = book_id GROUP BY book_name

Counting Records with Unique Field Value

Source Table
Assuming I have a table called MyTable with the content:
+----------+------+
| Category | Code |
+----------+------+
| A | A123 |
| A | B123 |
| A | C123 |
| B | A123 |
| B | B123 |
| B | D123 |
| C | A123 |
| C | E123 |
| C | F123 |
+----------+------+
I'm trying to count the number of Code values which are unique to each category.
Desired Result
For the above example, the result would be:
+----------+-------------+
| Category | UniqueCodes |
+----------+-------------+
| A | 1 |
| B | 1 |
| C | 2 |
+----------+-------------+
Since C123 is unique to A, D123 is unique to B, and E123 & F123 are unique to C.
What I've Tried
I'm able to obtain the result for a single category (e.g. C) using a query such as:
SELECT COUNT(a.Code) AS UniqueCodes
FROM
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category = "C"
) a
LEFT JOIN
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category <> "C"
) b
ON a.Code = b.Code
WHERE b.Code IS NULL
However, whilst I can hard-code a query for each category, I cannot seem to construct a single query to calculate this for every possible Category value.
Here is what I've tried:
SELECT c.Category,
(
SELECT COUNT(a.Code)
FROM
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category = c.Category
) a
LEFT JOIN
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category <> c.Category
) b
ON a.Code = b.Code
WHERE b.Code IS NULL
) AS UniqueCodes
FROM
(
SELECT MyTable.Category
FROM MyTable
GROUP BY MyTable.Category
) c
Though, the c.Category is not defined within the scope of the nested SELECT query.
Could anyone advise how I could obtain the desired result?
I would use NOT EXISTS and aggregate:
select category, count(*)
from MyTable t
where not exists (select 1 from MyTable t1 where t1.code = t.code and t1.category <> t.category)
group by category;
You can use two levels of aggregation:
select minc as category, count(*)
from (select code, min(category) as minc, max(category) as maxc
from MyTable
group by code
) as c
where minc = maxc
group by minc;
This would also work:
select category, count(*) from(
select a.category, b.count from mytable a join (
select code, count(category) as count
from mytable
group by code
having count(category) = 1
) b on b.code = a.code
) c group by category
Learning from @isaace's answer, I also came up with this:
SELECT MyTable.Category, COUNT(*)
FROM
MyTable INNER JOIN
(SELECT Code FROM MyTable GROUP BY Code HAVING COUNT(Category) = 1) a
ON MyTable.Code = a.Code
GROUP BY MyTable.Category
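To check these approaches against the desired result, here is a sketch that loads the sample MyTable from the question into SQLite (via Python's sqlite3 module; the question itself may target a different database) and runs the NOT EXISTS query and the two-level aggregation side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Category TEXT, Code TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("A", "A123"), ("A", "B123"), ("A", "C123"),
     ("B", "A123"), ("B", "B123"), ("B", "D123"),
     ("C", "A123"), ("C", "E123"), ("C", "F123")],
)

# Keep a row only if no other category uses the same code.
not_exists = conn.execute("""
    SELECT t.Category, COUNT(*)
    FROM MyTable t
    WHERE NOT EXISTS (SELECT 1 FROM MyTable t1
                      WHERE t1.Code = t.Code AND t1.Category <> t.Category)
    GROUP BY t.Category
    ORDER BY t.Category
""").fetchall()

# A code belongs to exactly one category when its min and max category agree.
two_level = conn.execute("""
    SELECT minc AS Category, COUNT(*)
    FROM (SELECT Code, MIN(Category) AS minc, MAX(Category) AS maxc
          FROM MyTable
          GROUP BY Code) c
    WHERE minc = maxc
    GROUP BY minc
    ORDER BY minc
""").fetchall()

print(not_exists)  # [('A', 1), ('B', 1), ('C', 2)]
print(two_level)   # [('A', 1), ('B', 1), ('C', 2)]
```

Note that neither query returns a row for a category that has no unique codes at all; with this sample data every category has at least one.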

SQL SELECT multiple SUM() error

I am having a problem involving multiple SUM() functions in a SQL SELECT statement using JOINs.
Whenever I sum together two values, it makes the value inside the other sum function double. How do I prevent this?
In my example (SQL Fiddle), all X and Y values should be 2.
I am using SQLite.
You can use UNION ALL for this:
SELECT id, SUM(bamount) AS BAmount, SUM(camount) AS CAmount
FROM
(
SELECT a.id, SUM(b.amount) AS bamount, 0 AS camount
FROM a
LEFT JOIN b ON a.id = b.a_id
GROUP BY a.id
UNION ALL
SELECT a.id, 0, SUM(c.amount) AS camount
FROM a
LEFT JOIN c ON a.id = c.a_id
GROUP BY a.id
) AS t
GROUP BY id;
This will give you:
| id | BAmount | CAmount |
|----|---------|---------|
| 1 | 2 | 2 |
| 2 | 2 | 2 |
| 3 | 2 | 2 |
You can try performing the aggregations in separate subqueries. This is one way to get around the problem of double (or triple, etc.) counting rows as the result of a join.
SELECT
a.id,
t1.b_sum AS x,
t2.c_sum AS y
FROM a
LEFT JOIN
(
SELECT a_id, SUM(amount) AS b_sum
FROM b
GROUP BY a_id
) t1
ON a.id = t1.a_id
LEFT JOIN
(
SELECT a_id, SUM(amount) AS c_sum
FROM c
GROUP BY a_id
) t2
ON a.id = t2.a_id;

In SQL Query a one-to-many relationship with condition

I have the following tables:
event_tbl
| event_id (PK) | event_date | event_location |
|---------------|------------|----------------|
| 1 | 01/01/2018 | Miami |
| 2 | 02/04/2018 | Tampa |
performer_tbl
| performer_id (PK) | event_id (FK) | genre |
|-------------------|---------------|-------|
| 1 | 1 | A |
| 2 | 1 | B |
| 3 | 2 | A |
| 4 | 2 | C |
I want to find events that have both genre A and genre B (should just return event 1), and I'm lost on writing the query. Maybe I just haven't had enough coffee, but all I can come up with is doing two derived columns with a case statement that count either genre and group by the event_id, then filtering both to >0. It just doesn't seem very elegant.
This should do the job (in MySQL, for other DBMS the syntax can be varied easily):
SELECT
e.event_id
FROM
event_tbl e
JOIN performer_tbl p USING(event_id)
GROUP BY e.event_id
HAVING SUM(IF(p.genre = 'A', 1, 0)) >= 1 AND SUM(IF(p.genre = 'B', 1, 0)) >= 1;
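IF() is MySQL-specific; the same HAVING filter can be written with CASE so that it runs on other engines. The sketch below assumes SQLite via Python's sqlite3 module and loads the sample rows from the question. It also shows a second, equivalent formulation using COUNT(DISTINCT genre), which is an addition for illustration and not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_tbl (event_id INTEGER PRIMARY KEY,
                            event_date TEXT, event_location TEXT);
    CREATE TABLE performer_tbl (performer_id INTEGER PRIMARY KEY,
                                event_id INTEGER, genre TEXT);
    INSERT INTO event_tbl VALUES (1, '01/01/2018', 'Miami'), (2, '02/04/2018', 'Tampa');
    INSERT INTO performer_tbl VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'A'), (4, 2, 'C');
""")

# Conditional sums: an event qualifies if it has at least one A and one B performer.
with_case = conn.execute("""
    SELECT e.event_id
    FROM event_tbl e
    JOIN performer_tbl p ON p.event_id = e.event_id
    GROUP BY e.event_id
    HAVING SUM(CASE WHEN p.genre = 'A' THEN 1 ELSE 0 END) >= 1
       AND SUM(CASE WHEN p.genre = 'B' THEN 1 ELSE 0 END) >= 1
""").fetchall()

# Alternative: keep only A/B rows and require both genres to be present.
with_count_distinct = conn.execute("""
    SELECT event_id
    FROM performer_tbl
    WHERE genre IN ('A', 'B')
    GROUP BY event_id
    HAVING COUNT(DISTINCT genre) = 2
""").fetchall()

print(with_case)            # [(1,)]
print(with_count_distinct)  # [(1,)]
```

Event 2 is excluded by both queries because it has genre A but not genre B.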
If you are using SQL Server, check the query below:
Select * From
event_tbl
where event_id
IN
(
select event_id
from performer_tbl as A
where exists (select 1
from performer_tbl as B
where B.event_id = A.event_id and B.genre = 'A')
and
exists (select 1
from performer_tbl as B
where B.event_id = A.event_id and B.genre = 'B')
)
This should work in any SQL database (at least in mysql, sql server, postgres or oracle)
select event_tbl.* FROM (
select event_id
from performer_tbl
where genre = 'A'
GROUP BY event_id) a_t
INNER JOIN (select event_id
from performer_tbl
where genre = 'B'
GROUP BY event_id) b_t
ON a_t.event_id = b_t.event_id
INNER JOIN event_tbl
ON event_tbl.event_id = a_t.event_id
This also works using left joins: (Since there are no function calls or sub-selects, it is fast. Also, it's usable in most SQL engines.)
SELECT DISTINCT
p1.event_id
,e.event_date
,e.event_location
FROM
performer_tbl as p1
inner join event_tbl as e on
p1.event_id = e.event_id
left outer join performer_tbl as p2 on
p1.event_id = p2.event_id
AND p2.genre = 'A'
left outer join performer_tbl as p3 on
p1.event_id = p3.event_id
AND p3.genre = 'B'
WHERE
p2.genre IS NOT NULL
AND p3.genre IS NOT NULL;
If I correctly understand what you need, you can try this:
Select *
from event_tbl e
where exists (select *
from performer_tbl p
where p.event_id = e.event_id
and p.genre = 'A')
and exists (select *
from performer_tbl p
where p.event_id = e.event_id
and p.genre = 'B')

Get count of related records in two joined tables

Firstly, I apologize for my English. I want to get auctions with counts of their bids and buys. It should look like this:
id | name | bids | buys
-----------------------
1 | Foo | 4 | 1
2 | Bar | 0 | 0
I have tables like following:
auction:
id | name
---------
1 | Foo
2 | Bar
auction_bid:
id | auction_id
---------------
1 | 1
2 | 1
3 | 1
4 | 1
auction_buy:
id | auction_id
---------------
1 | 1
I can get numbers in two queries:
SELECT *, COUNT(abid.id) AS `bids` FROM `auction` `t` LEFT JOIN auction_bid abid ON (t.id = abid.auction_id) GROUP BY t.id
SELECT *, COUNT(abuy.id) AS `buys` FROM `auction` `t` LEFT JOIN auction_buy abuy ON (t.id = abuy.auction_id) GROUP BY t.id
But when I combined them into one:
SELECT *, COUNT(abid.id) AS `bids`, COUNT(abuy.id) AS `buys` FROM `auction` `t` LEFT JOIN auction_bid abid ON (t.id = abid.auction_id) LEFT JOIN auction_buy abuy ON (t.id = abuy.auction_id) GROUP BY t.id
It returned wrong counts (as many bids as buys).
How to fix this and get counts in one query?
You'll need to count DISTINCT abuy and abid IDs to eliminate the duplicates;
SELECT t.id, t.name,
COUNT(DISTINCT abid.id) `bids`,
COUNT(DISTINCT abuy.id) `buys`
FROM `auction` `t`
LEFT JOIN auction_bid abid ON t.id = abid.auction_id
LEFT JOIN auction_buy abuy ON t.id = abuy.auction_id
GROUP BY t.id, t.name;
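The effect of DISTINCT can be verified on the sample data from the question. This is a sketch that assumes SQLite via Python's sqlite3 module rather than MySQL; the tables and rows are the ones listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auction (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE auction_bid (id INTEGER PRIMARY KEY, auction_id INTEGER);
    CREATE TABLE auction_buy (id INTEGER PRIMARY KEY, auction_id INTEGER);
    INSERT INTO auction VALUES (1, 'Foo'), (2, 'Bar');
    INSERT INTO auction_bid VALUES (1, 1), (2, 1), (3, 1), (4, 1);
    INSERT INTO auction_buy VALUES (1, 1);
""")

query = """
    SELECT t.id, t.name, COUNT({d} abid.id) AS bids, COUNT({d} abuy.id) AS buys
    FROM auction t
    LEFT JOIN auction_bid abid ON t.id = abid.auction_id
    LEFT JOIN auction_buy abuy ON t.id = abuy.auction_id
    GROUP BY t.id, t.name
    ORDER BY t.id
"""

# Without DISTINCT the single buy is repeated on each of the 4 joined bid rows.
naive = conn.execute(query.format(d="")).fetchall()
# With DISTINCT each bid id and buy id is counted once.
distinct = conn.execute(query.format(d="DISTINCT")).fetchall()

print(naive)     # [(1, 'Foo', 4, 4), (2, 'Bar', 0, 0)]
print(distinct)  # [(1, 'Foo', 4, 1), (2, 'Bar', 0, 0)]
```

COUNT(DISTINCT ...) works here because ids are unique per row; it would not repair a SUM over the same joins, where pre-aggregating in subqueries is the safer choice.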
Try this:
SELECT t.*,COUNT(abid.id) as bids,buys
FROM auction t LEFT JOIN
auction_bid abid ON t.id = abid.auction_id LEFT JOIN
(SELECT t.id, Count(abuy.id) as buys
FROM auction t LEFT JOIN
auction_buy abuy ON t.id = abuy.auction_id
GROUP BY t.id) Temp ON t.id=Temp.id
GROUP BY t.id
Result:
ID NAME BIDS BUYS
1 Foo 2 0
2 Bar 1 1