Group by + joins - sql

Hi, I am having problems using GROUP BY and joins across 3 tables.
I have a project table with various fields, including a projectcode field. I then have an invoice table and an hours table, and each can have multiple rows per project. Both of these tables also have the project code.
The two SUM values are not calculating correctly and I am really struggling to see where the issue is.
Here is the SQL I am using:
SELECT dbo.project.projectcode,
dbo.project.client,
dbo.project.project,
dbo.project.budget,
dbo.project.budget * 80 AS value,
SUM(dbo.harvest.hours) AS hourslogged,
SUM(dbo.salesforce.value) AS invoiced
FROM dbo.salesforce
RIGHT OUTER JOIN dbo.project
ON dbo.salesforce.projectcode = dbo.project.projectcode
LEFT OUTER JOIN dbo.harvest
ON dbo.project.projectcode = dbo.harvest.projectcode
GROUP BY dbo.project.projectcode,
dbo.salesforce.projectcode,
dbo.harvest.projectcode,
dbo.project.project,
dbo.project.client,
dbo.project.budget
Any help or tips on this would be much appreciated!

Whenever both joined tables, dbo.salesforce and dbo.harvest, have more than one match for the same projectcode, a mini-Cartesian product happens. Here's a simple illustration. Suppose there are tables A and B, like this:
Table A:
AID AVALUE
--- -------
1 ValueA1
2 ValueA2
Table B:
BID BVALUE AID
--- ------- ---
1 ValueB1 1
2 ValueB2 1
3 ValueB3 2
Now if we performed this join:
SELECT * FROM A JOIN B ON A.AID = B.AID
the result would be:
AID AVALUE BID BVALUE AID
--- ------- --- ------- ---
1 ValueA1 1 ValueB1 1
1 ValueA1 2 ValueB2 1
2 ValueA2 3 ValueB3 2
Enter table C:
CID CVALUE AID
--- ------- ---
1 ValueC1 1
2 ValueC2 1
3 ValueC3 1
And the join now is this:
SELECT * FROM A JOIN B ON A.AID = B.AID JOIN C ON A.AID = C.AID
What would be the result? Here:
AID AVALUE BID BVALUE AID CID CVALUE AID
--- ------- --- ------- --- --- ------- ---
1 ValueA1 1 ValueB1 1 1 ValueC1 1
1 ValueA1 1 ValueB1 1 2 ValueC2 1
1 ValueA1 1 ValueB1 1 3 ValueC3 1
1 ValueA1 2 ValueB2 1 1 ValueC1 1
1 ValueA1 2 ValueB2 1 2 ValueC2 1
1 ValueA1 2 ValueB2 1 3 ValueC3 1
As you can see, every match from B is repeated three times, because that is how many matches C has. And, similarly, every match from C is repeated twice, because that is how many matches there are in B. The 'luckiest', of course, is the row from A, because it is repeated 2 × 3 = 6 times. That is a Cartesian join for you. And that's just what happens in your case too: each hours row is counted once per invoice row and vice versa, which inflates both SUMs.
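If you want to check the row multiplication yourself, here is a minimal sketch using Python's built-in sqlite3 module and the A, B and C tables from above. The in-memory database and the lowercase table names are only assumptions for the illustration; the same join should behave the same way in your DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (aid INTEGER, avalue TEXT);
    CREATE TABLE b (bid INTEGER, bvalue TEXT, aid INTEGER);
    CREATE TABLE c (cid INTEGER, cvalue TEXT, aid INTEGER);
    INSERT INTO a VALUES (1, 'ValueA1'), (2, 'ValueA2');
    INSERT INTO b VALUES (1, 'ValueB1', 1), (2, 'ValueB2', 1), (3, 'ValueB3', 2);
    INSERT INTO c VALUES (1, 'ValueC1', 1), (2, 'ValueC2', 1), (3, 'ValueC3', 1);
""")

# Same three-table join as in the illustration above.
rows = conn.execute("""
    SELECT a.aid, b.bvalue, c.cvalue
    FROM a
    JOIN b ON a.aid = b.aid
    JOIN c ON a.aid = c.aid
    ORDER BY b.bid, c.cid
""").fetchall()

for row in rows:
    print(row)
# AID 1 has 2 matches in B and 3 in C, so it comes back 2 x 3 times.
print(len(rows))
```

AID 2 drops out of this result only because it has no match in C and the joins are inner joins.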
Not sure whether it is considered typical, but in such cases I would often group each table separately by the joining expression(s), then join the result sets. Your query would then look like this:
SELECT
p.projectcode,
p.client,
p.project,
p.budget,
p.budget * 80 AS value,
h.hourslogged,
s.invoiced
FROM dbo.project p
LEFT JOIN (
SELECT
projectcode,
SUM(dbo.salesforce.value) AS invoiced
FROM dbo.salesforce
GROUP BY projectcode
) s ON p.projectcode = s.projectcode
LEFT JOIN (
SELECT
projectcode,
SUM(dbo.harvest.hours) AS hourslogged
FROM dbo.harvest
GROUP BY projectcode
) h ON p.projectcode = h.projectcode
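To compare the two approaches side by side, here is a small sketch in Python with sqlite3. The sample rows are invented for the illustration, the tables carry only the columns needed, and the dbo. prefix is dropped, so treat it as a demonstration of the technique rather than your actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (projectcode TEXT, budget REAL);
    CREATE TABLE salesforce (projectcode TEXT, value REAL);
    CREATE TABLE harvest (projectcode TEXT, hours REAL);
    -- Made-up data: P1 has 2 invoices and 3 hour entries, P2 has none.
    INSERT INTO project VALUES ('P1', 100), ('P2', 50);
    INSERT INTO salesforce VALUES ('P1', 1000), ('P1', 500);
    INSERT INTO harvest VALUES ('P1', 10), ('P1', 20), ('P1', 30);
""")

# Joining both detail tables first and aggregating afterwards:
# every hours row is repeated per invoice row and vice versa.
naive = conn.execute("""
    SELECT p.projectcode, SUM(h.hours), SUM(s.value)
    FROM project p
    LEFT JOIN salesforce s ON s.projectcode = p.projectcode
    LEFT JOIN harvest h ON h.projectcode = p.projectcode
    GROUP BY p.projectcode
    ORDER BY p.projectcode
""").fetchall()

# Aggregating each detail table on its own, then joining the totals.
fixed = conn.execute("""
    SELECT p.projectcode, h.hourslogged, s.invoiced
    FROM project p
    LEFT JOIN (SELECT projectcode, SUM(value) AS invoiced
               FROM salesforce GROUP BY projectcode) s
           ON s.projectcode = p.projectcode
    LEFT JOIN (SELECT projectcode, SUM(hours) AS hourslogged
               FROM harvest GROUP BY projectcode) h
           ON h.projectcode = p.projectcode
    ORDER BY p.projectcode
""").fetchall()

print(naive)  # P1 hours are doubled, P1 invoices are tripled
print(fixed)  # P1 shows the true totals, P2 shows NULLs
```

With this data the naive query reports 120 hours and 4500 invoiced for P1, while the pre-aggregated version reports 60 and 1500.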

I'd suggest avoiding a mix of right and left outer joins.
Your central table is Project, so use it first.
SELECT dbo.project.projectcode,
dbo.project.client,
dbo.project.project,
dbo.project.budget,
dbo.project.budget * 80 AS value,
SUM(dbo.harvest.hours) AS hourslogged,
SUM(dbo.salesforce.value) AS invoiced
FROM dbo.project
LEFT OUTER JOIN dbo.salesforce
ON dbo.salesforce.projectcode = dbo.project.projectcode
LEFT OUTER JOIN dbo.harvest
ON dbo.project.projectcode = dbo.harvest.projectcode
GROUP BY dbo.project.projectcode,
dbo.project.project,
dbo.project.client,
dbo.project.budget
But the error comes from the GROUP BY. You should not group by columns from the two tables you are aggregating, or your aggregates will not be correct!

Related

combine three tables in oracle sql

I'm having these three tables
Table A:
code aname
----------- ----------
1 A
2 B
3 C
Table B:
code bname
----------- ----------
1 aaa
1 bbb
2 ccc
2 ddd
Table C
code cname
----------- ----------
1 xxx
1 yyy
1 zzz
2 www
How can I achieve the output like this using single query ?
code aname bname cname
----------- ---------- ---------- ----------
1 A aaa xxx
1 A bbb yyy
1 A NULL zzz
2 B ccc www
2 B ddd NULL
3 C NULL NULL
Any suggestions ?
Thanks
It looks like you want "lists" to be vertical for tables B and C. This is doable, by using row_number(). However, the trick is getting the third row in where there are no matches.
Here is one method. It uses a full outer join to combine the b and c names together. It then uses left join to bring in the a records.
select a.code, a.aname, bc.bname, bc.cname
from a left join
(select coalesce(b.code, c.code) as code, bname, cname
from (select code, bname, NULL as cname,
row_number() over (partition by code order by code) as seqnum
from b
) b full outer join
(select code, NULL as bname, cname,
row_number() over (partition by code order by code) as seqnum
from c
) c
on b.code = c.code and b.seqnum = c.seqnum
) bc
on bc.code = a.code;
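The pairing logic may be easier to follow outside the database. The sketch below mirrors it in plain Python: itertools.zip_longest plays the role of the full outer join on (code, seqnum), and the fallback row plays the role of the left join from a. The dictionaries are just a hand-written copy of the three tables, not anything Oracle produces.

```python
from itertools import zip_longest

# Hand-written copies of tables A, B and C, keyed by code.
a = {1: "A", 2: "B", 3: "C"}
b = {1: ["aaa", "bbb"], 2: ["ccc", "ddd"]}
c = {1: ["xxx", "yyy", "zzz"], 2: ["www"]}

result = []
for code, aname in sorted(a.items()):
    # zip_longest pairs the n-th bname with the n-th cname and pads the
    # shorter list with None, like the full outer join on (code, seqnum).
    pairs = list(zip_longest(b.get(code, []), c.get(code, [])))
    # A code with no rows in B or C still yields one row, like the left join.
    for bname, cname in pairs or [(None, None)]:
        result.append((code, aname, bname, cname))

for row in result:
    print(row)
```

This prints six rows in the same shape as the requested output, with None where the question shows NULL.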

SQL Join w/ some Math thrown in

Heck, maybe 'joining' isn't even involved. I'm way out of my sql league here. Could someone please help me out w/ the following:
Table A
ItemId ItemLookup Price
------- ---------- -----
1 123456 10.00
2 234567 7.00
3 345678 6.00
Table B
ItemId Location Qty QtyOnHold
------- ---------- ----- ---------
1 1 26 20
2 1 0 0
3 1 12 6
1 2 4 0
2 2 2 1
3 2 16 8
What I'm hoping to get is something that looks like
ItemLookup, Price, (qty minus qtyonhold for loc1), (qty minus qtyonhold for loc2)
or 123456, 10.00, 6, 4
Thank you very much for any direction you can provide.
You can use conditional aggregation and a join:
select a.ItemLookup, a.Price,
sum(case when Location = 1 then Qty - QtyOnHold end) as Location1,
sum(case when Location = 2 then Qty - QtyOnHold end) as Location2
from tableb b join
tablea a
on b.ItemId = a.ItemId
group by a.ItemLookup, a.Price;
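Here is a runnable sketch of conditional aggregation against the sample data from the question, using Python's sqlite3 module. The table names tablea and tableb are assumed (the question only calls them Table A and Table B), and Price is included in the select list because the question asks for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (ItemId INTEGER, ItemLookup TEXT, Price REAL);
    CREATE TABLE tableb (ItemId INTEGER, Location INTEGER,
                         Qty INTEGER, QtyOnHold INTEGER);
    INSERT INTO tablea VALUES
        (1, '123456', 10.00), (2, '234567', 7.00), (3, '345678', 6.00);
    INSERT INTO tableb VALUES
        (1, 1, 26, 20), (2, 1, 0, 0), (3, 1, 12, 6),
        (1, 2, 4, 0),   (2, 2, 2, 1), (3, 2, 16, 8);
""")

# Each CASE keeps only the rows of one location; SUM skips the NULLs
# produced for the other location.
rows = conn.execute("""
    SELECT a.ItemLookup, a.Price,
           SUM(CASE WHEN b.Location = 1 THEN b.Qty - b.QtyOnHold END) AS Location1,
           SUM(CASE WHEN b.Location = 2 THEN b.Qty - b.QtyOnHold END) AS Location2
    FROM tablea a
    JOIN tableb b ON b.ItemId = a.ItemId
    GROUP BY a.ItemLookup, a.Price
    ORDER BY a.ItemLookup
""").fetchall()

for row in rows:
    print(row)
```

The first row comes back as 123456, 10.0, 6, 4, which matches the example in the question.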
Something like this:
select tablea.* ,
(select (qty - QtyOnHold) as qty from tableb where ItemId = tablea.ItemId and Location = 1) as qtyl1,
(select (qty - QtyOnHold) as qty from tableb where ItemId = tablea.ItemId and Location = 2) as qtyl2
from tablea
This assumes that there's only one row in TableB for each ItemID + Location combination. This is basically just a "pivot", and there are various ways to do this in MySQL.
SELECT ItemLookup, Price,
MAX(IF(Location = 1, Qty-QtyOnHold, 0)) avail1,
MAX(IF(Location = 2, Qty-QtyOnHold, 0)) avail2
FROM TableA AS a
JOIN TableB AS b ON a.ItemId = b.ItemId
GROUP BY a.ItemId
It seems to me that it may be possible to have a variable number of locations for each item. If this is the case, you need an aggregate function to convert/concatenate multiple rows into a column.
Here's an example with MySQL's group_concat function:
select a.itemlookup,a.price,group_concat('loc ',location,'=',b.x order by location) as qty_minus_qtyonhold
from tablea a,(select itemid,location,qty-QtyOnHold x from tableb
group by itemid,location) as b
where a.itemid = b.itemid
group by 1
You'll get a result like this:
itemlookup price qty_minus_qtyonhold
---------- ------ ------------------
123456 10.00 loc 1=6,loc 2=4
234567 7.00 loc 1=0,loc 2=1
345678 6.00 loc 1=6,loc 2=8
Not sure what DBMS you're using, but there are similar alternatives for Oracle and SQL Server.

query regarding joining of two tables

Suppose I have the two tables below:
sql> select * from fraud_types ;
fraud_id fraud_name
-------- ----------
1 Fraud 1
2 Fraud 2
3 Fraud 3
4 Fraud 4
5 Fraud 5
sql> select * from alarms ;
fraud_id dealer count
-------- ------ -----
1 Deal 1 5
3 Deal 1 3
5 Deal 1 4
1 Deal 2 2
2 Deal 2 6
3 Deal 2 1
4 Deal 2 7
5 Deal 2 9
I want to join the two tables and get the output as
dealer fraud_id count
------ -------- -----
Deal 1 1 5
Deal 1 2 0
Deal 1 3 3
Deal 1 4 0
Deal 1 5 4
Deal 2 1 2
Deal 2 2 6
Deal 2 3 1
Deal 2 4 7
Deal 2 5 9
Basically I want to include the fields from fraud_types also and just display 0 in the output if it is not present in the alarms table. How can I achieve this ? Please help
Regards
You can do this with a cross join to get all combinations and then a left outer join:
select d.dealer, f.fraud_id, coalesce(df.cnt, 0) as cnt
from (select distinct dealer from alarms) d cross join
fraud_types f left outer join
(select dealer, fraud_id, sum(count) as cnt
from alarms
group by dealer, fraud_id
) df
on df.dealer = d.dealer and df.fraud_id = f.fraud_id
order by d.dealer, f.fraud_id;
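As a quick check of the cross join plus left join idea, here is a sketch in Python with sqlite3 using the data from the question. It assumes the dealer list and the counts both come from the alarms table (fraud_types has no dealer column) and that alarms has at most one row per dealer and fraud_id, so no aggregation is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fraud_types (fraud_id INTEGER, fraud_name TEXT);
    CREATE TABLE alarms (fraud_id INTEGER, dealer TEXT, "count" INTEGER);
    INSERT INTO fraud_types VALUES
        (1, 'Fraud 1'), (2, 'Fraud 2'), (3, 'Fraud 3'),
        (4, 'Fraud 4'), (5, 'Fraud 5');
    INSERT INTO alarms VALUES
        (1, 'Deal 1', 5), (3, 'Deal 1', 3), (5, 'Deal 1', 4),
        (1, 'Deal 2', 2), (2, 'Deal 2', 6), (3, 'Deal 2', 1),
        (4, 'Deal 2', 7), (5, 'Deal 2', 9);
""")

# The cross join builds every dealer x fraud_id combination; the left join
# then attaches the alarm row where one exists, and COALESCE fills in 0.
rows = conn.execute("""
    SELECT d.dealer, f.fraud_id, COALESCE(a."count", 0) AS cnt
    FROM (SELECT DISTINCT dealer FROM alarms) d
    CROSS JOIN fraud_types f
    LEFT JOIN alarms a
           ON a.dealer = d.dealer AND a.fraud_id = f.fraud_id
    ORDER BY d.dealer, f.fraud_id
""").fetchall()

for row in rows:
    print(row)
```

This returns ten rows, with a count of 0 for Deal 1 on fraud_id 2 and 4.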
Partitioned outer join is very useful for cases like this:
select a.dealer, f.fraud_id, nvl(a.count,0) count
from fraud_types f
left outer join alarms a
partition by (a.dealer)
on a.fraud_id = f.fraud_id
order by a.dealer, f.fraud_id
This does an outer join between alarms and fraud_types for every value of dealer found in alarms.
--
If the alarms table does not have (fraud,dealer) as key, then you can do a group by before the partition outer join:
select a.dealer, f.fraud_id, nvl(a.count,0) count
from fraud_types f
left outer join (
select fraud_id
, dealer
, sum(count) count
from alarms
group by fraud_id, dealer
) a
partition by (a.dealer)
on a.fraud_id = f.fraud_id
order by a.dealer, f.fraud_id
select distinct f.fraud_id,dealer,
(case when f.fraud_id=t.fraud_id then COUNT else 0 end) counts
from
fraud_types f
left join
alarms t
partition by (dealer)
on f.fraud_id=t.fraud_id
order by dealer

SQL query to select mapping table values?

I have a table named Patient in which I have columns like
ID Disease1 Disease2 Disease3
----------
1 4 3 2
----------
2 2 5
----------
3 6
----------
4 1
These are mapping values which I got from table Disease, in which disease names are placed like
1 hypertension
2 niddm
3 allergy
4 cough
5 floo
6 vv
etc
Now I want sql query to select
ID Disease1 Disease2 Disease3
----------
1 cough allergy niddm
----------
2 niddm floo
----------
3 vv
----------
4 hypertension
Please keep in mind that this table is mapped to four or five other tables, and I want the original values in place of the IDs from all of them.
You need to join table Disease three times to table Patient, since there are three columns in Patient that depend on Disease:
SELECT a.ID,
b.Disease AS Disease1,
c.Disease AS Disease2,
d.Disease AS Disease3
FROM Patient a
LEFT JOIN Disease b
ON a.Disease1 = b.ID
LEFT JOIN Disease c
ON a.Disease2 = c.ID
LEFT JOIN Disease d
ON a.Disease3 = d.ID
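Here is a small runnable sketch of the triple join using Python's sqlite3 module. The column names of the Disease table (ID, Disease) are an assumption carried over from the query above, and the placement of the empty cells for patients 2 to 4 is guessed from the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Disease (ID INTEGER PRIMARY KEY, Disease TEXT);
    CREATE TABLE Patient (ID INTEGER PRIMARY KEY,
                          Disease1 INTEGER, Disease2 INTEGER, Disease3 INTEGER);
    INSERT INTO Disease VALUES
        (1, 'hypertension'), (2, 'niddm'), (3, 'allergy'),
        (4, 'cough'), (5, 'floo'), (6, 'vv');
    INSERT INTO Patient VALUES
        (1, 4, 3, 2), (2, 2, 5, NULL), (3, 6, NULL, NULL), (4, 1, NULL, NULL);
""")

# One alias of Disease per lookup column; LEFT JOIN keeps patients
# whose Disease2 or Disease3 is empty.
rows = conn.execute("""
    SELECT a.ID,
           b.Disease AS Disease1,
           c.Disease AS Disease2,
           d.Disease AS Disease3
    FROM Patient a
    LEFT JOIN Disease b ON a.Disease1 = b.ID
    LEFT JOIN Disease c ON a.Disease2 = c.ID
    LEFT JOIN Disease d ON a.Disease3 = d.ID
    ORDER BY a.ID
""").fetchall()

for row in rows:
    print(row)
```

The same pattern extends to the other mapping tables: add one aliased LEFT JOIN per ID column you want to translate.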
To learn more about joins, see the link below:
Visual Representation of SQL Joins

Using multiple joins (e.g left join)

I would like to know the logic behind multiple joins (example below).
SELECT * FROM B returns 100 rows
SELECT B.* FROM B LEFT JOIN C ON B.ID = C.ID returns 120 rows
As far as I know, a left join returns all rows from the left table, which is B, plus any matching data from the other table. But how come the left join returns more rows than table B itself has?
What am I doing wrong or misunderstanding here? Any guidance is much appreciated. Thanks in advance.
Let table B be:
id
----
1
2
3
Let table C be:
id name
------------
1 John
2 Mary
2 Anne
3 Stef
Every id from B is matched against the ids in C, so id=2 is matched twice. A left join on id will therefore return 4 rows even though the base table B has only 3 rows.
Now look at a more evil example:
Table B
id
----
1
2
2
3
4
table C
id name
------------
1 John
2 Mary
2 Anne
3 Stef
Every id from B is matched against the ids in C, so the first id=2 is matched twice and the second id=2 is matched twice as well. The result of
select b.id, c.name
from b left join c on (b.id = c.id)
will be
id name
------------
1 John
2 Mary
2 Mary
2 Anne
2 Anne
3 Stef
4 (null)
The id=4 row is not matched, but it appears in the result because this is a left join.
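You can reproduce the "evil" example with a few lines of Python and sqlite3. The sketch also runs a duplicate check on C, which is one way (a suggestion, not the only one) to find out which ids are responsible for the extra rows in your own data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER, name TEXT);
    INSERT INTO b VALUES (1), (2), (2), (3), (4);
    INSERT INTO c VALUES (1, 'John'), (2, 'Mary'), (2, 'Anne'), (3, 'Stef');
""")

b_count = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]

joined = conn.execute("""
    SELECT b.id, c.name
    FROM b LEFT JOIN c ON b.id = c.id
    ORDER BY b.id, c.name
""").fetchall()

# Ids that occur more than once in C multiply the matching rows of B.
duplicates = conn.execute("""
    SELECT id, COUNT(*) FROM c GROUP BY id HAVING COUNT(*) > 1
""").fetchall()

print(b_count)      # rows in B
print(len(joined))  # rows after the left join
print(duplicates)   # ids in C that match more than once
```

B has 5 rows but the join returns 7, and the duplicate check points at id 2.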
Look at the following example :
B = {1,2}
C = {(1,a),(1,b),(1,c),(1,d),(1,e)}
The result of B left join C will be :
1 | a
1 | b
1 | c
1 | d
1 | e
2 | null
The number of rows in the result is definitely larger than rows in B (2).
In general, the number of rows in the result of B left join C is at least B.size and can be as large as B.size × C.size (when every row of B matches every row of C), so it is not limited to B.size as you think.
In your query, B is the left table, so the join returns every row of B together with the related rows from C, and a row of B appears once for each matching row in C.