SQL Table A Left Join Table B And top of table B - sql

I'm working myself into an SQL frenzy; hopefully someone out there can help!
I've got two tables, Records and Outcomes. I want to join them and count the number of outcomes per record (0 or more), which I've got quite easily with:
Select records.Id, (IsNull(Count(outcomes.Id),0)) as outcomes
from records
Left Join
outcomes
on records.Id = outcomes.Id
group by
records.Id
The outcomes table also has a timestamp in it. What I want to do is include the last outcome in my result set, but if I add that to my query it generates a row for every combination of record and outcome.
Can any SQL expert point me in the right direction?
Cheers,

try:
SELECT
dt.Id, dt.outcomes,MAX(o.YourTimestampColumn) AS LastOne
FROM (SELECT --basically your original query, just indented differently
records.Id, (ISNULL(COUNT(outcomes.Id),0)) AS outcomes
from records
LEFT JOIN outcomes ON records.Id = outcomes.Id
GROUP BY records.Id
) dt
LEFT JOIN outcomes o ON dt.Id = o.Id --LEFT JOIN keeps records that have no outcomes (LastOne is NULL for them)
GROUP BY dt.Id, dt.outcomes
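
The count and the latest timestamp can also be computed in a single grouped LEFT JOIN, without the derived table. Below is a minimal sketch using Python's built-in sqlite3 module and made-up data. The column name created_at is an assumption standing in for the real timestamp column, and outcomes.Id is used as the join key only because the question does so.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records  (Id INTEGER PRIMARY KEY);
    -- As in the question, outcomes.Id holds the Id of the parent record.
    -- created_at is a hypothetical name for the timestamp column.
    CREATE TABLE outcomes (Id INTEGER, created_at TEXT);
    INSERT INTO records VALUES (1), (2), (3);
    INSERT INTO outcomes VALUES
        (1, '2020-01-01'), (1, '2020-03-01'), (2, '2020-02-01');
""")

rows = conn.execute("""
    SELECT r.Id,
           COUNT(o.Id)       AS outcomes,     -- counts matched rows only, so 0 when none
           MAX(o.created_at) AS last_outcome  -- NULL when the record has no outcomes
    FROM records r
    LEFT JOIN outcomes o ON o.Id = r.Id
    GROUP BY r.Id
    ORDER BY r.Id
""").fetchall()

for row in rows:
    print(row)
```

Because COUNT(o.Id) ignores the NULLs produced by unmatched rows, no ISNULL wrapper is needed around it.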

Related

sql multiple left joins with sum

I have 3 tables as below. What I need to do is create a summary after left joining the 1st table to the 2nd and the 2nd to the 3rd.
The code I'm using ends up producing a cartesian join. My query to create the 1st table (person) is complicated and resource intensive, while the volume of data in table 2 (shopping list) is massive, so a nested query is not ideal. Below is the code I'm using right now, along with the expected output (image 1) and what I get (image 2).
select
a.ID,
a.Name,
sum(b.cost) total_cost,
sum(c.discount_amount) total_discount
from
person a
left join shopping_list b on a.id=b.id
left join discount c on b.item = c.item
group by
a.ID,
a.Name
I've looked at the links below, but I was hoping there's a solution that may work better given the size of my dataset:
https://dba.stackexchange.com/questions/217220/how-i-use-multiple-sum-with-multiple-left-joins
Multiple Left Join with sum
Thanks in advance for your help
You have multiple rows for the discounts, so presummarize those:
select p.id, p.name, coalesce(sum(sl.cost), 0) as cost,
coalesce(sum(d.discount_amount), 0) as discount_amount
from person p left join
shopping_list sl
on sl.id = p.id left join
(select d.item, sum(d.discount_amount) as discount_amount
from discount d
group by d.item
) d
on sl.item = d.item
group by p.id, p.name;
The problem with your query is that the multiple rows of discount end up multiplying the rows of shopping_list -- resulting in the inaccurate totals.
Notice that in this query, the table aliases are abbreviations for the table names. This is a best practice that makes it much, much easier to follow the logic of a query.
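
To see the row multiplication concretely, here is a small sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person        (id INTEGER, name TEXT);
    CREATE TABLE shopping_list (id INTEGER, item TEXT, cost INTEGER);
    CREATE TABLE discount      (item TEXT, discount_amount INTEGER);
    INSERT INTO person VALUES (1, 'Ann');
    INSERT INTO shopping_list VALUES (1, 'apple', 10), (1, 'pear', 5);
    -- Two discount rows for the same item: this is what causes the fan-out.
    INSERT INTO discount VALUES ('apple', 1), ('apple', 2);
""")

# Joining discount directly repeats the 'apple' shopping_list row twice.
naive = conn.execute("""
    SELECT p.id, p.name, SUM(sl.cost), SUM(d.discount_amount)
    FROM person p
    LEFT JOIN shopping_list sl ON sl.id = p.id
    LEFT JOIN discount d ON d.item = sl.item
    GROUP BY p.id, p.name
""").fetchall()

# Collapsing discount to one row per item first avoids the repetition.
presummarized = conn.execute("""
    SELECT p.id, p.name, SUM(sl.cost), SUM(d.discount_amount)
    FROM person p
    LEFT JOIN shopping_list sl ON sl.id = p.id
    LEFT JOIN (SELECT item, SUM(discount_amount) AS discount_amount
               FROM discount
               GROUP BY item) d ON d.item = sl.item
    GROUP BY p.id, p.name
""").fetchall()

print(naive)          # the apple cost is counted twice
print(presummarized)  # each shopping_list row contributes once
```

The discount total is the same in both queries; it is the cost total that is inflated, because each extra discount row repeats the matching shopping_list row.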

Inner Join Producing cartesian product

Looking at the 2 queries below, I assumed they would return the same result set, but they're way off. Why is the second query, with the inner join, producing so many records? What am I doing wrong? I've been staring at this a little too long and need a fresh pair of eyes to look at it.
SELECT COUNT(*)
FROM ZCQ Z
WHERE Z.QUOTE_CUSTOMER_ID IN (SELECT CUSTOMER_ID FROM CUST_ORDER)
-- returned 6,646 RECS
SELECT COUNT(*)
FROM ZCQ Z
INNER JOIN CUST_ORDER CO ON zquote_customer_id = co.customer_id
-- returned 4,232,473 RECS
Please note these are Oracle 10g tables but have no FK or PK setup by the DBA.
No, these will not generally return the same result.
The first counts the number of rows in ZCQ that match a customer in CUST_ORDER.
The second counts the total number of rows that match. If there are duplicate customers in CUST_ORDER, then all duplicates will be counted.
You could get the same result using:
SELECT COUNT(DISTINCT z.zquote_customer_id)
FROM ZCQ Z JOIN
CUST_ORDER CO
ON zquote_customer_id = co.customer_id;
But IN or EXISTS is probably more efficient than removing the duplicates after doing the match.
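
The difference is easy to reproduce on a few rows. The sketch below uses Python's built-in sqlite3 module with invented data; the lowercase table names and the column name quote_customer_id are assumptions modelled on the question, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zcq        (quote_customer_id INTEGER);
    CREATE TABLE cust_order (customer_id INTEGER);
    INSERT INTO zcq VALUES (1), (1), (2), (3);
    -- Customer 1 has three orders, customer 2 has one, customer 3 has none.
    INSERT INTO cust_order VALUES (1), (1), (1), (2);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

in_count = scalar("""
    SELECT COUNT(*) FROM zcq z
    WHERE z.quote_customer_id IN (SELECT customer_id FROM cust_order)
""")
exists_count = scalar("""
    SELECT COUNT(*) FROM zcq z
    WHERE EXISTS (SELECT 1 FROM cust_order co
                  WHERE co.customer_id = z.quote_customer_id)
""")
join_count = scalar("""
    SELECT COUNT(*) FROM zcq z
    JOIN cust_order co ON z.quote_customer_id = co.customer_id
""")
distinct_count = scalar("""
    SELECT COUNT(DISTINCT z.quote_customer_id) FROM zcq z
    JOIN cust_order co ON z.quote_customer_id = co.customer_id
""")

print(in_count, exists_count, join_count, distinct_count)
```

Note that COUNT(DISTINCT ...) counts distinct customers rather than ZCQ rows, so it agrees with the IN version only when each customer appears at most once in ZCQ. In this sample data customer 1 appears twice, so the two numbers differ.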

Number of rows reduced after join

I have a query that gives me 759 rows:
Select
buildingID
,buildingAddress
,building_zip
From
BuildingTable
However, when I join the table to get a column from another table, the number of rows is reduced to 707
Select
buildingID
,buildingaddress
,building_zip
,b.surveyCost
From
BuildingTable as A
Inner Join SurveyTable as B
On a.buildingAddress = b.address
What is the best way to test which rows I lost and why, and how do I prevent this from happening? I was thinking that maybe some of the buildings don't have survey costs, so the join only shows the ones that do, but I see some null values there, so I don't think that is the issue.
If you need extra information let me know. Thanks
To find the rows you have lost, just replace the inner join with a left join and look for missing rows:
Select bt.*
From BuildingTable bt left join
SurveyTable st
on bt.buildingAddress = st.address
where st.address is null;
Note that rows can also be duplicated in both tables, so you might have more missing rows than you expect.
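
A small sketch of this check, using Python's built-in sqlite3 module and invented rows (only the table and column names come from the question), also shows how duplicates can hide the true number of lost rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BuildingTable (buildingID INTEGER, buildingAddress TEXT);
    CREATE TABLE SurveyTable   (address TEXT, surveyCost INTEGER);
    INSERT INTO BuildingTable VALUES
        (1, '1 Main St'), (2, '2 Oak Ave'), (3, '3 Elm Rd');
    -- Building 1 has two surveys, building 2 has none,
    -- and building 3's address is spelled differently.
    INSERT INTO SurveyTable VALUES
        ('1 Main St', 100), ('1 Main St', 120), ('3 Elm Road', 50);
""")

inner_count = conn.execute("""
    SELECT COUNT(*)
    FROM BuildingTable bt
    INNER JOIN SurveyTable st ON bt.buildingAddress = st.address
""").fetchone()[0]

# Anti-join: keep only the buildings with no matching survey row.
lost = conn.execute("""
    SELECT bt.buildingID, bt.buildingAddress
    FROM BuildingTable bt
    LEFT JOIN SurveyTable st ON bt.buildingAddress = st.address
    WHERE st.address IS NULL
    ORDER BY bt.buildingID
""").fetchall()

print(inner_count)  # rows returned by the inner join
print(lost)         # buildings that the inner join dropped
```

Here the inner join returns 2 rows from 3 buildings, which looks like one lost row, yet two buildings are actually missing because building 1 is matched twice.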
You can find which rows are lost using a left join between the original query and the resulting query, keeping the rows where the joined side is null:
Select
t1.buildingID
,t1.buildingAddress
,t1.building_zip
From
BuildingTable t1
left join (
Select
buildingID
,buildingaddress
,building_zip
,b.surveyCost
From
BuildingTable as A
Inner Join SurveyTable as B
On a.buildingAddress = b.address
) t2 on t1.buildingID = t2.buildingID
where t2.buildingID is null
As for why they are missing, that depends on your knowledge of the data.

Group on ID, and sum from different table (3 table total)

I have three tables:
Name - columns
Custpackingslipjour (table 1) - (deliverydate, dateclosed, salesid)
Inventrans (table 2) - (itemid, datephysical, transrefid)
Salesline (table 3) - (lineamount, itemid, salesid)
What I need is a single result set combining all three tables, without redundant information such as repeated salesids, with a total sum of salesline.lineamount, and where I can eliminate results based on deliverydate and dateclosed, along with datephysical. Originally I had this:
SELECT TOP (10) dbo.AX_CUSTPACKINGSLIPJOUR.DELIVERYDATE,
dbo.AX_SALESLINE.LINEAMOUNT, dbo.AX_INVENTTRANS.DATEPHYSICAL,
dbo.AX_INVENTTRANS.COSTAMOUNTPOSTED, dbo.AX_INVENTTRANS.ITEMID,
dbo.AX_INVENTTRANS.TRANSREFID, dbo.AX_CUSTPACKINGSLIPJOUR.SALESID
FROM dbo.AX_CUSTPACKINGSLIPJOUR
INNER JOIN
dbo.AX_SALESLINE ON dbo.AX_CUSTPACKINGSLIPJOUR.SALESID = dbo.AX_SALESLINE.SALESID
INNER JOIN
dbo.AX_INVENTTRANS ON dbo.AX_SALESLINE.ITEMID = dbo.AX_INVENTTRANS.ITEMID
But it produces a table with many of the same lines, because of the inherent one-to-many relationship between salesid and itemid.
So I thought I would sum the value of the items associated with each salesid, as that is what I am after.
Based on a few other posts on here, I tried summing lineamount from salesline grouped by salesid, but after 5 minutes of running it did not appear to be a good solution. I think I have the right idea, but I am doing it wrong.
select top 10 s.SALESID,
SUM(sp.LINEAMOUNT) as 'TotalLineamount'
from AX_CUSTPACKINGSLIPJOUR as s
right outer join AX_SALESLINE as sp on s.SALESID=sp.SALESID
left outer join AX_INVENTTRANS as p on sp.ITEMID=p.ITEMID
where s.SALESID is not null
group by s.SALESID,sp.LINEAMOUNT
Any help is greatly appreciated and I will follow up any questions or comments if what I am trying to do is unclear.
Edit: Following @coltech's advice, I tried distinct on the salesid by inserting distinct(salesid) after the top 10. It was tested but did not work.
Edit: Following Gavin's advice, I changed to
select top 10 cp.SALESID,
SUM(sl.LINEAMOUNT) as 'TotalLineamount'
from AX_CUSTPACKINGSLIPJOUR as cp
right outer join AX_SALESLINE as sl on cp.SALESID=sl.SALESID
left outer join AX_INVENTTRANS as it on sl.ITEMID=it.ITEMID
where cp.SALESID is not null
group by cp.SALESID
I also changed the aliases to better match the table names. This works. However, if I want to include more columns, for instance
sl.LINEAMOUNT, it.DATEPHYSICAL, it.COSTAMOUNTPOSTED, it.ITEMID, it.TRANSREFID and cp.SALESID
then I have to include them in the group by as well, like this:
select top 10 cp.SALESID,
SUM(sl.LINEAMOUNT) as 'TotalLineamount',
cp.DELIVERYDATE,
sl.LINEAMOUNT, it.DATEPHYSICAL,
it.COSTAMOUNTPOSTED, it.ITEMID,
it.TRANSREFID, cp.SALESID
from AX_CUSTPACKINGSLIPJOUR as cp
right outer join AX_SALESLINE as sl on cp.SALESID=sl.SALESID
left outer join AX_INVENTTRANS as it on sl.ITEMID=it.ITEMID
where cp.SALESID is not null
group by cp.SALESID, cp.DELIVERYDATE,sl.LINEAMOUNT, it.DATEPHYSICAL, it.COSTAMOUNTPOSTED, it.ITEMID, it.transrefid
However, this appears to cause the query to run for a very long time, been running for close to 25 mins. Is there a way to speed this up? Or am I just tackling this incorrectly?
I think you're fine, except that you shouldn't group by the column you are summing (LINEAMOUNT), so just do this instead:
select top 10 s.SALESID,
SUM(sp.LINEAMOUNT) as 'TotalLineamount'
from AX_CUSTPACKINGSLIPJOUR as s
right outer join AX_SALESLINE as sp on s.SALESID=sp.SALESID
left outer join AX_INVENTTRANS as p on sp.ITEMID=p.ITEMID
where s.SALESID is not null
group by s.SALESID
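
The effect of the extra GROUP BY column can be seen on a few rows. This is a minimal sketch using Python's built-in sqlite3 module with invented data. It uses a plain INNER JOIN, which is an assumption made for simplicity: the right outer join combined with "where s.SALESID is not null" in the query above keeps the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AX_CUSTPACKINGSLIPJOUR (SALESID TEXT);
    CREATE TABLE AX_SALESLINE (SALESID TEXT, ITEMID TEXT, LINEAMOUNT INTEGER);
    INSERT INTO AX_CUSTPACKINGSLIPJOUR VALUES ('S1'), ('S2');
    INSERT INTO AX_SALESLINE VALUES
        ('S1', 'A', 10), ('S1', 'B', 20), ('S2', 'C', 5);
""")

# Grouping by the summed column yields one row per distinct line amount.
per_amount = conn.execute("""
    SELECT s.SALESID, SUM(sp.LINEAMOUNT)
    FROM AX_CUSTPACKINGSLIPJOUR s
    INNER JOIN AX_SALESLINE sp ON s.SALESID = sp.SALESID
    GROUP BY s.SALESID, sp.LINEAMOUNT
    ORDER BY s.SALESID, sp.LINEAMOUNT
""").fetchall()

# Grouping by SALESID alone yields one total per sales order.
per_order = conn.execute("""
    SELECT s.SALESID, SUM(sp.LINEAMOUNT)
    FROM AX_CUSTPACKINGSLIPJOUR s
    INNER JOIN AX_SALESLINE sp ON s.SALESID = sp.SALESID
    GROUP BY s.SALESID
    ORDER BY s.SALESID
""").fetchall()

print(per_amount)
print(per_order)
```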

Left and right joining in a query

A friend asked me for help on building a query that would show how many pieces of each model were sold on each day of the month, showing zeros when no pieces were sold for a particular model on a particular day, even if no items of any model are sold on that day. I came up with the query below, but it isn't working as expected. I'm only getting records for the models that have been sold, and I don't know why.
select days_of_months.`Date`,
m.NAME as "Model",
count(t.ID) as "Count"
from MODEL m
left join APPLIANCE_UNIT a on (m.ID = a.MODEL_FK and a.NUMBER_OF_UNITS > 0)
left join NEW_TICKET t on (a.NEW_TICKET_FK = t.ID and t.TYPE = 'SALES'
and t.SALES_ORDER_FK is not null)
right join (select date(concat(2009,'-',temp_months.id,'-',temp_days.id)) as "Date"
from temp_months
inner join temp_days on temp_days.id <= temp_months.last_day
where temp_months.id = 3 -- March
) days_of_months on date(t.CREATION_DATE_TIME) =
date(days_of_months.`Date`)
group by days_of_months.`Date`,
m.ID, m.NAME
I had created the temporary tables temp_months and temp_days in order to get all the days for any month. I am using MySQL 5.1, but I am trying to make the query ANSI-compliant.
You should CROSS JOIN your dates and models so that you have exactly one record for each day-model pair no matter what, and then LEFT JOIN other tables:
SELECT date, name, COUNT(t.id)
FROM (
SELECT ...
) AS days_of_months
CROSS JOIN
model m
LEFT JOIN
APPLIANCE_UNIT a
ON a.MODEL_FK = m.id
AND a.NUMBER_OF_UNITS > 0
LEFT JOIN
NEW_TICKET t
ON t.id = a.NEW_TICKET_FK
AND t.TYPE = 'SALES'
AND t.SALES_ORDER_FK IS NOT NULL
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
GROUP BY
date, name
The way you do it now, you get NULLs in model_id for the days with no sales, and they are grouped together.
Note the JOIN condition:
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
instead of
DATE(t.CREATION_DATE_TIME) = DATE(days_of_months.`Date`)
This will help make your query sargable (able to use an index on CREATION_DATE_TIME), because the column is no longer wrapped in a function.
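
The CROSS JOIN idea can be tried out with the sketch below, which uses Python's built-in sqlite3 module. The schema is simplified and hypothetical: a single sale table stands in for APPLIANCE_UNIT and NEW_TICKET, a days table stands in for the days_of_months subquery, and SQLite's date(x, '+1 day') replaces MySQL's + INTERVAL 1 DAY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE days  (day TEXT);
    CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical table combining the unit and ticket tables.
    CREATE TABLE sale  (id INTEGER PRIMARY KEY, model_id INTEGER, sold_at TEXT);
    INSERT INTO days VALUES ('2009-03-01'), ('2009-03-02');
    INSERT INTO model VALUES (1, 'X'), (2, 'Y');
    INSERT INTO sale (model_id, sold_at) VALUES
        (1, '2009-03-01 10:00:00'), (1, '2009-03-01 15:30:00');
""")

rows = conn.execute("""
    SELECT d.day, m.name, COUNT(s.id) AS sold
    FROM days d
    CROSS JOIN model m                       -- one row per day-model pair
    LEFT JOIN sale s
           ON s.model_id = m.id
          AND s.sold_at >= d.day             -- half-open range on the raw column
          AND s.sold_at <  date(d.day, '+1 day')
    GROUP BY d.day, m.name
    ORDER BY d.day, m.name
""").fetchall()

for row in rows:
    print(row)
```

Every day-model pair appears in the output, including model Y and the second day, neither of which has any sales.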
You need to use outer joins, as they do not require each record in the two joined tables to have a matching record.
http://dev.mysql.com/doc/refman/5.1/en/join.html
You're looking for an OUTER join. A left outer join creates a result set with a record from the left side of the join even if the right side does not have a record to be joined with. A right outer join does the same in the opposite direction: it creates a record for the right-side table even if the left side does not have a corresponding record. Any column projected from the table that does not have a record will have a NULL value in the join result.