Limiting join results - sql

Apologies for the vague question, and if this has been asked before; I had a hard time figuring out how to articulate it.
I have three tables: Lot, Salesorder, and OrderLot:
Lot:
Lot     SKU   CreationDate
--------------------------
1000-a  1000  2017-04-12
1000-b  1000  2017-04-13
2000-a  2000  2017-04-12
2000-b  2000  2017-04-13
Salesorder:
SalesorderID  Revenue
---------------------
1             $500
2             $250
3             $125
OrderLot:
OrderLotID  SalesorderID  Lot
-----------------------------
1           1             1000-a
2           1             2000-a
3           2             1000-b
4           2             2000-b
5           3             1000-a
I'd like to do a join which gives me the total revenue generated given the creation date of the lots in the SalesOrder.
For example, I'd like to use the CreationDate of 2017-04-12 and get the result of $625 (Lots 1000-a and 2000-a were created on this date, and they were used to "fill" SalesorderIDs 1 and 3). But the joins I'm currently using return two rows for Salesorder 1 and one row for Salesorder 3, so the result is $1125.
How do I limit the rows returned from the OrderLot so that only unique Salesorder revenue is counted?
Thanks,
jeff
edit. current query is:
select sum(so.revenue)
from salesorder so
inner join orderlot ol on so.salesorderid = ol.salesorderid
inner join lot l on ol.lot = l.lot
where l.creationdate = '2017-04-12'

SELECT SUM(s.Revenue)
FROM SalesOrder s
INNER JOIN (
SELECT DISTINCT ol.SalesOrderID
FROM Lot l
INNER JOIN OrderLot ol ON ol.Lot = l.Lot
WHERE l.CreationDate = @CreationDate
) t ON t.SalesOrderID = s.SalesOrderID
Or:
SELECT SUM(s.Revenue)
FROM SalesOrder s
WHERE s.SalesOrderID IN (
SELECT DISTINCT ol.SalesOrderID
FROM Lot l
INNER JOIN OrderLot ol ON ol.Lot = l.Lot
WHERE l.CreationDate = @CreationDate
)
I find the second option with the IN() condition slightly easier to understand, but I tend to lean towards JOIN when possible, as it tends to perform a little better in my experience and it's easier to adapt it for something more complicated. And as always, if the performance matters that much you should actually profile the query and look at the execution plan. The optimizer can always surprise you.
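To see the difference concretely, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database (SQLite and Python's sqlite3 module are my choice for the demo, not something the question uses) and compares the plain join with the DISTINCT subquery. The `?` placeholder stands in for `@CreationDate`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Lot (Lot TEXT PRIMARY KEY, SKU TEXT, CreationDate TEXT);
    CREATE TABLE SalesOrder (SalesOrderID INTEGER PRIMARY KEY, Revenue INTEGER);
    CREATE TABLE OrderLot (OrderLotID INTEGER PRIMARY KEY,
                           SalesOrderID INTEGER, Lot TEXT);
    INSERT INTO Lot VALUES
        ('1000-a', '1000', '2017-04-12'), ('1000-b', '1000', '2017-04-13'),
        ('2000-a', '2000', '2017-04-12'), ('2000-b', '2000', '2017-04-13');
    INSERT INTO SalesOrder VALUES (1, 500), (2, 250), (3, 125);
    INSERT INTO OrderLot VALUES
        (1, 1, '1000-a'), (2, 1, '2000-a'), (3, 2, '1000-b'),
        (4, 2, '2000-b'), (5, 3, '1000-a');
""")

creation_date = "2017-04-12"

# Plain three-way join: sales order 1 matches two lots, so it is summed twice.
naive_total = conn.execute("""
    SELECT SUM(s.Revenue)
    FROM SalesOrder s
    INNER JOIN OrderLot ol ON ol.SalesOrderID = s.SalesOrderID
    INNER JOIN Lot l ON l.Lot = ol.Lot
    WHERE l.CreationDate = ?
""", (creation_date,)).fetchone()[0]

# Reduce the lots to a distinct list of sales orders first, then sum.
distinct_total = conn.execute("""
    SELECT SUM(s.Revenue)
    FROM SalesOrder s
    INNER JOIN (
        SELECT DISTINCT ol.SalesOrderID
        FROM Lot l
        INNER JOIN OrderLot ol ON ol.Lot = l.Lot
        WHERE l.CreationDate = ?
    ) t ON t.SalesOrderID = s.SalesOrderID
""", (creation_date,)).fetchone()[0]

print(naive_total, distinct_total)  # 1125 625
```

The first number is the over-count from the question; the second is the figure the asker expected.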

Related

SQL joins show same record 3 times, instead of 3 records

I am working on a join exercise from Database Processing by Kroenke and Auer
There is a question which asks to find all the items shipped from Singapore displaying information from 3 different tables.
In the table there are 3 results which match these criteria.
I have tried a where join and an inner join, but each time instead of giving 3 results, it gives 1 result 3 times, which makes me convinced I'm messing something up with my syntax.
Here's the where join:
select shippername, shipment.shipmentId, departuredate
FROM shipment, item, SHIPMENT_ITEM
WHERE shipment_item.shipmentID = shipment.shipmentID
AND item.itemId = shipment_Item.itemID
AND item.city = 'Singapore';
And the inner join:
select shippername, shipment.shipmentId, departuredate
FROM shipment
INNER JOIN shipment_item ON shipment_item.shipmentID = shipment.shipmentID
INNER JOIN item ON item.itemId = shipment_Item.itemID
WHERE item.city = 'Singapore'
order by shippername asc,
departuredate desc;
The result of both queries:
shippername shipmentId departuredate
----------------------------------- ----------- -------------
International 4 2013-06-02
International 4 2013-06-02
International 4 2013-06-02
You can also add DISTINCT after the SELECT, but there is probably something different in each row. Try selecting all columns, as Gordon suggested, to see what differs between them.
It looks like there are several items in one shipment.
So if you want shipment-level detail, try GROUP_CONCAT (a MySQL function) with GROUP BY.
SELECT shippername,
shipment.shipmentId,
departuredate ,
GROUP_CONCAT(item.itemId) AS items_shiped
FROM shipment
INNER JOIN shipment_item ON shipment_item.shipmentID = shipment.shipmentID
INNER JOIN item ON item.itemId = shipment_Item.itemID
WHERE item.city = 'Singapore'
GROUP BY shipment.shipmentId
order by shippername asc,
departuredate desc;
Hope this helps.
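A small sketch of why the same row comes back three times, again using an in-memory SQLite database from Python. The schema is cut down to the columns the queries touch, and the item IDs 1 to 3 are made up for illustration; only shipment 4, the shipper name, and the date come from the question's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment (shipmentId INTEGER PRIMARY KEY,
                           shippername TEXT, departuredate TEXT);
    CREATE TABLE item (itemId INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE shipment_item (shipmentId INTEGER, itemId INTEGER);
    INSERT INTO shipment VALUES (4, 'International', '2013-06-02');
    -- Hypothetical items: three Singapore items, all on shipment 4.
    INSERT INTO item VALUES (1, 'Singapore'), (2, 'Singapore'), (3, 'Singapore');
    INSERT INTO shipment_item VALUES (4, 1), (4, 2), (4, 3);
""")

# One output row per matching item, but only shipment columns are selected,
# so the three rows look identical.
rows = conn.execute("""
    SELECT shippername, shipment.shipmentId, departuredate
    FROM shipment
    INNER JOIN shipment_item ON shipment_item.shipmentId = shipment.shipmentId
    INNER JOIN item ON item.itemId = shipment_item.itemId
    WHERE item.city = 'Singapore'
""").fetchall()

# Collapse to one row per shipment and list the items that caused the repeats.
grouped = conn.execute("""
    SELECT shippername, shipment.shipmentId, departuredate,
           GROUP_CONCAT(item.itemId) AS items_shipped
    FROM shipment
    INNER JOIN shipment_item ON shipment_item.shipmentId = shipment.shipmentId
    INNER JOIN item ON item.itemId = shipment_item.itemId
    WHERE item.city = 'Singapore'
    GROUP BY shipment.shipmentId
""").fetchall()

print(len(rows), len(set(rows)))  # 3 rows, 1 distinct value
print(grouped)
```

So the join is not wrong: there really are three matches, and adding `item.itemId` to the select list would make the rows visibly different.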

How to write a SQL query that subtracts INNER JOIN results from LEFT JOIN results?

Here's an example: I want to see how well my marketing efforts are working for a product I'm trying to sell in a store. For instance, I want to know how many people bought my product within a month after they received a coupon for it in their email on 12/1/2014, compared to how many people bought my product in that same time period without ever receiving a coupon.
Here's a sample of my Customer table:
CUSTOMER_NUMBER PURCHASE_DATE
--------------- -------------
1 2014-12-02
2 2014-12-05
3 2014-12-05
4 2014-12-10
5 2014-12-21
Here's a sample of my Email table
CUSTOMER_NUMBER EMAIL_ADDR SEND_DATE
--------------- ------------ ----------
1 john@abc.com 2014-12-01
3 mary@xyz.com 2014-12-01
5 beth@def.com 2014-12-01
I have a pretty good idea how to determine who bought the product with the coupon: I use an inner join on the two tables. But in order to determine who bought the product anyway, even though they didn't have a coupon for whatever reason (they don't have email, they're part of a control group, etc.), I think I need to use a left join to get a result set, and then subtract the results of the inner join from my first result set. Alas, that is where I am stuck. In the example above, Customers 2 and 4 bought the product even though they never received a coupon, but I cannot figure out how to write a query to return that data.
I am using IBM's Netezza DB. Thank you!!
Use Left Outer Join with NULL check
SELECT C.*
FROM customer C
LEFT OUTER JOIN email e
ON C.customer_Number = E.customer_Number
WHERE E.customer_Number IS NULL
Or use Not Exists
SELECT *
FROM customer C
WHERE NOT EXISTS (SELECT 1
FROM email e
WHERE c.customer_number = e.customer_number)
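Both forms are anti-joins and should return the same customers. A quick check against the sample data, sketched with an in-memory SQLite database from Python (my stand-in for Netezza, which I have not tested against):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_number INTEGER, purchase_date TEXT);
    CREATE TABLE email (customer_number INTEGER, email_addr TEXT, send_date TEXT);
    INSERT INTO customer VALUES
        (1, '2014-12-02'), (2, '2014-12-05'), (3, '2014-12-05'),
        (4, '2014-12-10'), (5, '2014-12-21');
    INSERT INTO email VALUES
        (1, 'john@abc.com', '2014-12-01'),
        (3, 'mary@xyz.com', '2014-12-01'),
        (5, 'beth@def.com', '2014-12-01');
""")

# LEFT JOIN keeps every customer; unmatched ones have NULLs on the email side.
via_left_join = [r[0] for r in conn.execute("""
    SELECT c.customer_number
    FROM customer c
    LEFT OUTER JOIN email e ON c.customer_number = e.customer_number
    WHERE e.customer_number IS NULL
    ORDER BY c.customer_number
""")]

# NOT EXISTS states the same condition directly.
via_not_exists = [r[0] for r in conn.execute("""
    SELECT c.customer_number
    FROM customer c
    WHERE NOT EXISTS (SELECT 1 FROM email e
                      WHERE e.customer_number = c.customer_number)
    ORDER BY c.customer_number
""")]

print(via_left_join, via_not_exists)  # [2, 4] [2, 4]
```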
select c.* from customer c left outer join email e
on c.customer_number = e.customer_number
where e.customer_number is null
or c.purchase_date < e.send_date
SELECT
C.*,
CASE WHEN EXISTS (
SELECT *
FROM
Email E
WHERE
C.Customer_Number = E.Customer_Number
AND E.Send_Date <= C.Purchase_Date
AND C.Purchase_Date < DateAdd(day, 30, E.Send_Date)
) THEN 1 ELSE 0 END AS PurchasedWithin30DaysOfEmailedCoupon
FROM
Customer C
WHERE
C.Purchase_Date IS NOT NULL
;
Please forgive me for not knowing the correct syntax for adding 30 days to a date in your DBMS--I'm sure that will be a simple fix for you.
You can then simply group by the PurchasedWithin30DaysOfEmailedCoupon value and get your count.
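Here is a rough, runnable version of that flag-and-count idea, using SQLite from Python as a stand-in. The date arithmetic `date(send_date, '+30 days')` is SQLite syntax and will need translating for Netezza; the flag name `bought_after_coupon` is just a label I picked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_number INTEGER, purchase_date TEXT);
    CREATE TABLE email (customer_number INTEGER, email_addr TEXT, send_date TEXT);
    INSERT INTO customer VALUES
        (1, '2014-12-02'), (2, '2014-12-05'), (3, '2014-12-05'),
        (4, '2014-12-10'), (5, '2014-12-21');
    INSERT INTO email VALUES
        (1, 'john@abc.com', '2014-12-01'),
        (3, 'mary@xyz.com', '2014-12-01'),
        (5, 'beth@def.com', '2014-12-01');
""")

# Flag each purchase: 1 if a coupon email was sent on or before the purchase
# date and the purchase fell within 30 days of it, else 0. Then count per flag.
counts = conn.execute("""
    SELECT bought_after_coupon, COUNT(*) AS customers
    FROM (
        SELECT c.customer_number,
               CASE WHEN EXISTS (
                   SELECT 1 FROM email e
                   WHERE e.customer_number = c.customer_number
                     AND e.send_date <= c.purchase_date
                     AND c.purchase_date < date(e.send_date, '+30 days')
               ) THEN 1 ELSE 0 END AS bought_after_coupon
        FROM customer c
        WHERE c.purchase_date IS NOT NULL
    ) flagged
    GROUP BY bought_after_coupon
    ORDER BY bought_after_coupon
""").fetchall()

print(counts)  # [(0, 2), (1, 3)]
```

With the sample data that is two purchases without a coupon (customers 2 and 4) and three with one.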

Select all from max date

Good morning,
I am writing a SQL query for the latest metal prices with the latest date they were put into the database. Example table below:
ID DateCreated
1 01/01/01 01:01
2 01/01/01 01:02
3 01/01/01 01:03
4 01/01/01 01:04
1 02/01/01 01:01
2 02/01/01 01:02
So from this I want the following result:
ID DateCreated
1 02/01/01 01:01
2 02/01/01 01:02
When I run the query below, it just gives me the last row entered into the database, so from the above example it would be ID 2, DateCreated 02/01/01 01:02. The query I am using is:
SELECT mp.MetalSourceID, ROUND(mp.PriceInPounds,2),
mp.UnitPrice, mp.HighUnitPrice, mp.PreviousUnitPrice,
mp.PreviousHighUnitPrice, ms.MetalSourceName,
ms.UnitBasis, cu.Currency
FROM tblMetalPrice AS mp
INNER JOIN tblMetalSource AS ms
ON mp.MetalSourceID = ms.MetalSourceID
INNER JOIN tblCurrency AS cu
ON ms.CurrencyID = cu.CurrencyID
WHERE DateCreated = (SELECT MAX (DateCreated) FROM tblMetalPrice)
GROUP BY mp.MetalSourceID;
Could anyone please help? It's driving me crazy not knowing, and my brain is dead this Friday morning.
Thanks
Use a correlated subquery for the where clause:
WHERE DateCreated = (SELECT MAX(DateCreated) FROM tblMetalPrice mp2 WHERE mp2.id = mp.id)
You can join on a subquery, and I don't think you'll need the group by, or indeed the where clause (because that's handled by the join).
SELECT mp.MetalSourceID,
ROUND(mp.PriceInPounds,2),
mp.UnitPrice,
mp.HighUnitPrice,
mp.PreviousUnitPrice,
mp.PreviousHighUnitPrice,
ms.MetalSourceName,
ms.UnitBasis,
cu.Currency
FROM tblMetalPrice AS mp
INNER JOIN tblMetalSource AS ms
ON mp.MetalSourceID = ms.MetalSourceID
INNER JOIN tblCurrency AS cu
ON ms.CurrencyID = cu.CurrencyID
INNER JOIN (SELECT ID, MAX(DateCreated) AS maxdate FROM tblMetalPrice GROUP BY ID) AS md
ON md.ID = mp.ID
AND md.maxdate = mp.DateCreated
with maxDates as
(select max(datecreated) maxd, ids grp , count(1) members from s_tableA group by ids having count(1) > 1)
select ids, datecreated from s_tableA,maxDates
where maxd = datecreated and ids = grp;
This query will give your desired result. Correlated subqueries can consume a lot of processing time, because the inner query may be evaluated once for each row of the outer query.
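For anyone wanting to try the join-on-a-subquery approach, here is a minimal sketch in SQLite via Python. It keeps only the ID and DateCreated columns from the example table, and it assumes the dates are stored as sortable ISO text (I have read 02/01/01 as 2 January 2001, which is a guess). It also shows a second reading of the question, where only rows from the most recent day are wanted, since that is what the asker's expected output looks like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMetalPrice (ID INTEGER, DateCreated TEXT);
    INSERT INTO tblMetalPrice VALUES
        (1, '2001-01-01 01:01'), (2, '2001-01-01 01:02'),
        (3, '2001-01-01 01:03'), (4, '2001-01-01 01:04'),
        (1, '2001-01-02 01:01'), (2, '2001-01-02 01:02');
""")

# Latest row per ID: join back to a grouped subquery holding each ID's max date.
latest_per_id = conn.execute("""
    SELECT mp.ID, mp.DateCreated
    FROM tblMetalPrice AS mp
    INNER JOIN (SELECT ID, MAX(DateCreated) AS maxdate
                FROM tblMetalPrice
                GROUP BY ID) AS md
        ON md.ID = mp.ID AND md.maxdate = mp.DateCreated
    ORDER BY mp.ID
""").fetchall()

# Alternative reading: every row entered on the most recent calendar day.
latest_day = conn.execute("""
    SELECT ID, DateCreated
    FROM tblMetalPrice
    WHERE substr(DateCreated, 1, 10) =
          (SELECT MAX(substr(DateCreated, 1, 10)) FROM tblMetalPrice)
    ORDER BY ID
""").fetchall()

print(latest_per_id)
print(latest_day)
```

The first query returns one row for each of the four IDs; the second returns only IDs 1 and 2, matching the table in the question.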

Complicated sql query - select all landlords' properties and subtract costs in one query

I have a table data structure similar to this:
Landlord table
Id Name Email
-----------------------
1 J Johnson ...
2 R Kelly ...
Property table
Id Address Rent LandlordId
-----------------------------------
1 .... 400 1
2 .... 600 1
3 .... 750 2
Maintenance table
Id Details Cost MaintenanceDate PropertyId
-----------------------------------------------------
1 .... 25 20/12/2012 1
2 .... 120 22/12/2012 2
3 .... 35 24/12/2012 3
Essentially, a Landlord has multiple properties. Each month, I need to produce an invoice for the landlord which includes all his properties and all the maintenance done on them. To calculate how much I need to pay the landlord, I need to sum the Rent of all his properties and subtract from that the sum of all his maintenance costs for that month.
So, amount payable to landlord L = Sum(Rent of properties of L) - Sum(maintenance costs on all properties of L during this month)
I am using Telerik Reporting and thought I could achieve this with some clever grouping, but that was a waste of my time, so I am now going to try to achieve it with SQL and sub reports instead.
The SQL query that I'm trying is this:
SELECT l.Name, p.[Address], p.Rent, c.Details, c.Cost,
(select Rent From Property where Id = p.Id) -
(select SUM(cost) from Maintenance where PropertyId = p.Id)
as PayableToLandlord
FROM Landlord l JOIN PROPERTY p ON p.LandlordId = l.Id
LEFT OUTER JOIN Maintenance c ON c.PropertyId = p.Id
ORDER BY l.Name
This doesn't seem to work properly as it produces multiple fields.
Because I am going to break up the report into sub reports, I thought I would just get the landlord details first, but I still need to calculate the amount payable to the Landlord even in this case. So I rewrote the query to this:
SELECT distinct l.Name,
(SELECT SUM(Rent) FROM PROPERTY WHERE LandlordId = l.Id) -
(SELECT COALESCE(SUM(cost), 0)
FROM Maintenance WHERE PropertyId = p.Id) AS PayableToLandlord
FROM Landlord l
JOIN PROPERTY p ON p.LandlordId = l.Id
ORDER BY l.Name
I thought this worked okay, but even though I have used distinct, this seems to produce a duplicate row with different PayableToLandlord amount and I can't seem to figure out why.
Is there a way to select all landlords, their properties, and the amount payable to them in one query, please?
I have removed date where clause for the sake of simplicity here.
Thanks.
I suggest keeping it simple and using sub-queries, since it is easy to see what each one is doing. Hopefully you can then do your subtraction etc. either in the report or quite simply as additional return columns.
Any date parameters you should pass in as variables in to the sub-queries (probably).
SELECT Landlord.Id as LandlordId, Landlord.Name,
ISNULL(TotalRent,0) AS TotalRent,
ISNULL(TotalCost,0) AS TotalCost
FROM Landlord
LEFT JOIN
(SELECT SUM(Rent) as TotalRent, LandLordId
FROM Property
GROUP BY LandLordId) Rents
ON Landlord.Id = Rents.LandlordId
LEFT JOIN
(SELECT LandLordId,SUM(Cost) AS TotalCost
FROM Property
INNER JOIN Maintenance
ON Property.Id = Maintenance.PropertyId
GROUP BY LandLordId
) MaintenanceCosts
ON Landlord.Id = MaintenanceCosts.LandlordId
ORDER BY Landlord.Name
Just noticed you also wrote:
Is there a way to select all landlords, their properties, and the amount payable to them in one query, please?
You can further join in the list of properties but you'll end up repeating the sum on each line. This is fine as long as you don't try to SUM it within your reporting package.
If you do want the property details then my suggestion is to pull in each property rent and cost and then do the SUM / net figure in the reporting package.
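As a sanity check on the two-subquery approach, here is a sketch against the sample data using SQLite from Python. SQLite has no two-argument ISNULL, so I have swapped in COALESCE, and the subtraction is added as an extra column; the tables are cut down to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Landlord (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Property (Id INTEGER PRIMARY KEY, Rent INTEGER, LandlordId INTEGER);
    CREATE TABLE Maintenance (Id INTEGER PRIMARY KEY, Cost INTEGER, PropertyId INTEGER);
    INSERT INTO Landlord VALUES (1, 'J Johnson'), (2, 'R Kelly');
    INSERT INTO Property VALUES (1, 400, 1), (2, 600, 1), (3, 750, 2);
    INSERT INTO Maintenance VALUES (1, 25, 1), (2, 120, 2), (3, 35, 3);
""")

# Aggregate rent and cost per landlord separately, then join the two totals,
# so neither sum is multiplied by rows from the other table.
rows = conn.execute("""
    SELECT l.Name,
           COALESCE(r.TotalRent, 0) AS TotalRent,
           COALESCE(mc.TotalCost, 0) AS TotalCost,
           COALESCE(r.TotalRent, 0) - COALESCE(mc.TotalCost, 0) AS PayableToLandlord
    FROM Landlord l
    LEFT JOIN (SELECT LandlordId, SUM(Rent) AS TotalRent
               FROM Property
               GROUP BY LandlordId) r
        ON l.Id = r.LandlordId
    LEFT JOIN (SELECT p.LandlordId, SUM(m.Cost) AS TotalCost
               FROM Property p
               INNER JOIN Maintenance m ON p.Id = m.PropertyId
               GROUP BY p.LandlordId) mc
        ON l.Id = mc.LandlordId
    ORDER BY l.Name
""").fetchall()

for row in rows:
    print(row)
# ('J Johnson', 1000, 145, 855)
# ('R Kelly', 750, 35, 715)
```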
Here is a shorter query: Please comment.
select l.name, x.propid,
sum(x.tot) as finalrent
from landlord l
left join
(select p.landlordid, (p.rent - m.cost) as tot,
p.id as propid
from property p left join
maintenance m
on p.id = m.propertyid) as x
on
l.id = x.landlordid
group by l.id, x.propid
;
NAME PROPID FINALRENT
J Johnson 1 375
J Johnson 2 480
R Kelly 3 715
25/12/2012
Edit to add SQLFIDDLE as well as results group by landlord only
Query shows Total rent after cost is deducted.
SQLFIDDLE DEMO
Query:
-- group by landlord only
select l.id, l.name,
sum(x.tot) as finalrent,
sum(COALESCE(x.cost,0)) as TotalCost
from landlord l
left join
(select p.landlordid,
(p.rent - COALESCE(m.cost,0)) as tot,
p.id as propid, m.cost
from property p left join
maintenance m
on p.id = m.propertyid) as x
on l.id = x.landlordid
group by l.id
;
Results:
ID NAME FINALRENT TOTALCOST
1 R Johnson 880 120
2 R Kelly 715 35
27/12/2012
Joel's answer is very good. There is, however, a somewhat simpler way to express this. You can do the summaries in the subqueries at the property level, and then summarize at the landlord level in the outer query:
SELECT l.Id as LandlordId, l.Name,
coalesce(sum(p.Rent), 0) AS TotalRent,
coalesce(sum(m.PropertyCost), 0) AS TotalCost
FROM Landlord l left outer join
Property p
on l.Id = p.LandlordId left outer join
(SELECT PropertyId, SUM(Cost) AS PropertyCost
FROM Maintenance
group by PropertyId
) m
on m.PropertyId = p.Id
group by l.Id, l.Name
ORDER BY l.Name
This saves a join in the second subquery.

How can I join these 3 tables

I have 3 tables:
Trip 1 ---- M Promotion 1 ---- M PromotionCost
Sample data include:
TripID TripName Date
XYZ123 Hawaii 09/06/09
YTU574 Japan 09/09/09
GHR752 US 11/07/09
PromotionID TripID Name
1 XYZ123 Poster
2 XYZ123 Brochure
3 GHR752 TV ad
CostID PromotionID Cost
1 1 $50
2 1 $100
3 1 $120
4 3 $2000
5 2 $500
I'm trying to build a query like this:
TripID Number of Promotions Total Cost
XYZ123 2 $770
GHR752 1 $2000
What I have is this:
SELECT
Trip.TripID, Count(Trip.TripID) AS [Number Of Promotions], Sum(PromotionCost.Cost) AS SumOfCost
FROM
Trip
INNER JOIN
(Promotion
INNER JOIN
PromotionCost ON Promotion.PromotionID = PromotionCost.PromotionID
) ON Trip.TripID = Promotion.TripID
GROUP BY
Trip.TripID;
And it gives me something like this:
TripID Number of Promotions Total Cost
XYZ123 4 $770
GHR752 1 $2000
I'm not sure why the Number of Promotions is messed up like that for the first one (XYZ123). It seems that somehow the JOIN is affecting it because if I use this:
SELECT
Trip.TripID, Count(Trip.TripID) AS [Number Of Promotions]
FROM
Trip
INNER JOIN
Promotion ON Trip.TripID = Promotion.TripID
GROUP BY
Trip.TripID;
It gives me the right number of promotions which is just 2.
You can add up the cost for each promotion in a subquery. That way, you only get one row for each promotion, and COUNT works to calculate the number of promotions per trip. For example:
select
t.TripId
, count(p.PromotionId) as [Number of Promotions]
, sum(pc.PromotionCost) as [Total Cost]
from trip t
left join promotion p on p.TripId = t.TripId
left join (
select
PromotionId
, sum(cost) as PromotionCost
from PromotionCost
group by PromotionId
) pc on pc.PromotionId = p.PromotionId
group by t.TripId
In case MS Access does not allow subqueries, you can store the subquery in a view, and join on that.
You can try to compensate for the duplicate Promotion rows by using COUNT(DISTINCT):
SELECT Trip.TripID, Count(DISTINCT Promotion.PromotionID) AS [Number Of Promotions],
Sum(PromotionCost.Cost) AS SumOfCost
FROM Trip INNER JOIN Promotion ON Trip.TripID = Promotion.TripID
INNER JOIN PromotionCost ON Promotion.PromotionID = PromotionCost.PromotionID
GROUP BY Trip.TripID;
What's going on is that by default, COUNT() counts the rows produced after all joins have been done. There are four promotion costs for TripID XYZ123, so four rows, even though the TripId occurs multiple times among those four rows.
It's easier to visualize if you try a similar query without the GROUP BY:
SELECT Trip.TripID, Promotion.PromotionID, PromotionCost.Cost
FROM Trip INNER JOIN Promotion ON Trip.TripID = Promotion.TripID
INNER JOIN PromotionCost ON Promotion.PromotionID = PromotionCost.PromotionID;
You'll see the four rows for XYZ123 (with duplicate PromotionID values), and one row for GHR752.
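The effect is easy to reproduce with the sample data. The sketch below uses SQLite from Python rather than MS Access (SQLite does support COUNT(DISTINCT), so it only illustrates the row multiplication, not Access's limitations), and stores the costs as plain integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trip (TripID TEXT PRIMARY KEY, TripName TEXT);
    CREATE TABLE Promotion (PromotionID INTEGER PRIMARY KEY, TripID TEXT, Name TEXT);
    CREATE TABLE PromotionCost (CostID INTEGER PRIMARY KEY,
                                PromotionID INTEGER, Cost INTEGER);
    INSERT INTO Trip VALUES ('XYZ123', 'Hawaii'), ('YTU574', 'Japan'), ('GHR752', 'US');
    INSERT INTO Promotion VALUES
        (1, 'XYZ123', 'Poster'), (2, 'XYZ123', 'Brochure'), (3, 'GHR752', 'TV ad');
    INSERT INTO PromotionCost VALUES
        (1, 1, 50), (2, 1, 100), (3, 1, 120), (4, 3, 2000), (5, 2, 500);
""")

joins = """
    FROM Trip
    INNER JOIN Promotion ON Trip.TripID = Promotion.TripID
    INNER JOIN PromotionCost ON Promotion.PromotionID = PromotionCost.PromotionID
    GROUP BY Trip.TripID
"""

# COUNT over the joined rows: one row per cost entry, so XYZ123 counts 4.
naive = {trip: (n, total) for trip, n, total in conn.execute(
    "SELECT Trip.TripID, COUNT(Trip.TripID), SUM(PromotionCost.Cost)" + joins)}

# COUNT(DISTINCT ...) collapses the repeated PromotionID values back to 2.
fixed = {trip: (n, total) for trip, n, total in conn.execute(
    "SELECT Trip.TripID, COUNT(DISTINCT Promotion.PromotionID),"
    " SUM(PromotionCost.Cost)" + joins)}

print(naive)
print(fixed)
```

The totals are the same in both results (770 and 2000); only the promotion count for XYZ123 changes from 4 to 2.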
Re comments that MS Access doesn't support COUNT(DISTINCT): if that's the case, then you shouldn't do this in a single query. Do it in two queries:
SELECT Trip.TripID, SUM(PromotionCost.Cost) AS SumOfCost
FROM Trip INNER JOIN Promotion ON Trip.TripID = Promotion.TripID
INNER JOIN PromotionCost ON Promotion.PromotionID = PromotionCost.PromotionID
GROUP BY Trip.TripID;
SELECT Trip.TripID, Count(Promotion.PromotionID) AS [Number Of Promotions]
FROM Trip INNER JOIN Promotion ON Trip.TripID = Promotion.TripID
GROUP BY Trip.TripID;
The alternative is a very convoluted solution using subqueries, described in this article at Microsoft:
http://blogs.msdn.com/access/archive/2007/09/19/writing-a-count-distinct-query-in-access.aspx
Not the answer to your question but a useful recommendation (I hope): convert your query into a view by using the visual designer of SQL Server Management Studio, and examine the generated SQL code. You don't have to actually keep and use the generated view, but it is a good way of learning by example. I do that whenever I'm struggling with a complex query.
EDIT. Shame on me, I hadn't read the tags: the question is MS-Access related, not SQL Server related. Anyway, I think my advice is still valid as far as concept-learning is concerned, since the SQL syntax is similar.