How to fix group by logic in subquery? - sql

I have the following 2 example queries plus their result tables (dummy data) below:
SELECT
subs.Region
,subs.Product
,SUM(p.Price) TotalPriceA
FROM dbo.submission_dtl subs
JOIN dbo.price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
GROUP BY subs.Region, subs.Product
Region  Product  TotalPriceA
USA     cameras  200
USA     phones   300
Canada  cameras  300
Canada  phones   500
SELECT
r.Region
,r.Product
,SUM(rp.Price) TotalPriceB
FROM dbo.report_dtl r
JOIN dbo.report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
GROUP BY r.Region, rp.Product
Region  Product  TotalPriceB
USA     cameras  201
USA     phones   301
Canada  cameras  301
Canada  phones   501
I want to join them so that the result table resembles this:
Region  Product  TotalPriceA  TotalPriceB
USA     cameras  200          201
USA     phones   300          301
Canada  cameras  300          301
Canada  phones   500          501
But when I used this query, I got a result table that resembled this:
SELECT
subs.Region
,subs.Product
,SUM(p.Price) TotalPriceA
,rptotal.TotalPriceB
FROM dbo.submission_dtl subs
JOIN dbo.price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
JOIN
(
SELECT
r.Product
,SUM(rp.Price) TotalPriceB
FROM dbo.report_dtl r
JOIN dbo.report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
GROUP BY rp.Product
) rptotal on subs.Product = rptotal.Product
GROUP BY subs.Region, subs.Product, rptotal.TotalPriceB
Region  Product  TotalPriceA  TotalPriceB
USA     cameras  200          502
USA     phones   300          802
Canada  cameras  300          502
Canada  phones   500          802
When I group the subquery by region as well, I get even worse results...

You can aggregate each query in its own subquery first, and then join the two results on Region and Product:
SELECT t1.Region,
t1.Product,
t2.TotalPriceA,
t1.TotalPriceB
FROM (
SELECT
r.Region
,r.Product
,SUM(rp.Price) TotalPriceB
FROM dbo.report_dtl r
JOIN dbo.report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
GROUP BY r.Region, r.Product
) t1 INNER JOIN (
SELECT
subs.Region
,subs.Product
,SUM(p.Price) TotalPriceA
FROM dbo.submission_dtl subs
JOIN dbo.price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
GROUP BY subs.Region, subs.Product
) t2 ON t1.Region = t2.Region AND t1.Product = t2.Product
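To see the "aggregate first, then join" shape in action, here is a minimal sketch that runs against an in-memory SQLite database from Python. The column layout of the four tables and all the rows are assumptions made up for illustration (the question only shows dummy result tables), and SQLite stands in for SQL Server.

```python
import sqlite3

# In-memory database with made-up rows; the table layouts are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submission_dtl (SubmissionNumber INTEGER, Region TEXT, Product TEXT);
CREATE TABLE price_dtl      (SubmissionNumber INTEGER, Price INTEGER);
CREATE TABLE report_dtl     (SubmissionNumber INTEGER, Region TEXT, Product TEXT);
CREATE TABLE report_price   (SubmissionNumber INTEGER, Price INTEGER);

INSERT INTO submission_dtl VALUES (1,'USA','cameras'),(2,'USA','phones'),
                                  (3,'Canada','cameras'),(4,'Canada','phones');
INSERT INTO price_dtl      VALUES (1,100),(1,100),(2,300),(3,300),(4,500);
INSERT INTO report_dtl     SELECT * FROM submission_dtl;
INSERT INTO report_price   VALUES (1,201),(2,301),(3,301),(4,501);
""")

# Each side is reduced to one row per (Region, Product) before the join,
# so the join cannot multiply rows and inflate either total.
rows = conn.execute("""
SELECT t2.Region, t2.Product, t2.TotalPriceA, t1.TotalPriceB
FROM (
    SELECT r.Region, r.Product, SUM(rp.Price) AS TotalPriceB
    FROM report_dtl r
    JOIN report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
    GROUP BY r.Region, r.Product
) t1
JOIN (
    SELECT s.Region, s.Product, SUM(p.Price) AS TotalPriceA
    FROM submission_dtl s
    JOIN price_dtl p ON s.SubmissionNumber = p.SubmissionNumber
    GROUP BY s.Region, s.Product
) t2 ON t1.Region = t2.Region AND t1.Product = t2.Product
ORDER BY t2.Region, t2.Product
""").fetchall()

for row in rows:
    print(row)
```

The key point is that both subqueries are grouped by the same two columns that the join uses, which is what the query in the question was missing.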

Perhaps a group by is not what is required here, at least not for the final result. Have you considered using the pivot clause instead? As DRapp stated, you might need a union to combine the two queries. Your group by is only required to summarise the total values beforehand; the pivot takes care of the final layout.
In this example, I'm using a table variable to consolidate all the information and then the pivot. Look closely and you'll see that one of the columns (type) holds a constant for each query. Also, in my experience table variables work better with nullable columns, regardless of the actual data source.
Declare @myData Table (
region varchar(max) null,
product varchar(max) null,
type varchar(max) null,
totalPrice money null
)
--The type is the constant to know whether it's A or B
Insert Into @myData(region, product, type, totalPrice)
Select Region, Product, 'TotalPriceA', Sum(Price)
From <your tables here>
Group By region, product
--Repeat for total B.
Insert Into @myData(region, product, type, totalPrice)
Select Region, Product, 'TotalPriceB', Sum(Price)
From <your tables here>
Group By region, product
--Now myData table has all the information.
--You only need the output format
Select region, product, [TotalPriceA], [TotalPriceB]
From @myData
Pivot (
Sum(totalPrice)
For type In ([TotalPriceA], [TotalPriceB])
) As Result
Hope this helps. As you can see, the constant values in column type become the column titles in the final result. You will get null values if one "cell" in the final table doesn't have a corresponding value for that row/column match.
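If your database has no PIVOT clause, the same reshaping can be done with conditional aggregation. The sketch below is a stand-in for the pivot, not the pivot itself: it uses SQLite from Python, and the table name my_data, its columns and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_data (region TEXT, product TEXT, type TEXT, total_price INTEGER);
-- Canada/phones deliberately has no TotalPriceB row
INSERT INTO my_data VALUES
    ('USA',    'cameras', 'TotalPriceA', 200),
    ('USA',    'cameras', 'TotalPriceB', 201),
    ('Canada', 'phones',  'TotalPriceA', 500);
""")

# One SUM(CASE ...) per constant in the type column; each constant
# becomes a column title, just as in the pivot.
pivoted = conn.execute("""
SELECT region, product,
       SUM(CASE WHEN type = 'TotalPriceA' THEN total_price END) AS TotalPriceA,
       SUM(CASE WHEN type = 'TotalPriceB' THEN total_price END) AS TotalPriceB
FROM my_data
GROUP BY region, product
ORDER BY region, product
""").fetchall()

print(pivoted)
```

As with the pivot, a region/product pair that has no row for one of the types comes back with NULL (None in Python) in that column.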

Related

FULL OUTER JOIN Not Working As Expected ON Two Equalities

I want to combine the two tables below in BigQuery using a full outer join. Table A does not have certain products that I need to bring over from table B, but when I join on campaign & subcampaign, my join is not bringing over the 'Cellphone' data. My results look more like a left join. See below for my query:
SELECT
a.campaign
, a.subcampaign
, a.product
, sum(sales)
, sum(cost)
FROM
(
SELECT
campaign
, subcampaign
, product
, sum(sales)
FROM
table_a
GROUP BY
1, 2, 3
) a
FULL OUTER JOIN
(
SELECT
campaign
, subcampaign
, product
, sum(cost)
FROM
table_b
GROUP BY 1,2,3
) b
ON
a.campaign = b.campaign
AND a.subcampaign = b.subcampaign
GROUP BY
1,2,3
Table a
Campaign    Subcampaign  Product  Sales
Campaign 1  Store 581    Gaming   $50
Campaign 1  Store 583    TV       $100
Table b
Campaign    Subcampaign  Product    Cost
Campaign 1  Store 581    Gaming     $25
Campaign 1  Store 583    TV         $75
Campaign 1  Store 584    Cellphone  $10
Desired result:
Campaign    Subcampaign  Product    Sales  Cost
Campaign 1  Store 581    Gaming     $50    $25
Campaign 1  Store 583    TV         $100   $75
Campaign 1  Store 584    Cellphone  NULL   $10
I think the problem is likely your select clause, not the join.
I suspect the confusion is that you are selecting a.campaign (etc.) even in cases where the join does not match anything in table_a. If there is no match in table_a, a.campaign/a.subcampaign/a.product will all be null.
You probably want something more like the following in your outer query:
SELECT
COALESCE(a.campaign, b.campaign)
, COALESCE(a.subcampaign, b.subcampaign)
, COALESCE(a.product, b.product)
, sum(sales)
, sum(cost)
[...]
GROUP BY
COALESCE(a.campaign, b.campaign)
, COALESCE(a.subcampaign, b.subcampaign)
, COALESCE(a.product, b.product)
This way, if a.campaign (etc.) is null, it will fall back on b.campaign. This is safe, since we know that if both have values they must be equal.
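Here is a small sketch of that fallback, using the sample rows from the question in an in-memory SQLite database. Older SQLite builds have no FULL OUTER JOIN, so as an assumption for this demo it uses a LEFT JOIN starting from table_b instead; the columns from table_a come back NULL for an unmatched row in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (campaign TEXT, subcampaign TEXT, product TEXT, sales INTEGER);
CREATE TABLE table_b (campaign TEXT, subcampaign TEXT, product TEXT, cost INTEGER);
INSERT INTO table_a VALUES ('Campaign 1','Store 581','Gaming',50),
                           ('Campaign 1','Store 583','TV',100);
INSERT INTO table_b VALUES ('Campaign 1','Store 581','Gaming',25),
                           ('Campaign 1','Store 583','TV',75),
                           ('Campaign 1','Store 584','Cellphone',10);
""")

# Compare the raw a.product with the COALESCE version side by side.
rows = conn.execute("""
SELECT a.product                      AS product_from_a,
       COALESCE(a.product, b.product) AS product_coalesced,
       a.sales, b.cost
FROM table_b b
LEFT JOIN table_a a
       ON a.campaign = b.campaign AND a.subcampaign = b.subcampaign
ORDER BY b.subcampaign
""").fetchall()

for row in rows:
    print(row)
```

The Store 584 row is present in the join result either way; it only looks missing when the select list reads the key columns from table_a alone.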
You are aggregating before joining the two tables, which is why you might not be getting any null values in the result. Try selecting all values from the join and then aggregating to get the result you want like below:
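SELECT
ab.campaign, ab.subcampaign, ab.product, SUM(ab.sales), SUM(ab.cost)
FROM (
SELECT
COALESCE(a.campaign, b.campaign) AS campaign
, COALESCE(a.subcampaign, b.subcampaign) AS subcampaign
, COALESCE(a.product, b.product) AS product
, a.sales
, b.cost
FROM
table_a a
FULL OUTER JOIN
table_b b
ON
a.campaign = b.campaign
AND a.subcampaign = b.subcampaign ) ab
GROUP BY
1,2,3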

Complex SQL Query - Joining 5 tables with complex conditions

I have the following tables: Reservations, Order-Lines, Order-Header, Product, Customer. Just a little explanation on each of these tables:
Reservations Contains "reservations" for a billing customer/product combination.
Order-Lines Contains line item detail for orders, including the product they ordered and the qty.
Order-Header Contains header info for orders including the date, customer and billing customer
Product Contains product detail information
Customer Contains Customer detail information.
Below are the tables with their associated fields and sample data:
Reservation
bill-cust-key prod-key qty-reserved reserve-date
10000 20000 10 05/30/2014
10003 20000 5 06/20/2014
10003 20001 15 06/20/2014
10003 20001 5 06/25/2014
10002 20001 5 06/21/2014
10002 20002 20 06/21/2014
Order-Item
order-num cust-key prod-key qty-ordered
30000 10000 20000 10
30000 10000 20001 5
30001 10001 20001 10
30002 10001 20001 5
30003 10002 20003 20
Order-Header
order-num cust-key bill-cust-key order-date
30000 10000 10000 07/01/2014
30001 10001 10003 07/03/2014
30002 10001 10003 07/15/2014
30003 10002 10002 07/20/2014
Customer
cust-key cust-name
10000 Customer A
10001 Customer B
10002 Customer C
10003 Customer D
Product
prod-key prod-name
20000 Prod A
20001 Prod B
20002 Prod C
20003 Prod D
I am attempting to write a query that will show me customer/product combinations that exist in both the reservation and order-item tables. A little snafu is that we have a customer and a billing customer. The reservation and order-header tables contain both the customers, but the order-item table only contains the customer. The results should display the billing customer. Additionally, there can be several reservations and order-items for the same customer/product combination, so I would like to show a total sum of the qty-reserved and the qty-ordered.
Below is an example of my desired output:
bill-cust-key cust-name prod-key prod-name qty-ordered qty-reserved
10000 Customer A 20000 Prod A 10 10
10003 Customer D 20001 Prod B 15 20
This is the query that I have tried and doesn't seem to be working for me.
SELECT customer.cust-key, customer.cust-name, product.prod-key, prod.prod-name,
SUM(order-item.qty-ordered), SUM(reservation.qty-reserved)
FROM ((reservation INNER JOIN order-item on reservation.prod-key = order-item.product-key)
INNER JOIN order-header on reservation.bill-cust-key = order-header.bill-cust-key and
order-item.order-num = order-header.order-num), customer, product
WHERE customer.cust-key = reservation.bill-cust-key
AND product.prod-key = reservation.prod-key
GROUP BY customer.cust-key, customer.cust-name, product.prod-key, product.prod-name
I'm sorry for such a long post! I just wanted to make sure that I had my bases covered!
You want to join your tables like this:
from reservation res join order-header oh on res.bill-cust-key = oh.bill-cust-key
join order-item oi on oi.order-num = oh.order-num
and oi.prod-key = res.prod-key
/* join customer c on c.cust-key = oi.cust-key old one */
join customer c on c.cust-key = oh.bill-cust-key
join product p on p.prod-key = oi.prod-key
I find that it can be very helpful to separate out your output rows from your aggregate rows by using CROSS APPLY (or OUTER APPLY) or simply an aliased inner query if you don't have access to those.
For example,
SELECT
customer.cust-key,
customer.cust-name,
tDetails.prod-key,
tDetails.prod-name,
tDetails.qty-ordered,
tDetails.qty-reserved
FROM customer
--note that this could be an inner-select table in which you join if not cross-join
CROSS APPLY (
SELECT
product.prod-key,
product.prod-name,
SUM(order-item.qty-ordered) as qty-ordered,
SUM(reservation.qty-reserved) as qty-reserved
FROM reservation
INNER JOIN order-item ON reservation.prod-key = order-item.prod-key
INNER JOIN product ON reservation.prod-key = product.prod-key
WHERE
reservation.bill-cust-key = customer.cust-key
GROUP BY product.prod-key, product.prod-name
) tDetails
There are many ways to slice this, but you started out the right way saying "what recordset do I want returned". I like the above because it helps me visualize what each 'query' is doing. The inner query marked by the CROSS apply is simply grouping by prod orders and reservations but is filtering by the current customer in the outer top-most query.
Also, I would keep joins out of the 'WHERE' clause. Use the 'WHERE' clause for non-primary key filtering (e.g. cust-name = 'Bob'). I find it helps to say that one is a table join, the 'WHERE' clause is a property filter.
TAKE 2 - using inline queries
This approach still tries to get a list of customers with distinct products, and then uses that data to form the outer query from which you can get aggregates.
SELECT
customer.cust-key,
customer.cust-name,
products.prod-key,
products.prod-name,
--aggregate for orders
( SELECT SUM(order-item.qty-ordered)
FROM order-item
WHERE
order-item.cust-key = customer.cust-key AND
order-item.prod-key = products.prod-key) AS qty-ordered,
--aggregate for reservations
( SELECT SUM(reservations.qty-reserved)
FROM reservations
--join up billingcustomers if they are different from customers here
WHERE
reservations.bill-cust-key = customer.cust-key AND
reservations.prod-key = products.prod-key) AS qty-reserved
FROM customer
--get a table of distinct products across orders and reservations
--join products table for name
CROSS JOIN (
SELECT DISTINCT order-item.prod-key FROM order-item
UNION
SELECT DISTINCT reservations.prod-key FROM reservations
) tDistinctProducts
INNER JOIN products ON products.prod-key = tDistinctProducts.prod-key
TAKE 3 - Derived Tables
According to some quick googling, Progress DB does support derived tables. This approach has largely been replaced with CROSS APPLY (or OUTER APPLY) because you don't need to do the grouping. However, if your db only supports this way then so be it.
SELECT
customer.cust-key,
customer.cust-name,
products.prod-key,
products.prod-name,
tOrderItems.SumQtyOrdered,
tReservations.SumQtyReserved
FROM customer
--get a table of distinct products across orders and reservations
--join products table for name
CROSS JOIN (
SELECT DISTINCT order-item.prod-key FROM order-item
UNION
SELECT DISTINCT reservations.prod-key FROM reservations
) tDistinctProducts
INNER JOIN products ON products.prod-key = tDistinctProducts.prod-key
--derived table for order-items
LEFT OUTER JOIN ( SELECT
order-item.cust-key,
order-item.prod-key,
SUM(order-item.qty-ordered) AS SumQtyOrdered
FROM order-item
GROUP BY
order-item.cust-key,
order-item.prod-key) tOrderItems ON
tOrderItems.cust-key = customer.cust-key AND
tOrderItems.prod-key = products.prod-key
--derived table for reservations
LEFT OUTER JOIN ( SELECT
reservations.bill-cust-key,
reservations.prod-key,
SUM(reservations.qty-reserved) AS SumQtyReserved
FROM reservations
--join up billingcustomers if they are different from customers here
GROUP BY
reservations.bill-cust-key,
reservations.prod-key) tReservations ON
tReservations.bill-cust-key = customer.cust-key AND
tReservations.prod-key = products.prod-key
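To check the derived-table idea against the sample data from the question, here is a minimal sketch that runs in SQLite from Python. It is not Progress syntax: the hyphenated names are rewritten with underscores, the date columns are left out, and inner joins are used (rather than a cross join plus outer joins) because only combinations present on both sides are wanted. Orders are summed per billing customer by going through order_header.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reservation  (bill_cust_key INTEGER, prod_key INTEGER, qty_reserved INTEGER);
CREATE TABLE order_item   (order_num INTEGER, cust_key INTEGER, prod_key INTEGER, qty_ordered INTEGER);
CREATE TABLE order_header (order_num INTEGER, cust_key INTEGER, bill_cust_key INTEGER);
CREATE TABLE customer     (cust_key INTEGER, cust_name TEXT);
CREATE TABLE product      (prod_key INTEGER, prod_name TEXT);

INSERT INTO reservation VALUES (10000,20000,10),(10003,20000,5),(10003,20001,15),
                               (10003,20001,5),(10002,20001,5),(10002,20002,20);
INSERT INTO order_item VALUES (30000,10000,20000,10),(30000,10000,20001,5),
                              (30001,10001,20001,10),(30002,10001,20001,5),
                              (30003,10002,20003,20);
INSERT INTO order_header VALUES (30000,10000,10000),(30001,10001,10003),
                                (30002,10001,10003),(30003,10002,10002);
INSERT INTO customer VALUES (10000,'Customer A'),(10001,'Customer B'),
                            (10002,'Customer C'),(10003,'Customer D');
INSERT INTO product VALUES (20000,'Prod A'),(20001,'Prod B'),
                           (20002,'Prod C'),(20003,'Prod D');
""")

rows = conn.execute("""
SELECT c.cust_key, c.cust_name, p.prod_key, p.prod_name,
       o.qty_ordered, r.qty_reserved
FROM (  -- orders summed per billing customer and product
    SELECT oh.bill_cust_key, oi.prod_key, SUM(oi.qty_ordered) AS qty_ordered
    FROM order_item oi
    JOIN order_header oh ON oh.order_num = oi.order_num
    GROUP BY oh.bill_cust_key, oi.prod_key
) o
JOIN (  -- reservations summed per billing customer and product
    SELECT bill_cust_key, prod_key, SUM(qty_reserved) AS qty_reserved
    FROM reservation
    GROUP BY bill_cust_key, prod_key
) r ON r.bill_cust_key = o.bill_cust_key AND r.prod_key = o.prod_key
JOIN customer c ON c.cust_key = o.bill_cust_key
JOIN product  p ON p.prod_key = o.prod_key
ORDER BY c.cust_key, p.prod_key
""").fetchall()

for row in rows:
    print(row)
```

Summing each side before joining matters here: Customer D has two reservations and two order lines for Prod B, so joining the raw rows first would pair them up four ways and double both totals.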
Based on your original code and request, here's the starting point of a Progress solution -
DEFINE VARIABLE iQtyOrd AS INTEGER NO-UNDO.
DEFINE VARIABLE iQtyReserved AS INTEGER NO-UNDO.
FOR EACH order-item
NO-LOCK,
EACH order-header
WHERE order-header.order-num = order-item.order-num
NO-LOCK,
EACH reservation
WHERE reservation.prod-key = order-item.prod-key AND
reservation.bill-cust-key = order-header.bill-cust-key
NO-LOCK,
EACH product
WHERE product.prod-key = reservation.prod-key
NO-LOCK,
EACH customer
WHERE customer.cust-key = reservation.bill-cust-key
NO-LOCK
BREAK BY customer.cust-key
BY product.prod-key
BY product.prod-name
:
IF FIRST-OF(customer.cust-key) OR FIRST-OF(product.prod-key) THEN
ASSIGN
iQtyOrd = 0
iQtyReserved = 0
.
ASSIGN
iQtyOrd = iQtyOrd + order-item.qty-ordered
iQtyReserved = iQtyReserved + reservation.qty-reserved
.
IF LAST-OF(customer.cust-key) OR LAST-OF(product.prod-key) THEN
DISPLAY
customer.cust-key
customer.cust-name
product.prod-key
product.prod-name
iQtyOrd
iQtyReserved
WITH FRAME f-qty
DOWN
.
END.

SQL Syntax Issue with getting sum

Ok I have two tables.
Table IDAssoc has the columns bill_id, year, area_id.
Table Bill has the columns bill_id, year, main_id, and amount_due.
I'm trying to get the sum of the amount_due column from the bill table for each of the associated area_ids in the IDAssoc table.
I'm doing a select statement to select the sum and joining on the bill_ids. How can I set this up so it will have a single row for each of the associated bills in each area_id from the assoc table. There may be three or four bill_ids associated with each area_id and I need those summed for each and returned so I can use this select in another statement. I have a group by set up for the area_id but it still is returning each row and not summing them up for each area_id. I have the year and main_id specified already in the where clause to return the data that I want, but I can't get the sum to work properly. Sorry I'm still learning and I'm not sure how to do this. Thanks!
Edit - The query I'm trying so far is basically like the one posted below:
select a.area_id, sum(b.amount_due)
from IDAssoc a
inner join Bill b
on a.bill_id = b.bill_id
where Bill.year = 2006 and bill.bill_id = 11111
These are just arbitrary numbers.
The data this is returning is like this:
amount_due - area_id
.05 1003
.15 1003
.11 1003
65 1004
55 1004
I need one row returned for each area_id with the amount_due summed. The area_id is only in the assoc table and not in the bill table.
select a.area_id, sum(b.amount_due)
from IDAssoc a
inner join Bill b
on a.bill_id = b.bill_id
where b.year = 2006 and b.bill_id = 11111
group by a.area_id
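You might want to change inner join to left join if an IDAssoc row can have no matching Bill. In that case put the filters on Bill into the ON clause, because a WHERE condition on b's columns would discard the unmatched rows again:
select a.area_id, coalesce(sum(b.amount_due),0)
from IDAssoc a
left join Bill b
on a.bill_id = b.bill_id
and b.year = 2006 and b.bill_id = 11111
group by a.area_id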
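One thing to watch with the left join: where the filter on Bill goes changes the result. The sketch below uses SQLite from Python with made-up rows (whole-number amounts, and only the year filter, to keep it short) to compare a filter in WHERE with the same filter in ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IDAssoc (bill_id INTEGER, year INTEGER, area_id INTEGER);
CREATE TABLE Bill    (bill_id INTEGER, year INTEGER, main_id INTEGER, amount_due INTEGER);
INSERT INTO IDAssoc VALUES (1,2006,1003),(2,2006,1003),(3,2006,1004),(4,2006,1005);
-- bill 4 belongs to 2005, so area 1005 has nothing due for 2006
INSERT INTO Bill    VALUES (1,2006,7,5),(2,2006,7,15),(3,2006,7,65),(4,2005,7,40);
""")

# Filter in WHERE: rows without a 2006 bill are removed after the join.
filter_in_where = conn.execute("""
SELECT a.area_id, COALESCE(SUM(b.amount_due), 0)
FROM IDAssoc a
LEFT JOIN Bill b ON a.bill_id = b.bill_id
WHERE b.year = 2006
GROUP BY a.area_id
ORDER BY a.area_id
""").fetchall()

# Filter in ON: every area_id survives, with 0 when nothing matches.
filter_in_on = conn.execute("""
SELECT a.area_id, COALESCE(SUM(b.amount_due), 0)
FROM IDAssoc a
LEFT JOIN Bill b ON a.bill_id = b.bill_id AND b.year = 2006
GROUP BY a.area_id
ORDER BY a.area_id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

Either way you get one summed row per area_id thanks to the GROUP BY; the difference is only whether areas with no qualifying bill show up at all.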
You are missing the GROUP BY clause:
SELECT a.area_id, SUM(b.amount_due) TotalAmount
FROM IDAssoc a
LEFT JOIN Bill b
ON a.bill_id = b.bill_id
GROUP BY a.area_id

Sum of calculated field returns wrong result in MS Access query?

I have these 2 tables:
Table1:
CustomerID Area Type Revenue
1 Europe Institutional Clients 10
2 Asia Institutional Clients 10
3 USA Institutional Clients 10
Table2:
Report Country Type Rate
DK Institutional Clients 2
SE Institutional Clients 2
FI Institutional Clients 2
I want to make a query that joins the two tables and make a calculated field (Revenue*Rate). But when I use the MS Access query designer the sum of calculated field returns the wrong result.
Query version1:
This query returns 20 per customer (which is correct) and 60 in total, but the fields are not grouped into 1 row. (If I remove the fields CustomerID and Area I get 1 row, but the result says 20?! See version1B below.)
SELECT t_Customer.CustomerID, t_Customer.Area, t_Customer.Type, [Revenue]*[Rate] AS CalculatedField
FROM t_Customer INNER JOIN t_Rate ON t_Customer.Type = t_Rate.Type
GROUP BY t_Customer.CustomerID, t_Customer.Area, t_Customer.Type, [Revenue]*[Rate];
Returns:
CustomerID Area Type CalculatedField
1 Europe Institutional Clients 20
2 Asia Institutional Clients 20
3 USA Institutional Clients 20
Query version1B: I remove the fields CustomerID and Area.
SELECT t_Customer.Type, ([Revenue]*[Rate]) AS CalculatedField
FROM t_Customer INNER JOIN t_Rate ON t_Customer.Type = t_Rate.Type
GROUP BY t_Customer.Type, ([Revenue]*[Rate]);
Returns:
Type CalculatedField
Institutional Clients 20
Query version2:
Here I add SUM of the Calculated field.
This query returns 180 (which is wrong).
SELECT t_Customer.Type, Sum(([Revenue]*[Rate])) AS CalculatedField
FROM t_Customer INNER JOIN t_Rate ON t_Customer.Type = t_Rate.Type
GROUP BY t_Customer.Type;
Returns:
Type CalculatedField
Institutional Clients 180
Is there a way to use the MS Access query designer to display the correct Sum of the calculated field, so I can have only 1 query for this purpose?
I know I could just make a new query on top of Query version1 that makes the correct sum. But I would like to avoid having 2 queries for this purpose.
SELECT t_Customer.CustomerID,
t_Customer.Area,
t_Customer.Type,
[Revenue] * [Rate] AS CalculatedField
FROM t_Customer
INNER JOIN (SELECT DISTINCT Type, Rate
FROM t_rate) t_rate ON t_Customer.Type = t_Rate.Type
If you want it all one row then:
SELECT t_Customer.Type,
SUM([Revenue] * [Rate]) AS CalculatedField
FROM t_Customer
INNER JOIN (SELECT DISTINCT Type, Rate
FROM t_rate) t_rate ON t_Customer.Type = t_Rate.Type
GROUP BY t_Customer.Type
Returns:
Type CalculatedField
Institutional Clients 60
Note that this change cannot be made in the Access Query Designer (Design View); you have to switch to SQL View.
Also note that you can type the SELECT DISTINCT part within parentheses, like this
(SELECT DISTINCT Type,Rate FROM t_rate)
but Access will convert it to
[SELECT DISTINCT Type,Rate FROM t_rate].
when you save and edit the query again.
It produces the same result, so it works just fine.
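The reason for the 180 is that every customer row matches all three rate rows of the same Type, so the join produces 9 rows of 20 each. A small sketch with the question's data shows both totals; it runs in SQLite from Python rather than Access, and the column name Country in t_Rate is an assumption based on the table header.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_Customer (CustomerID INTEGER, Area TEXT, Type TEXT, Revenue INTEGER);
CREATE TABLE t_Rate     (Country TEXT, Type TEXT, Rate INTEGER);
INSERT INTO t_Customer VALUES (1,'Europe','Institutional Clients',10),
                              (2,'Asia','Institutional Clients',10),
                              (3,'USA','Institutional Clients',10);
INSERT INTO t_Rate VALUES ('DK','Institutional Clients',2),
                          ('SE','Institutional Clients',2),
                          ('FI','Institutional Clients',2);
""")

# Plain join: 3 customers x 3 matching rate rows = 9 rows to sum.
naive_total = conn.execute("""
SELECT SUM(c.Revenue * r.Rate)
FROM t_Customer c
JOIN t_Rate r ON c.Type = r.Type
""").fetchone()[0]

# DISTINCT collapses the rate rows to one per Type before joining.
deduped_total = conn.execute("""
SELECT SUM(c.Revenue * r.Rate)
FROM t_Customer c
JOIN (SELECT DISTINCT Type, Rate FROM t_Rate) r ON c.Type = r.Type
""").fetchone()[0]

print(naive_total, deduped_total)
```

Keep in mind that the DISTINCT only helps while all countries share one rate per Type; if the rates differ by country, you would need a real join key between customer and country instead.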

In SQL how do I write a query to return 1 record from a 1 to many relationship?

Let's say I have a Person table and a Purchases table with a 1 to many relationship. I want to run a single query that returns this person and just their latest purchase. This seems easy but I just can't seem to get it.
select p.*, pp.*
from Person p
left outer join (
select PersonID, max(PurchaseDate) as MaxPurchaseDate
from Purchase
group by PersonID
) ppm on p.PersonID = ppm.PersonID
left outer join Purchase pp on ppm.PersonID = pp.PersonID
and ppm.MaxPurchaseDate = pp.PurchaseDate
where p.PersonID = 42
This query will also show the latest purchase for all users if you remove the WHERE clause.
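A quick way to try this "latest date per person, then join back" pattern is an in-memory SQLite database from Python. The column names beyond PersonID and PurchaseDate, and all the rows, are made up for illustration; dates are stored as ISO text so that MAX picks the most recent one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person   (PersonID INTEGER, Name TEXT);
CREATE TABLE Purchase (PurchaseID INTEGER, PersonID INTEGER, PurchaseDate TEXT, Item TEXT);
INSERT INTO Person VALUES (42,'Ann'),(43,'Bob');
-- Bob has no purchases at all
INSERT INTO Purchase VALUES (1,42,'2010-01-05','book'),
                            (2,42,'2010-03-17','lamp');
""")

rows = conn.execute("""
SELECT p.PersonID, p.Name, pp.PurchaseDate, pp.Item
FROM Person p
LEFT JOIN (
    SELECT PersonID, MAX(PurchaseDate) AS MaxPurchaseDate
    FROM Purchase
    GROUP BY PersonID
) ppm ON ppm.PersonID = p.PersonID
LEFT JOIN Purchase pp ON pp.PersonID = ppm.PersonID
                     AND pp.PurchaseDate = ppm.MaxPurchaseDate
ORDER BY p.PersonID
""").fetchall()

for row in rows:
    print(row)
```

Because both joins are outer joins, a person without purchases still appears, with NULLs in the purchase columns. If a person has two purchases on the same latest date, both rows are returned.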
Assuming you have something like a PurchaseDate column and want a particular person (SQL Server):
SELECT TOP 1 P.Name, P.PersonID, C.PurchaseDescription FROM Persons AS P
INNER JOIN Purchases AS C ON C.PersonID = P.PersonID
WHERE P.PersonID = @PersonID
ORDER BY C.PurchaseDate DESC
Many databases perform the "LIMIT or TOP" command in different ways. Here is a reference http://troels.arvin.dk/db/rdbms/#select-limit and below are a few samples
If using SQL Server
SELECT TOP 1
*
FROM Person p
INNER JOIN Purchases pc on pc.PersonID = P.PersonID
Order BY pc.PurchaseDate DESC
Should work on MySQL
SELECT
*
FROM Person p
INNER JOIN Purchases pc on pc.PersonID = P.PersonID
Order BY pc.PurchaseDate DESC
LIMIT 1
Strictly off the top of my head!...If it's only one record then...
SELECT TOP 1 *
FROM Person p
INNER JOIN Purchases pu
ON p.ID = pu.PersonId
WHERE p.ID = *thePersonYouWant*
ORDER BY pu.OrderDate DESC
otherwise...
SELECT *
FROM Person p
INNER JOIN Purchases pu
ON pu.ID =
(
SELECT TOP 1 sq.ID
FROM Purchases sq
WHERE sq.PersonID = p.ID
ORDER BY sq.OrderDate DESC
)
I think! I haven't got access to a SQL box right now to test it on.
Without knowing your structure at all, or your dbms, you would order the results descending by the purchase date/time, and return only the first joined record.
Try TOP 1 With an order by desc on date. Ex:
CREATE TABLE #One
(
id int
)
CREATE TABLE #Many
(
id int,
[date] date,
value int
)
INSERT INTO #One (id)
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3
INSERT INTO #Many (id, [date], value)
SELECT 1, GETDATE(), 1 UNION ALL
SELECT 1, DATEADD(DD, 1 ,GETDATE()), 3 UNION ALL
SELECT 1, DATEADD(DD, -1 ,GETDATE()), 0
SELECT TOP 1 *
FROM #One O
JOIN #Many M ON O.id = M.id
ORDER BY [date] DESC
If you want to select the latest purchase for each person, that would be:
SELECT PE.ID, PE.Name, MAX(PU.PurchaseDate) FROM Persons AS PE JOIN Purchase AS PU ON PE.ID = PU.Person_ID GROUP BY PE.ID, PE.Name
If you also want the persons who have no purchases, you need to use LEFT JOIN.
I think you need one more table called Items for example.
The PERSONS table would uniquely define each person and all their attributes, while the ITEMS table would uniquely define each items and their attributes.
Assume the following:
Persons          |Purchases                  |Items
PerID PerName    |PurID PurDt  PerID ItemID  |ItemID ItemDesc ICost
101   Joe Smith  |201   101107 101   301     |301    Laptop   500
                 |202   101107 101   302     |302    Desktop  699
102   Jane Doe   |203   101108 102   303     |303    iPod     199
103   Jason Tut  |204   101109 101   304     |304    iPad     499
                 |205   101109 101   305     |305    Printer  99
One Person Parent may tie to none, one or many Purchase Child.
One Item Parent may tie to none, one or many Purchase Child.
One or more Purchases Children will tie to one Person Parent, and one Item Parent.
select per.PerName as Name
, pur.PurDt as Date
, itm.ItemDesc as Item
, itm.ICost as Cost
from Persons per
, Purchases pur
, Items itm
where pur.PerID = per.PerID -- For that Person
and pur.ItemID = itm.ItemID -- and that Item
and pur.PurDt = -- and the purchase date is
( Select max(lst.PurDt) -- the last date
from Purchases lst -- purchases
where lst.PerID = per.PerID ) -- for that person
This should return:
Name       Date    Item     Cost
Joe Smith  101109  iPad     499
Joe Smith  101109  Printer  99
Jane Doe   101108  iPod     199
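The correlated subquery can be checked against the sample tables above with a short script. This is only a sketch: it runs in SQLite from Python, uses explicit JOIN syntax instead of the comma joins, stores PurDt as a plain integer exactly as shown in the tables, and adds an ORDER BY so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons   (PerID INTEGER, PerName TEXT);
CREATE TABLE Purchases (PurID INTEGER, PurDt INTEGER, PerID INTEGER, ItemID INTEGER);
CREATE TABLE Items     (ItemID INTEGER, ItemDesc TEXT, ICost INTEGER);
INSERT INTO Persons VALUES (101,'Joe Smith'),(102,'Jane Doe'),(103,'Jason Tut');
INSERT INTO Purchases VALUES (201,101107,101,301),(202,101107,101,302),
                             (203,101108,102,303),(204,101109,101,304),
                             (205,101109,101,305);
INSERT INTO Items VALUES (301,'Laptop',500),(302,'Desktop',699),(303,'iPod',199),
                         (304,'iPad',499),(305,'Printer',99);
""")

# Keep only the purchases made on each person's most recent purchase date.
latest = conn.execute("""
SELECT per.PerName, pur.PurDt, itm.ItemDesc, itm.ICost
FROM Persons per
JOIN Purchases pur ON pur.PerID = per.PerID
JOIN Items itm     ON itm.ItemID = pur.ItemID
WHERE pur.PurDt = (SELECT MAX(lst.PurDt)
                   FROM Purchases lst
                   WHERE lst.PerID = per.PerID)
ORDER BY per.PerID, itm.ItemID
""").fetchall()

for row in latest:
    print(row)
```

Note that Joe Smith gets two rows because he bought two items on his last purchase date, and Jason Tut does not appear because he has no purchases. If you need exactly one row per person, you would have to break the tie on something else, such as the highest PurID.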