FULL OUTER JOIN Not Working As Expected ON Two Equalities - sql

I want to combine the two tables below in BigQuery using a full outer join. Table A does not have certain products that I need to bring over from table B, but when I join on campaign and subcampaign, the join does not bring over the 'Cellphone' data. My results look more like those of a left join. My query is below:
SELECT
a.campaign
, a.subcampaign
, a.product
, sum(sales)
, sum(cost)
FROM
(
SELECT
campaign
, subcampaign
, product
, sum(sales)
FROM
table_a
GROUP BY
1, 2, 3
) a
FULL OUTER JOIN
(
SELECT
campaign
, subcampaign
, product
, sum(cost)
FROM
table_b
GROUP BY 1,2,3
) b
ON
a.campaign = b.campaign
AND a.subcampaign = b.subcampaign
GROUP BY
1,2,3
Table a
Campaign    Subcampaign  Product  Sales
Campaign 1  Store 581    Gaming   $50
Campaign 1  Store 583    TV       $100
Table b
Campaign    Subcampaign  Product    Cost
Campaign 1  Store 581    Gaming     $25
Campaign 1  Store 583    TV         $75
Campaign 1  Store 584    Cellphone  $10
Desired result:
Campaign    Subcampaign  Product    Sales  Cost
Campaign 1  Store 581    Gaming     $50    $25
Campaign 1  Store 583    TV         $100   $75
Campaign 1  Store 584    Cellphone  NULL   $10

I think the problem is likely your select clause, not the join.
I suspect the confusion is that you are selecting a.campaign (etc.) even in cases where the join does not match anything in table_a. If there is no match in table_a, a.campaign, a.subcampaign and a.product will all be null.
You probably want something more like the following in your outer query:
SELECT
COALESCE(a.campaign, b.campaign)
, COALESCE(a.subcampaign, b.subcampaign)
, COALESCE(a.product, b.product)
, sum(sales)
, sum(cost)
[...]
GROUP BY
COALESCE(a.campaign, b.campaign)
, COALESCE(a.subcampaign, b.subcampaign)
, COALESCE(a.product, b.product)
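This way, if a.campaign (etc.) is null, the expression falls back on b.campaign. This is safe for campaign and subcampaign: the join condition guarantees that if both sides have values, they are equal. product is not part of the join condition, so this assumes each campaign/subcampaign pair carries a single product, as in the sample data.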
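The effect of COALESCE can be checked on the sample data with a small sketch. It uses Python's standard-library sqlite3 module rather than BigQuery, and because older SQLite builds lack FULL OUTER JOIN, the join is emulated as a LEFT JOIN plus the unmatched rows of table_b; the a_/b_ column prefixes are names made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (campaign TEXT, subcampaign TEXT, product TEXT, sales INTEGER);
CREATE TABLE table_b (campaign TEXT, subcampaign TEXT, product TEXT, cost INTEGER);
INSERT INTO table_a VALUES ('Campaign 1', 'Store 581', 'Gaming', 50),
                           ('Campaign 1', 'Store 583', 'TV', 100);
INSERT INTO table_b VALUES ('Campaign 1', 'Store 581', 'Gaming', 25),
                           ('Campaign 1', 'Store 583', 'TV', 75),
                           ('Campaign 1', 'Store 584', 'Cellphone', 10);
""")

# Emulated FULL OUTER JOIN (assumption: stands in for BigQuery's native one):
# every table_a row with its match, plus the table_b rows that matched nothing.
FULL_JOIN = """
SELECT a.campaign AS a_campaign, a.subcampaign AS a_subcampaign,
       a.product AS a_product, a.sales,
       b.campaign AS b_campaign, b.subcampaign AS b_subcampaign,
       b.product AS b_product, b.cost
FROM table_a a LEFT JOIN table_b b
  ON a.campaign = b.campaign AND a.subcampaign = b.subcampaign
UNION ALL
SELECT a.campaign, a.subcampaign, a.product, a.sales,
       b.campaign, b.subcampaign, b.product, b.cost
FROM table_b b LEFT JOIN table_a a
  ON a.campaign = b.campaign AND a.subcampaign = b.subcampaign
WHERE a.campaign IS NULL
"""

# Selecting only the a-side keys: the unmatched row has NULL keys.
a_only = conn.execute(f"""
SELECT a_campaign, a_subcampaign, a_product, SUM(sales), SUM(cost)
FROM ({FULL_JOIN}) GROUP BY 1, 2, 3 ORDER BY 2
""").fetchall()

# Falling back on the b-side keys keeps the Cellphone row identifiable.
coalesced = conn.execute(f"""
SELECT COALESCE(a_campaign, b_campaign),
       COALESCE(a_subcampaign, b_subcampaign),
       COALESCE(a_product, b_product),
       SUM(sales), SUM(cost)
FROM ({FULL_JOIN}) GROUP BY 1, 2, 3 ORDER BY 2
""").fetchall()

for row in a_only:
    print("a keys only:", row)
for row in coalesced:
    print("coalesced:  ", row)
```

With the a-side keys alone, the Store 584 row is still returned but shows up with NULL in all three key columns, which is easy to mistake for a missing row.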

You are aggregating before joining the two tables, which is why you might not be getting any null values in the result. Try selecting all values from the join and then aggregating to get the result you want like below:
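SELECT
ab.campaign, ab.subcampaign, ab.product, SUM(ab.sales), SUM(ab.cost)
FROM (
-- name the key columns explicitly: SELECT * would return each of them twice
SELECT
COALESCE(a.campaign, b.campaign) AS campaign
, COALESCE(a.subcampaign, b.subcampaign) AS subcampaign
, COALESCE(a.product, b.product) AS product
, a.sales
, b.cost
FROM
table_a a
FULL OUTER JOIN
table_b b
ON
a.campaign = b.campaign
AND a.subcampaign = b.subcampaign ) ab
GROUP BY
1,2,3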

Related

How to fix group by logic in subquery?

I have the following 2 example queries plus their result tables (dummy data) below:
SELECT
subs.Region
,subs.Product
,SUM(p.Price) TotalPriceA
FROM dbo.submission_dtl subs
JOIN dbo.price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
GROUP BY subs.Region, subs.Product
Region  Product  TotalPriceA
USA     cameras  200
USA     phones   300
Canada  cameras  300
Canada  phones   500
SELECT
r.Region
,r.Product
,SUM(rp.Price) TotalPriceB
FROM dbo.report_dtl r
JOIN dbo.report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
GROUP BY r.Region, rp.Product
Region  Product  TotalPriceB
USA     cameras  201
USA     phones   301
Canada  cameras  301
Canada  phones   501
I want to join them so that the result table resembles this:
Region  Product  TotalPriceA  TotalPriceB
USA     cameras  200          201
USA     phones   300          301
Canada  cameras  300          301
Canada  phones   500          501
But when I used this query, I got a result table that resembled this:
SELECT
subs.Region
,subs.Product
,SUM(p.Price) TotalPriceA
,rptotal.TotalPriceB
FROM dbo.submission_dtl subs
JOIN dbo.price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
JOIN
(
SELECT
r.Product
,SUM(rp.Price) TotalPriceB
FROM dbo.report_dtl r
JOIN dbo.report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
GROUP BY rp.Product
) rptotal on subs.Product = rptotal.Product
GROUP BY subs.Region, subs.Product, rptotal.TotalPriceB
Region  Product  TotalPriceA  TotalPriceB
USA     cameras  200          502
USA     phones   300          802
Canada  cameras  300          502
Canada  phones   500          802
When I group the subquery by region as well, I get even worse results...
You can try aggregating each side in its own subquery before joining:
SELECT t1.Region,
t1.Product,
t2.TotalPriceA,
t1.TotalPriceB
FROM (
SELECT
r.Region
,r.Product
,SUM(rp.Price) TotalPriceB
FROM dbo.report_dtl r
JOIN dbo.report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
GROUP BY r.Region, r.Product
) t1 INNER JOIN (
SELECT
subs.Region
,subs.Product
,SUM(p.Price) TotalPriceA
FROM dbo.submission_dtl subs
JOIN dbo.price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
GROUP BY subs.Region, subs.Product
) t2 ON t1.Region = t2.Region AND t1.Product = t2.Product
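To see why aggregating both sides first gives the expected totals, here is a minimal sketch using Python's standard-library sqlite3 module instead of SQL Server. The rows are invented to reproduce the totals in the question, the dbo. schema prefix is dropped, and the subquery groups by r.Product to match its SELECT list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submission_dtl (SubmissionNumber INTEGER, Region TEXT, Product TEXT);
CREATE TABLE price_dtl      (SubmissionNumber INTEGER, Price INTEGER);
CREATE TABLE report_dtl     (SubmissionNumber INTEGER, Region TEXT, Product TEXT);
CREATE TABLE report_price   (SubmissionNumber INTEGER, Price INTEGER);
-- Hypothetical rows chosen so the totals match the question's tables.
INSERT INTO submission_dtl VALUES (1, 'USA', 'cameras'), (2, 'USA', 'phones'),
                                  (3, 'Canada', 'cameras'), (4, 'Canada', 'phones');
INSERT INTO report_dtl SELECT * FROM submission_dtl;
INSERT INTO price_dtl    VALUES (1, 150), (1, 50), (2, 300), (3, 300), (4, 500);
INSERT INTO report_price VALUES (1, 201), (2, 301), (3, 301), (4, 501);
""")

# Each side is reduced to one row per (Region, Product) before the join,
# so the join is one-to-one and no total is counted twice.
rows = conn.execute("""
SELECT t1.Region, t1.Product, t2.TotalPriceA, t1.TotalPriceB
FROM (
    SELECT r.Region, r.Product, SUM(rp.Price) AS TotalPriceB
    FROM report_dtl r
    JOIN report_price rp ON r.SubmissionNumber = rp.SubmissionNumber
    GROUP BY r.Region, r.Product
) t1
JOIN (
    SELECT subs.Region, subs.Product, SUM(p.Price) AS TotalPriceA
    FROM submission_dtl subs
    JOIN price_dtl p ON subs.SubmissionNumber = p.SubmissionNumber
    GROUP BY subs.Region, subs.Product
) t2 ON t1.Region = t2.Region AND t1.Product = t2.Product
ORDER BY t1.Region, t1.Product
""").fetchall()

for row in rows:
    print(row)
```

The 502 and 802 in the question come from the subquery being grouped by product only, which adds the USA and Canada totals together (201 + 301 and 301 + 501).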
Perhaps a GROUP BY is not what is required here, at least not for the final result. Have you considered using the PIVOT clause instead? As DRapp stated, you might need a union to combine the two queries. The GROUP BY is only required to summarise the totals beforehand; the pivot takes care of the rest.
In this example, I'm using a table variable to consolidate all the information and then pivoting it. Look closely and you'll see that one of the columns (type) holds a constant for each query. Also, in my experience table variables work better with nullable columns, regardless of the actual data source.
Declare @myData Table (
region varchar(max) null,
product varchar(max) null,
type varchar(max) null,
totalPrice money null
)
--The type is the constant to know whether it's A or B
Insert Into @myData(region, product, type, totalPrice)
Select Region, Product, 'TotalPriceA', Sum(Price)
From <your tables here>
Group By region, product
--Repeat for total B.
Insert Into @myData(region, product, type, totalPrice)
Select Region, Product, 'TotalPriceB', Sum(Price)
From <your tables here>
Group By region, product
--Now the @myData table variable has all the information.
--You only need the output format
Select region, product, [TotalPriceA], [TotalPriceB]
From @myData
Pivot (
Sum(totalPrice)
For type In ([TotalPriceA], [TotalPriceB])
) As Result
Hope this helps. As you can see, the constant values in column type become the column titles in the final result. You will get null values if one "cell" in the final table doesn't have a corresponding value for that row/column match.

Access subquery for Distinct Count and Count on same table

Hello I have browsed the forum for a while and am asking my first question here. I'm in a bit of a bind and was wondering if I could get some help out. I am using Access and have not found a good answer to the question on the Net yet.
I have a table called tblTransactions for transactions on Access 2013. It looks like this:
Transaction_ID  Customer_No  Prod_ID  Lıcence_ID
1               111          1        1
2               111          1        2
3               222          1        2
4               111          2        1
5               222          2        1
6               222          2        2
7               333          1        1
tblProd looks like:
Prod_ID  Prod_Name  Prod_Price
1        Prod 1     30
2        Prod 2     50
tblLicence looks like:
Lıcence_ID  Lıcence_Name  Lıcence_Price
1           Lıcence 1     80
2           Lıcence 2     100
The customer purchases a product once and may obtain multiple licenses for it. The product is paid for once, but every license owned is paid for.
I want to create a summary list of the transactions, but I cannot get the number of distinct products and the total number of licenses to appear together next to each customer number.
The output I want should look like this:
Customer_No  Count_Uniq_Prods  Count_Licences  Sum_Prods_Price  Sum_Licences_Price
111          2                 3               80               260
222          2                 3               80               280
333          1                 1               30               80
I tried different methods for the first 3 columns.
When I try with subquery, the Customer number and product count are correct, but it also removes duplicates from licenses.
SELECT C.Customer_No, T2.Count_Uniq_Prods, T2.Count_Licences
FROM
(SELECT T1.Customer_No, T1.Count_Uniq_Prods, Count(Lıcence_ID) As Count_Licences
FROM
(SELECT DISTINCT Customer_No, Lıcence_ID, Count(Prod_ID) As Count_Uniq_Prods
FROM tblTransactions GROUP BY Customer_No, Lıcence_ID ) AS T1
GROUP BY T1.Customer_No, T1.Count_Uniq_Prods) AS T2
INNER JOIN tblTransactions AS C
ON T2.Customer_No = C.Customer_No
GROUP BY C.Customer_No, T2.Count_Uniq_Prods, T2.Count_Licences;
When I try the left join operation, I can successfully get results for the product and license separately, but when I want to get it in a single table, the results are not what I want.
This works for Customer_No, Count_Uniq_Prods, Sum_Prods_Price:
SELECT T.Customer_No,
Count(T.Prod_ID), SUM(tblProd.Prod_Price) AS Sum_Prods_Price
FROM ((SELECT DISTINCT Customer_No, Prod_ID FROM tblTransactions ) AS T
LEFT JOIN tblProd ON tblProd.Prod_ID= T.Prod_ID)
GROUP BY T.Customer_No;
This works for Customer_No, Count_Licences, Sum_Licences_Price:
SELECT T.Customer_No,
Count(T.Lıcence_ID), SUM(tblLicence.[Lıcence_Price]) AS Sum_Licences_Price
FROM ((SELECT Customer_No, Lıcence_ID FROM tblTransactions ) AS T
LEFT JOIN tblLicence ON tblLicence.Lıcence_ID = T.Lıcence_ID)
GROUP BY T.Customer_No
But when I use one as a subquery inside the other, I cannot get the desired values for both.
I hope I was able to explain clearly. Thanks in advance for any help.
This should work for you.
My approach was to take your problem and break it down into its constituent elements.
This query retrieves the products in the format that you requested.
select customer_no, count(t.prod_id) as Count_Uniq_Prods,
sum(p.prod_price) as Sum_Prods_Price
from (
select customer_no, prod_id
from tblTransactions t
group by customer_no, prod_id
) t
inner join tblProd p on
t.prod_id = p.prod_id
group by customer_no
This query retrieves the licenses in the format that you requested.
select customer_no, count(t.license_id) as Count_Licenses,
sum(l.license_price) as Sum_Licenses_Price
from tblTransactions t
inner join tblLicense l on
t.license_id = l.license_id
group by customer_no
Finally, we put them together and get the following:
select distinct p.customer_no, Count_Uniq_Prods,
Count_Licenses,
Sum_Prods_Price,
Sum_Licenses_Price
from (
select customer_no, count(t.prod_id) as Count_Uniq_Prods,
sum(p.prod_price) as Sum_Prods_Price
from (
select customer_no, prod_id
from tblTransactions t
group by customer_no, prod_id
) t
inner join tblProd p on
t.prod_id = p.prod_id
group by customer_no
) p
inner join (
select customer_no, count(t.license_id) as Count_Licenses,
sum(l.license_price) as Sum_Licenses_Price
from tblTransactions t
inner join tblLicense l on
t.license_id = l.license_id
group by customer_no
) l
on p.customer_no = l.customer_no
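As a check against the sample data, the combined query can be run in SQLite through Python's standard-library sqlite3 module. This is only a sketch: it is not Access SQL, and it uses the ASCII spellings from the answer (license_id, tblLicense), which are assumptions that differ from the column names in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblTransactions (transaction_id INTEGER, customer_no INTEGER,
                              prod_id INTEGER, license_id INTEGER);
CREATE TABLE tblProd    (prod_id INTEGER, prod_name TEXT, prod_price INTEGER);
CREATE TABLE tblLicense (license_id INTEGER, license_name TEXT, license_price INTEGER);
INSERT INTO tblTransactions VALUES
    (1, 111, 1, 1), (2, 111, 1, 2), (3, 222, 1, 2), (4, 111, 2, 1),
    (5, 222, 2, 1), (6, 222, 2, 2), (7, 333, 1, 1);
INSERT INTO tblProd    VALUES (1, 'Prod 1', 30), (2, 'Prod 2', 50);
INSERT INTO tblLicense VALUES (1, 'Licence 1', 80), (2, 'Licence 2', 100);
""")

summary = conn.execute("""
SELECT p.customer_no, p.Count_Uniq_Prods, l.Count_Licenses,
       p.Sum_Prods_Price, l.Sum_Licenses_Price
FROM (
    -- Products: deduplicate (customer, product) pairs before counting and pricing.
    SELECT t.customer_no, COUNT(t.prod_id) AS Count_Uniq_Prods,
           SUM(pr.prod_price) AS Sum_Prods_Price
    FROM (SELECT DISTINCT customer_no, prod_id FROM tblTransactions) t
    JOIN tblProd pr ON t.prod_id = pr.prod_id
    GROUP BY t.customer_no
) p
JOIN (
    -- Licenses: every transaction counts, so no deduplication here.
    SELECT t.customer_no, COUNT(t.license_id) AS Count_Licenses,
           SUM(li.license_price) AS Sum_Licenses_Price
    FROM tblTransactions t
    JOIN tblLicense li ON t.license_id = li.license_id
    GROUP BY t.customer_no
) l ON p.customer_no = l.customer_no
ORDER BY p.customer_no
""").fetchall()

for row in summary:
    print(row)
```

Because each subquery already returns one row per customer, the final join is one-to-one and the outer DISTINCT becomes unnecessary.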

Most popular pairs of shops for workers from each company

I've got 2 tables, one with sales and one with companies:
Sales Table
Transaction_Id Shop_id Sale_date Client_ID
92356 24234 11.09.2018 12356
92345 32121 11.09.2018 32121
94323 24321 11.09.2018 21231
94278 45321 11.09.2018 42123
Company table
Client_ID Company_name
12345 ABC
13322 ABC
32321 BCD
22221 BCD
What I want to achieve is a distinct count of clients from each company for each pair of shops (clients who had at least one transaction in both shops):
Shop_Id_1 Shop_id_2 Company_name Count(distinct Client_id)
12356 12345 ABC 31
12345 14278 ABC 23
14323 12345 BCD 32
14278 12345 BCD 43
I think that I have to use a self join, but my queries, even with a filter for one week, are killing the DB. Any thoughts on that? I'm using Microsoft SQL Server 2012.
Thanks
I think this is a self-join and aggregation, with a twist. The twist is that you want to include the company in each sales record, so it can be used in the self-join:
with sc as (
select s.*, c.company_name
from sales s join
companies c
on s.client_id = c.client_id
)
select sc1.shop_id, sc2.shop_id, sc1.company_name, count(distinct sc1.client_id)
from sc sc1 join
sc sc2
on sc1.client_id = sc2.client_id and
sc1.company_name = sc2.company_name
group by sc1.shop_id, sc2.shop_id, sc1.company_name;
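The self-join can be tried on a handful of rows with Python's standard-library sqlite3 module. The data below is invented, and the sketch adds one condition that is not in the query above, sc1.shop_id < sc2.shop_id, on the assumption that a shop should not be paired with itself and that (A, B) and (B, A) count as the same pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales     (transaction_id INTEGER, shop_id INTEGER, client_id INTEGER);
CREATE TABLE companies (client_id INTEGER, company_name TEXT);
-- Hypothetical rows: client 1 buys twice in shop 10, client 2 visits three shops.
INSERT INTO sales VALUES (1, 10, 1), (2, 20, 1), (3, 10, 2), (4, 20, 2),
                         (5, 30, 2), (6, 10, 3), (7, 30, 3), (8, 10, 1);
INSERT INTO companies VALUES (1, 'ABC'), (2, 'ABC'), (3, 'BCD');
""")

pairs = conn.execute("""
WITH sc AS (
    SELECT s.shop_id, s.client_id, c.company_name
    FROM sales s
    JOIN companies c ON s.client_id = c.client_id
)
SELECT sc1.shop_id, sc2.shop_id, sc1.company_name,
       COUNT(DISTINCT sc1.client_id)
FROM sc sc1
JOIN sc sc2
  ON sc1.client_id = sc2.client_id
 AND sc1.company_name = sc2.company_name
 AND sc1.shop_id < sc2.shop_id      -- assumption: unordered pairs, no self-pairs
GROUP BY sc1.shop_id, sc2.shop_id, sc1.company_name
ORDER BY 1, 2, 3
""").fetchall()

for row in pairs:
    print(row)
```

The extra inequality also cuts the joined row count by more than half, which may help with the performance problem, although that would need to be measured on the real data.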
I think there are some issues with your question. I interpreted it to mean that the company table contains the shop IDs, not the client IDs.
First you can create a solution to get the shops as rows for each company. Here I chose a maximum of 5 shops per company. Don't forget to terminate the previous statement with a semicolon before the CTEs.
WITH CTE_Comp AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY CompanyName ORDER BY ShopID) AS RowNumb
FROM Company AS C
)
SELECT C1.ShopID,
C2.ShopID AS ShopID_2,
C3.ShopID AS ShopID_3,
C4.ShopID AS ShopID_4,
C5.ShopID AS ShopID_5,
C1.CompanyName
INTO ShopsByCompany
FROM CTE_Comp AS C1
LEFT JOIN CTE_Comp AS C2 ON C1.CompanyName = C2.CompanyName AND C2.RowNumb = 2
LEFT JOIN CTE_Comp AS C3 ON C1.CompanyName = C3.CompanyName AND C3.RowNumb = 3
LEFT JOIN CTE_Comp AS C4 ON C1.CompanyName = C4.CompanyName AND C4.RowNumb = 4
LEFT JOIN CTE_Comp AS C5 ON C1.CompanyName = C5.CompanyName AND C5.RowNumb = 5
WHERE C1.RowNumb = 1
After that, in a few steps, I think you could get the desired result:
WITH ClientsPerShop AS
(
SELECT ShopID,
COUNT (DISTINCT ClientID) AS TotalClients
FROM Sales
GROUP BY ShopID
)
, ClienstsPerCompany AS
(
SELECT CompanyName,
SUM (TotalClients) AS ClientsPerComp
FROM Company AS C
INNER JOIN ClientsPerShop AS CPS ON C.ShopID = CPS.ShopID
GROUP BY CompanyName
)
SELECT *
FROM ClienstsPerCompany AS CPA
INNER JOIN ShopsByCompany AS SBC ON SBC.CompanyName = CPA.CompanyName
Hopefully this will bring you closer to your solution, best of luck!

How to ensure outer join with filter still returns all desired rows?

Imagine I have two tables in a DB like so:
products:
product_id name
----------------
1 Hat
2 Gloves
3 Shoes
sales:
product_id store_id sales
----------------------------
1 1 20
2 2 10
Now I want to do a query to list ALL products, and their sales, for store_id = 1. My first crack at it would be to use a left join, and filter to the store_id I want, or a null store_id, in case the product didn't get any sales at store_id = 1, since I want all the products listed:
SELECT name, coalesce(sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
WHERE store_id = 1 or store_id is null;
Of course, this doesn't work as intended, instead I get:
name sales
---------------
Hat 20
Shoes 0
No Gloves! This is because Gloves did get sales, just not at store_id = 1, so the WHERE clause has filtered them out.
How then can I get a list of ALL products and their sales for a specific store?
Here are some queries to create the test tables:
create temp table test_products as
select 1 as product_id, 'Hat' as name;
insert into test_products values (2, 'Gloves');
insert into test_products values (3, 'Shoes');
create temp table test_sales as
select 1 as product_id, 1 as store_id, 20 as sales;
insert into test_sales values (2, 2, 10);
UPDATE: I should note that I am aware of this solution:
SELECT name, case when store_id = 1 then sales else 0 end as sales
FROM test_products p
LEFT JOIN test_sales s ON p.product_id = s.product_id;
however, it is not ideal... in reality I need to create this query for a BI tool in such a way that the tool can simply add a where clause to the query and get the desired results. Inserting the required store_id into the correct place in this query is not supported by this tool. So I'm looking for other options, if there are any.
Move the WHERE condition into the LEFT JOIN clause so that no rows go missing.
SELECT p.name, coalesce(s.sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
AND s.store_id = 1;
Edit for additional request:
I assume you can manipulate the SELECT items? Then this should do the job:
SELECT p.name
,CASE WHEN s.store_id = 1 THEN coalesce(s.sales, 0) ELSE NULL END AS sales
FROM products p
LEFT JOIN sales s USING (product_id)
Also simplified the join syntax in this case.
I'm not near SQL, but give this a shot:
SELECT name, coalesce(sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id AND store_id = 1
You don't want a where on the whole query, just on your join
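The difference between filtering in WHERE and filtering in the join condition can be reproduced with the tables from the question. This is a small sketch using Python's standard-library sqlite3 module; the variable names where_filter and on_filter are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER, name TEXT);
CREATE TABLE sales    (product_id INTEGER, store_id INTEGER, sales INTEGER);
INSERT INTO products VALUES (1, 'Hat'), (2, 'Gloves'), (3, 'Shoes');
INSERT INTO sales    VALUES (1, 1, 20), (2, 2, 10);
""")

# Filter applied after the join: the Gloves row (store_id = 2) is discarded.
where_filter = conn.execute("""
SELECT p.name, COALESCE(s.sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
WHERE s.store_id = 1 OR s.store_id IS NULL
ORDER BY p.product_id
""").fetchall()

# Filter applied as part of the join: sales rows from other stores simply
# fail to match, and the product row survives with NULL sales.
on_filter = conn.execute("""
SELECT p.name, COALESCE(s.sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id AND s.store_id = 1
ORDER BY p.product_id
""").fetchall()

print("WHERE filter:", where_filter)
print("ON filter:   ", on_filter)
```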

SQL Syntax Issue with getting sum

Ok I have two tables.
Table IDAssoc has the columns bill_id, year, area_id.
Table Bill has the columns bill_id, year, main_id, and amount_due.
I'm trying to get the sum of the amount_due column from the bill table for each of the associated area_ids in the IDAssoc table.
I'm doing a select statement to select the sum and joining on the bill_ids. How can I set this up so it will have a single row for each of the associated bills in each area_id from the assoc table. There may be three or four bill_ids associated with each area_id and I need those summed for each and returned so I can use this select in another statement. I have a group by set up for the area_id but it still is returning each row and not summing them up for each area_id. I have the year and main_id specified already in the where clause to return the data that I want, but I can't get the sum to work properly. Sorry I'm still learning and I'm not sure how to do this. Thanks!
Edit: The query I'm trying so far is basically just like the one posted below:
select a.area_id, sum(b.amount_due)
from IDAssoc a
inner join Bill b
on a.bill_id = b.bill_id
where Bill.year = 2006 and bill.bill_id = 11111
These are just arbitrary numbers.
The data this is returning is like this:
amount_due - area_id
.05 1003
.15 1003
.11 1003
65 1004
55 1004
I need one row returned for each area_id with the amount_due summed. The area_id is only in the assoc table and not in the bill table.
select a.area_id, sum(b.amount_due)
from IDAssoc a
inner join Bill b
on a.bill_id = b.bill_id
where b.year = 2006 and b.bill_id = 11111
group by a.area_id
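You might want to change the inner join to a left join if an IDAssoc row can have no matching Bill. The filters on Bill then belong in the ON clause; left in the WHERE clause, they would discard the unmatched rows again:
select a.area_id, coalesce(sum(b.amount_due),0)
from IDAssoc a
left join Bill b
on a.bill_id = b.bill_id
and b.year = 2006 and b.bill_id = 11111
group by a.area_id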
You are missing the GROUP BY clause:
SELECT a.area_id, SUM(b.amount_due) TotalAmount
FROM IDAssoc a
LEFT JOIN Bill b
ON a.bill_id = b.bill_id
GROUP BY a.area_id
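A short sketch of what GROUP BY changes, run through Python's standard-library sqlite3 module. The rows are invented (amounts are whole numbers to keep the sums exact, and area 1005 is a hypothetical area whose bill has no row in Bill), and only the year filter is kept so that more than one bill remains per area.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IDAssoc (bill_id INTEGER, year INTEGER, area_id INTEGER);
CREATE TABLE Bill    (bill_id INTEGER, year INTEGER, main_id INTEGER, amount_due INTEGER);
INSERT INTO IDAssoc VALUES (1, 2006, 1003), (2, 2006, 1003), (3, 2006, 1003),
                           (4, 2006, 1004), (5, 2006, 1004), (6, 2006, 1005);
INSERT INTO Bill VALUES (1, 2006, 7, 5), (2, 2006, 7, 15), (3, 2006, 7, 11),
                        (4, 2006, 7, 65), (5, 2006, 7, 55);
""")

# Inner join + GROUP BY: one summed row per area that has at least one bill.
inner_totals = conn.execute("""
SELECT a.area_id, SUM(b.amount_due)
FROM IDAssoc a
INNER JOIN Bill b ON a.bill_id = b.bill_id
WHERE b.year = 2006
GROUP BY a.area_id
ORDER BY a.area_id
""").fetchall()

# Left join with the filter in ON: areas without a matching bill are kept
# and reported as 0.
left_totals = conn.execute("""
SELECT a.area_id, COALESCE(SUM(b.amount_due), 0)
FROM IDAssoc a
LEFT JOIN Bill b ON a.bill_id = b.bill_id AND b.year = 2006
GROUP BY a.area_id
ORDER BY a.area_id
""").fetchall()

print(inner_totals)
print(left_totals)
```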