How to ensure outer join with filter still returns all desired rows? - sql

Imagine I have two tables in a DB like so:
products:
product_id name
----------------
1 Hat
2 Gloves
3 Shoes
sales:
product_id store_id sales
----------------------------
1 1 20
2 2 10
Now I want to do a query to list ALL products, and their sales, for store_id = 1. My first crack at it would be to use a left join, and filter to the store_id I want, or a null store_id, in case the product didn't get any sales at store_id = 1, since I want all the products listed:
SELECT name, coalesce(sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
WHERE store_id = 1 or store_id is null;
Of course, this doesn't work as intended, instead I get:
name sales
---------------
Hat 20
Shoes 0
No Gloves! This is because Gloves did get sales, just not at store_id = 1, so the WHERE clause has filtered them out.
How then can I get a list of ALL products and their sales for a specific store?
Here are some queries to create the test tables:
create temp table test_products as
select 1 as product_id, 'Hat' as name;
insert into test_products values (2, 'Gloves');
insert into test_products values (3, 'Shoes');
create temp table test_sales as
select 1 as product_id, 1 as store_id, 20 as sales;
insert into test_sales values (2, 2, 10);
UPDATE: I should note that I am aware of this solution:
SELECT name, case when store_id = 1 then sales else 0 end as sales
FROM test_products p
LEFT JOIN test_sales s ON p.product_id = s.product_id;
however, it is not ideal... in reality I need to create this query for a BI tool in such a way that the tool can simply add a where clause to the query and get the desired results. Inserting the required store_id into the correct place in this query is not supported by this tool. So I'm looking for other options, if there are any.

Add the WHERE condition to the LEFT JOIN clause to prevent that rows go missing.
SELECT p.name, coalesce(s.sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
AND s.store_id = 1;
Edit for additional request:
I assume you can manipulate the SELECT items? Then this should do the job:
SELECT p.name
,CASE WHEN s.store_id = 1 THEN coalesce(s.sales, 0) ELSE NULL END AS sales
FROM products p
LEFT JOIN sales s USING (product_id)
Also simplified the join syntax in this case.

I'm not near SQL, but give this a shot:
SELECT name, coalesce(sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id AND store_id = 1
You don't want a where on the whole query, just on your join

Related

How to get all rows from one table which have all relations?

I have 3 tables:
companies (id, name)
union_products (id, name)
products (id, company_id, union_product_id, price_per_one_product)
I need to get all companies which have products with union_product_id in (1,2) and total price of products (per company) is less than 100.
What I am trying to do now:
select * from "companies" where exists
(
select id from "products"
where "companies"."id" = "products"."company_id"
and "union_product_id" in (1, 2)
group by id
having COUNT(distinct union_product_id) = 2 AND SUM(price_per_one_product) < 100
)
The problem I stuck with is that I'm getting 0 rows from the query above, but it works if I'll change COUNT(distinct union_product_id) = 2 to 1.
DB fiddle: https://www.db-fiddle.com/f/iRjfzJe2MTmnwEcDXuJoxn/0
Try to join the three tables as the following:
SELECT C.id, C.name FROM
products P JOIN union_products U
ON P.union_product_id=U.id
JOIN companies C
ON P.company_id=C.id
WHERE P.union_product_id IN (1, 2)
GROUP BY C.id, C.name
HAVING COUNT(DISTINCT P.union_product_id) = 2 AND
SUM(P.price_for_one_product) < 100
ORDER BY C.id
See a demo.
SELECT c.name FROM "companies" c
JOIN "products" p ON c.id = p.company_id
WHERE union_product_id IN (1, 2) AND price_for_one_product < 100
GROUP BY c.name
HAVING COUNT(DISTINCT p.name) =2
This would provide you all the company(s) name(s) which has provides both union_product_id 1 and 2 and the price_for_one_product/ price_per_one_product is less than 100.
Note: You might need to change price_for_one_product with price_per_one_product, as in question you have used price_per_one_product but db-fiddle link table defination has used price_for_one_product.

LEFT OUTER JOIN with 'field IS NULL' in WHERE works as INNER JOIN

Today I've faced some unexplainable (for me) behavior in PostgreSQL — LEFT OUTER JOIN does not return records for main table (with nulls for joined one fields) in case the joined table fields are used in WHERE expression.
To make it easier to grasp the case details, I'll provide an example. So, let's say we have 2 tables: item with some goods, and price, referring item, with prices for the goods in different years:
CREATE TABLE item(
id INTEGER PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE price(
id INTEGER PRIMARY KEY,
item_id INTEGER NOT NULL,
year INTEGER NOT NULL,
value INTEGER NOT NULL,
CONSTRAINT goods_fk FOREIGN KEY (item_id) REFERENCES item(id)
);
The table item has 2 records (TV set and VCR items), and the table price has 3 records, a price for TV set in years 2000 and 2010, and a price for VCR for year 2000 only:
INSERT INTO item(id, name)
VALUES
(1, 'TV set'),
(2, 'VCR');
INSERT INTO price(id, item_id, year, value)
VALUES
(1, 1, 2000, 290),
(2, 1, 2010, 270),
(3, 2, 2000, 770);
-- no price of VCR for 2010
Now let's make a LEFT OUTER JOIN query, to get prices for all items for year 2010:
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN price p ON i.id = p.item_id
WHERE p.year = 2010 OR p.year IS NULL;
For some reason, this query will return a results only for TV set, which has a price for this year. Record for VCR is absent in results:
id | name | year | value
----+--------+------+-------
1 | TV set | 2010 | 270
(1 row)
After some experimenting, I've found a way to make the query to return results I need (all records for item table, with nulls in the fields of joined table in case there are no mathing records for the year. It was achieved by moving year filtering into a JOIN condition:
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN (
SELECT * FROM price
WHERE year = 2010 -- <= here I filter a year
) p ON i.id = p.item_id;
And now the result is:
id | name | year | value
----+--------+------+-------
1 | TV set | 2010 | 270
2 | VCR | |
(2 rows)
My main question is — why the first query (with year filtering in WHERE) does not work as expected, and turns instead into something like INNER JOIN?
I'm severely blocked by this issue on my current project, so I'll be thankful about tips/hints on the next related questions too:
Are there any other options to achieve the proper results?
... especially — easily translatable to Django's ORM queryset?
Upd: #astentx suggested to move filtering condition directly into JOIN (and it works too):
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN price p
ON
i.id = p.item_id
AND p.year = 2010;
Though, the same as my first solution, I don't see how to express it in terms of Django ORM querysets. Are there any other suggestions?
The first query does not work as expected because expectation is wrong. It does not work as INNER JOIN as well. The query returns a record for VCR only if there is no price for VCR at all.
SELECT
i.*,
y.year,
p.value
FROM item i
CROSS JOIN (SELECT 2010 AS year) y -- here could be a table
LEFT OUTER JOIN price p
ON (p.item_id = i.id
AND p.year = y.year);

Oracle SQL - Sum up and show text if > 0

I'm totally new with Oracle SQL and my question may seem stupid but I have some difficulties to solve my problem.
Current situation:
I have following tables: Supplier, Debtor, Invoice
Every Supplier has various Debtors and every Debtor has various Invoices.
I want to create an evaluation which shows me the following scenario:
A list of all Debtors from a Supplier. I also want to see in that list if the Debtor once haven't payed his invoice. I have this information within the table Invoice as the attribute called "payed" and possible values are 0 (payed) and 1 (not payed). I just want to see the debtor ONCE in the list so if there is only ONE invoice which is 1, it should show me "1" or "Not Payed" in the list. Right now when a debtor has 100 invoices it shows me 100 times the debtor with the info "0" or "1".
Currently:
SELECT company.company_id,
company.companyname_1,
supplier.supplier_id,
supplier.suppliername_1,
debtor.debtor_id_from_supplier,
debtor_ext.debtorname_1,
debtor_ext.street,
debtor_ext.street_number,
debtor_ext.postcode,
debtor_ext.city,
debtor.approved_limit,
debitor.limit_left,
debitor.limit_status,
debitor.limit_type,
debitor.prosecution,
CASE
WHEN invoice.payed = 0 THEN 'Yes'
ELSE 'No'
END as deb_payment
FROM debtor,
debtor_ext,
company,
supplier,
invoice
WHERE ( company.company_id = supplier.company_id ) and
( supplier.supplier_id = debtor.supplier_id ) and
( debtor.debtor_id = debtor_ext.debtor_id ) and
( debtor.supplier_id = invoice.supplier_id ) and
( debtor.debtor_id_from_supplier = invoice.debtor_id_from_supplier )
CODE CORRECTED!
Hope you guys can help me
First of all, In your query, you're doing cross join which is not required and also affects on the performance. I'll advice you to use Inner Join here.
Also,as the schema is not provided in question,I am giving solution for the below schema:
Supplier:
Supplier_Id (PK) | Supplier_Name
Debtor:
Debtor_Id (PK) | Debtor_Name | Supplier_Id (FK)
Invoice:
Invoice_Id (PK) | Supplier_Id (FK) | Debtor_Id (FK) | Amount | Payed
Query:
Select company.company_id,
company.companyname_1,
s.supplier_id,
s.suppliername_1,
d.debtor_id_from_supplier,
d.debtorname_1,
d.street,
d.street_number,
d.postcode,
d.city,
d.approved_limit,
d.limit_left,
d.limit_status,
d.limit_type,
d.prosecution,
tmp.deb_payment
From
(
SELECT d.supplier_id,
d.debtor_id,
(
CASE
WHEN min(i.payed) <> max(i.payed) Then 'Not Payed'
ELSE 'Payed'
END
)as deb_payment
FROM supplier s
inner join debtor d
on s.supplier_id = d.supplier_id
inner join invoice i
on i.supplier_id = d.supplier_id
and i.debtor_id = d.debtor_id
Group by d.supplier_id,d.debtor_id
) tmp
inner join supplier s
on s.supplier_id = tmp.supplier_id
inner join company
on company.company_id = s.company_id
inner join debtor d
on s.supplier_id = d.supplier_id
inner join debtor_ext de
d.debtor_id = de.debtor_id
;
Hope it helps!

Ambiguous column name error : SQL count, join, group by

I have two tables as below, one table has number of available units (stock), I'm trying to return the stock count of each product category, and joining it with secondary table to view the description and price etc.
When I run the below query, I get "Ambiguous column name 'productID'."
What am I doing wrong?
SQL Query:
select productID, count (stock)as available_count
from product_units
join product_type ON product_type.description = product_units.productID
group by productID
This returns an error:
Ambiguous column name 'productID'.
Table product_type
productID description price
101 tent 20.00
102 xltent 50.00
Table product_units
unitID productID stock
1 101 1
2 101 1
3 101 1
4 102 1
Orginal SQL query to get stock count, which works:
select productID, count (stock)as available_count
from product_units
group by productID
I'm using SQL Server 2008 R2 with Coldfusion
I think your error is more likely "Ambiguous column name 'productID'." And, I'm guessing the join should be on that field as well:
select product_units.productID, count (stock)as available_count
from product_units
join product_type ON product_type.productID = product_units.productID
group by product_units.productID
To select all rows from the product_type table use a right outer join:
select product_units.productID, count (stock)as available_count
from product_units
right outer join product_type ON product_type.productID = product_units.productID
group by product_units.productID
To select all information from the product type table, do the aggregation first and then join:
select pt.*, pu.available_count
from (select productId, count(stock) as available_count
from product_units
group by productId
) pu join
product_type pt
on pt.productID = pu.productId;

Query Returning SUM of Quantity or 0 if no Quantity for a Product at a given date

I have the following 4 tables:
----------
tblDates:
DateID
30/04/2012
01/05/2012
02/05/2012
03/05/2012
----------
tblGroups:
GroupID
Home
Table
----------
tblProducts:
ProductID GroupID
Chair Home
Fork Table
Knife Table
Sofa Home
----------
tblInventory:
DateID ProductID Quantity
01/05/2012 Chair 2
01/05/2012 Sofa 1
01/05/2012 Fork 10
01/05/2012 Knife 10
02/05/2012 Sofa 1
02/05/2012 Chair 3
03/05/2012 Sofa 2
03/05/2012 Chair 3
I am trying to write a query that returns all Dates in tblDates, all GroupIDs in tblGroups and the total Quantity of items in each GroupID.
I manage to do this but only get Sum(Quantity) for GroupID and DateID that are not null. I would like to get 0 instead. For exemple for the data above, I would like to get a line "02/05/2012 Table 0" as there is no data for any product in "Table" group on 01/05/12.
The SQL query I have so far is:
SELECT tblDates.DateID,
tblProducts.GroupID,
Sum(tblInventory.Quantity) AS SumOfQuantity
FROM (tblGroup
INNER JOIN tblProducts ON tblGroup.GroupID = tblProducts.GroupID)
INNER JOIN (tblDates INNER JOIN tblInventory ON tblDates.DateID = tblInventory.DateID) ON tblProducts.ProductID = tblInventory.ProductID
GROUP BY tblDates.DateID, tblProducts.GroupID;
I reckon I should basically have the same Query on a table different from tblInventory that would list all Products instead of listed products with 0 instead of no line but I am hesitant to do so as given the number of Dates and Products in my database, the query might be too slow.
There is probably a better way to achieve this.
Thanks
To get all possible Groups and Dates combinations, you'll have to CROSS JOIN those tables. Then you can LEFT JOIN to the other 2 tables:
SELECT
g.GroupID
, d.DateID
, COALESCE(SUM(i.Quantity), 0) AS Quantity
FROM
tblDates AS d
CROSS JOIN
tblGroups AS g
LEFT JOIN
tblProducts AS p
JOIN
tblInventory AS i
ON i.ProductID = p.ProductID
ON p.GroupID = g.GroupID
AND i.DateID = d.DateID
GROUP BY
g.GroupID
, d.DateID
use the isnull() function. change your query to:
SELECT tblDates.DateID, tblProducts.GroupID,
**ISNULL( Sum(tblInventory.Quantity),0)** AS SumOfQuantity
FROM (tblGroup INNER JOIN tblProducts ON tblGroup.GroupID = tblProducts.GroupID)
INNER JOIN (tblDates INNER JOIN tblInventory ON tblDates.DateID = tblInventory.DateID)
ON tblProducts.ProductID = tblInventory.ProductID
GROUP BY tblDates.DateID, tblProducts.GroupID;
You don't say which database you're using but what I'm familiar with is Oracle.
Obviously, if you inner join all tbales, you will not get the rows containing nulls. If you use an outer join to the inventory table you also have a problem because the groupid column is not listed in tblinventory and you can only outer join to one table.
I'd say you have two choices. Use a function or have a duplicate of the groupid column in the inventory table.
So, a nicer but slower solution:
create or replace
function totalquantity( d date, g varchar2 ) return number as
result number;
begin
select nvl(sum(quantity), 0 )
into result
from tblinventory i, tblproducts p
where i.productid = p.productid
and i.dateid = d
and p.groupid = g;
return result;
end;
select dateid, groupid, totalquantity( dateid, groupid )
from tbldates, tblgroups
Or, if you include the groupid column in the inventory table. (You can still quarantee integrity using constraints)
select d.dateid, i.groupid, nvl(i.quantity, 0)
from tbldates d, tblinventory i
where i.dateid(+) = d.dateid
group by d.dateid, g.groupid
Hope this helps,
Gísli