Supply a default value for an incomplete SQL join - sql

Sorry, I don't know if there's a proper name for an 'incomplete join', but consider this sort of query, designed to return details of every sale:
SELECT a.id, a.productId, b.productDescription FROM Sales a, AdditionalProductData b WHERE a.productId = b.productId;
AdditionalProductData is not guaranteed to have a row for every productId, but I want to return a result for every row in Sales. How can I modify my query to return either null or some default value, e.g. "unknown", in such cases? I want to ensure sales of unregistered products are not omitted.
(It is a slightly contrived example, and it indicates a problem in the DB, but that is outside the scope of the question.)

Use an OUTER JOIN:
SELECT a.productId, b.productName
FROM Products b LEFT JOIN
Sales a
ON a.productId = b.id;
You wrote: "but I want to return a result for every row in Sales"
In that case, swap the tables so that Sales is on the left side of the join:
SELECT a.productId, COALESCE(b.productName, 'unknown') AS productName
FROM Sales a LEFT JOIN
Products b
ON a.productId = b.id;
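To see the effect of the swapped LEFT JOIN with COALESCE, here is a minimal sketch that runs the query against an in-memory SQLite database from Python. The sample rows are made up for illustration, and the table and column names follow this answer (Products, productName) rather than the question's AdditionalProductData.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (id INTEGER PRIMARY KEY, productName TEXT);
    CREATE TABLE Sales (id INTEGER PRIMARY KEY, productId INTEGER);
    INSERT INTO Products VALUES (1, 'Widget');
    -- Sale 11 refers to product 2, which has no row in Products.
    INSERT INTO Sales VALUES (10, 1), (11, 2);
""")

# Sales is the left table, so every sale is kept; COALESCE supplies
# the default when the join finds no matching product.
rows = conn.execute("""
    SELECT a.id, a.productId, COALESCE(b.productName, 'unknown') AS productName
    FROM Sales a
    LEFT JOIN Products b ON a.productId = b.id
    ORDER BY a.id
""").fetchall()

print(rows)  # [(10, 1, 'Widget'), (11, 2, 'unknown')]
```

Dropping the COALESCE would return NULL (None in Python) for the unmatched sale instead of 'unknown'.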

Never use commas in the FROM clause. Always use proper, explicit, standard, readable JOIN syntax.
You want a left join:
SELECT p.id, p.productName
FROM Products p LEFT JOIN
Sales s
ON s.productId = p.id;
I am guessing you really want at least one row per product. It doesn't make sense that you would have sales for non-existent products -- at least in most databases.
The above doesn't make sense -- only selecting from one table. You probably want something like this:
SELECT p.id, p.productName, SUM(s.amount)
FROM Products p LEFT JOIN
Sales s
ON s.productId = p.id
GROUP BY p.id, p.productName;
EDIT:
If you really do want one row per sale, then you still want a LEFT JOIN, just in the other order:
SELECT s.*, p.productName
FROM Sales s LEFT JOIN
Products p
ON s.productId = p.id;

Related

Modify SQL query to include cases without a value

I have this database assignment where I need to write a query "to display the id and name of each category, with the number of products that belong to the category". I was able to solve it and used this query.
SELECT Category.Id, Category.Name, COUNT(Category.Name)
FROM Category, Product
WHERE (CategoryId = Category.Id)
GROUP BY Category.Id;
But I want to modify it to make all categories appear, even those with no products. Stuck on this part. Any help is appreciated.
You can left join:
select c.id, c.name, count(p.categoryid) cnt_products
from category c
left join product p on p.categoryid = c.id
group by c.id, c.name;
A correlated subquery is also a fine solution, which avoids outer aggregation:
select c.*,
(select count(*) from product p where p.categoryid = c.id) cnt_products
from category c
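As a rough check of both approaches, the sketch below runs them in an in-memory SQLite database from Python. The sample data is invented. It also shows why the join version counts p.categoryid rather than using COUNT(*): an empty category still produces one row from the LEFT JOIN, so COUNT(*) would report 1 for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, categoryid INTEGER);
    INSERT INTO category VALUES (1, 'Books'), (2, 'Toys');
    -- 'Toys' deliberately has no products.
    INSERT INTO product VALUES (100, 1), (101, 1);
""")

# COUNT(column) skips NULLs, COUNT(*) counts the NULL-extended row too.
join_rows = conn.execute("""
    SELECT c.id, c.name, COUNT(p.categoryid) AS cnt_col, COUNT(*) AS cnt_star
    FROM category c
    LEFT JOIN product p ON p.categoryid = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# The correlated subquery needs no GROUP BY on the outer query.
subquery_rows = conn.execute("""
    SELECT c.id, c.name,
           (SELECT COUNT(*) FROM product p WHERE p.categoryid = c.id) AS cnt_products
    FROM category c
    ORDER BY c.id
""").fetchall()

print(join_rows)      # [(1, 'Books', 2, 2), (2, 'Toys', 0, 1)]
print(subquery_rows)  # [(1, 'Books', 2), (2, 'Toys', 0)]
```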

using subquery in order to join columns from two tables

I started learning SQL and there is something I don't understand.
I want to take the columns product_id and product_name from Production.products
and combine them with the quantity column from the Production.stocks table,
but instead of using a join I want to use a subquery.
This is the code I wrote so far,
and I don't understand why it isn't working :(
SELECT P.product_id, P.product_name,(
SELECT S.quantity
FROM Production.stocks AS S
WHERE S.product_id = P.product_id)
FROM Production.products as P;
First, let's be clear that a subquery is not the recommended approach here. Use it only for learning purposes; if you care about performance or code clarity, go with a simple join.
When you put a subquery in the SELECT clause by enclosing it in parentheses, you are forcing the result to be a single value. If the subquery returns more than one row, you get the error you are seeing.
Usually, subqueries are used in the FROM clause, where they should be given a name and then represent a table. Like this:
SELECT P.product_id, P.product_name, S.quantity
FROM Production.products as P
inner join
(
SELECT product_id, quantity
FROM Production.stocks
) as S on S.product_id = P.product_id
The simplicity of the subquery shows how little it adds here.
You can use the SUM function to prevent the error and get the total quantity.
SELECT P.product_id, P.product_name,(
SELECT SUM(S.quantity)
FROM Production.stocks AS S
WHERE S.product_id = P.product_id)
FROM Production.products as P;
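A small sketch of the SUM version, run from Python against in-memory SQLite. The sample rows are invented, and the Production. schema prefix is dropped because SQLite has no such schema. Because the aggregate always returns exactly one row, the subquery is a valid single value even when a product is stocked in several stores, and it yields NULL when there is no stock row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE stocks (store_id INTEGER, product_id INTEGER, quantity INTEGER);
    INSERT INTO products VALUES (1, 'Bike'), (2, 'Helmet');
    -- 'Bike' is stocked in two stores, 'Helmet' in none.
    INSERT INTO stocks VALUES (1, 1, 5), (2, 1, 7);
""")

# Number of stock rows per product: more than one row is what breaks
# the original, un-aggregated scalar subquery on SQL Server.
stock_rows = conn.execute("""
    SELECT product_id, COUNT(*) FROM stocks GROUP BY product_id
""").fetchall()

rows = conn.execute("""
    SELECT P.product_id, P.product_name,
           (SELECT SUM(S.quantity)
            FROM stocks AS S
            WHERE S.product_id = P.product_id) AS quantity
    FROM products AS P
    ORDER BY P.product_id
""").fetchall()

print(stock_rows)  # [(1, 2)]
print(rows)        # [(1, 'Bike', 12), (2, 'Helmet', None)]
```

Wrapping the subquery in COALESCE(..., 0) would turn the None into 0 if that is preferred.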
I don't see any need for a subquery.
If the products are unique entities, then joining onto the stocks table and summing the quantity should be better in terms of query performance:
SELECT
Production.Products.Product_id,
Production.Products.product_name,
SUM(Production.Stocks.quantity) AS Quantity
FROM
Production.Products
LEFT JOIN
Production.Stocks
ON
Production.Stocks.product_id = Production.Products.product_id
GROUP BY
Production.Products.product_id,
Production.Products.product_name
If you need it to report stock quantities by store, then you need to add an additional join onto stores and add the store to the SELECT and GROUP BY clauses, like so:
SELECT
Production.Products.Product_id,
Production.Products.product_name,
Sales.Stores.store_name,
SUM(Production.Stocks.quantity) AS Quantity
FROM
Production.Products
LEFT JOIN
Production.Stocks
ON
Production.Stocks.product_id = Production.Products.product_id
LEFT JOIN
Sales.Stores
ON
Production.Stocks.store_id = Sales.Stores.store_id
GROUP BY
Production.Products.product_id,
Production.Products.product_name,
Sales.Stores.store_name
Hope that helps
Did you mean to select all rows from the products table and show the quantity column for each product?
SELECT P.product_id, P.product_name, isnull(s.quantity, 0) as Quantity
FROM Production.products as P
left join Production.stocks AS S
on p.product_id = S.product_id
If you have a one-to-many relation between Products and Stocks, you should use a subquery like this:
SELECT P.product_id, P.product_name, isnull(s.quantity, 0) as Quantity
FROM Production.products as P
left join (
select product_id, sum(quantity) as Quantity
from Production.stocks
group by product_id)
as S on p.product_id = S.product_id
This produces the aggregated sum of the quantity field for each product.
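The difference between the two queries can be sketched with in-memory SQLite from Python. The data is invented, the Production. prefix is omitted, and COALESCE stands in for SQL Server's ISNULL, which SQLite does not provide. With a one-to-many relation the plain join repeats the product once per stock row, while the pre-aggregated derived table keeps one row per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE stocks (store_id INTEGER, product_id INTEGER, quantity INTEGER);
    INSERT INTO products VALUES (1, 'Bike'), (2, 'Helmet');
    INSERT INTO stocks VALUES (1, 1, 5), (2, 1, 7);
""")

# Plain LEFT JOIN: one output row per matching stock row.
direct = conn.execute("""
    SELECT P.product_id, COALESCE(S.quantity, 0) AS quantity
    FROM products AS P
    LEFT JOIN stocks AS S ON P.product_id = S.product_id
    ORDER BY P.product_id, S.store_id
""").fetchall()

# LEFT JOIN onto a derived table that is already grouped by product.
aggregated = conn.execute("""
    SELECT P.product_id, COALESCE(S.quantity, 0) AS quantity
    FROM products AS P
    LEFT JOIN (
        SELECT product_id, SUM(quantity) AS quantity
        FROM stocks
        GROUP BY product_id
    ) AS S ON P.product_id = S.product_id
    ORDER BY P.product_id
""").fetchall()

print(direct)      # [(1, 5), (1, 7), (2, 0)]
print(aggregated)  # [(1, 12), (2, 0)]
```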

Self join and inner join to remove duplicates

I am stuck on this and I am relatively new to SQL.
Here is the question we were given:
List the productname and vendorid for all products that we have
purchased from more than one vendor (Hint: you’ll need a Self-Join and
an additional INNER JOIN to solve, don't forget to remove any
duplicates!!)
Here is a screenshot of tables we are working with:
Here is what I have.....I know it is wrong. It works to a degree, just not exactly how the prof wants it.
SELECT DISTINCT productname, product_vendors.vendorid
FROM products INNER JOIN Product_Vendors
ON products.PRODUCTNUMBER = PRODUCT_VENDORS.PRODUCTNUMBER
INNER JOIN vendors ON Product_Vendors.VENDORID = vendors.VENDORID
ORDER BY products.PRODUCTNAME;
Expected output provided by the prof:
I agree with @jarlh that additional information would be helpful, i.e. are there triplicates in the data or just duplicates, etc.
That said, this should get you started:
SELECT
c.productname AS 'Product'
,a.vendorid AS 'Vendor1'
,b.vendorid AS 'Vendor2'
FROM
product_vendors AS a
JOIN
product_vendors AS b
ON
a.productnumber = b.productnumber
AND a.vendorid <> b.vendorid
JOIN
dbo.products AS c
ON
a.productnumber = c.productnumber
This will limit the population of 'Product Vendors' just to products with unmatching vendors.
From there you are joining to products to pull back product name.
Also- work on coding format, clean code makes the dream work :)
The solution to this problem is usually to count vendors per product with COUNT OVER and only stick with products with more than one. Simply:
select productname, vendorid
from
(
select
p.productname,
pv.vendorid,
count(*) over (partition by p.productnumber) as cnt
from products p
join product_vendors pv using (productnumber)
) counted
where cnt > 1;
If this must be done without window functions, then one option is to aggregate product_vendors and use this result:
select p.productname, pv.vendorid
from
(
select productnumber
from product_vendors
group by productnumber
having count(*) > 1
) px
join products p using (productnumber)
join product_vendors pv using (productnumber);
Or check whether another vendor exists for the product:
select
p.productname,
pv.vendorid
from products p
join product_vendors pv on pv.productnumber = p.productnumber
where exists
(
select *
from product_vendors other
where other.productnumber = pv.productnumber
and other.vendorid <> pv.vendorid
);
In none of these approaches do I see the need to eliminate duplicates, as there should be one row per product in products and one row per product and vendor in product_vendors. So I guess what your prof was thinking of is:
select distinct
p.productname,
pv.vendorid
from products p
join product_vendors pv on pv.productnumber = p.productnumber
join product_vendors other on other.productnumber = pv.productnumber
and other.vendorid <> pv.vendorid
This, however, is an approach I don't recommend. You'd combine all vendors for a product (e.g. with 10 vendors for one product the join already produces 10 × 9 = 90 rows for that product alone). So you'd create a large intermediate result only to dismiss most of it with DISTINCT later. Don't do that. Remember: SELECT DISTINCT is often an indicator of a poorly written query (i.e. unnecessary joins leading to too many combinations you are not actually interested in).
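To put a number on the intermediate result, here is a sketch using in-memory SQLite from Python with invented data: one product with 10 vendors and one with a single vendor. The self-join pairs every vendor with each of the other 9, so it emits 10 × 9 = 90 rows before DISTINCT reduces them to 10, whereas the EXISTS form returns the same 10 rows directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productnumber INTEGER PRIMARY KEY, productname TEXT);
    CREATE TABLE product_vendors (productnumber INTEGER, vendorid INTEGER);
    INSERT INTO products VALUES (1, 'Helmet'), (2, 'Gloves');
""")
# Product 1 has 10 vendors, product 2 has only one.
conn.executemany("INSERT INTO product_vendors VALUES (1, ?)",
                 [(v,) for v in range(1, 11)])
conn.execute("INSERT INTO product_vendors VALUES (2, 1)")

self_join = """
    FROM products p
    JOIN product_vendors pv ON pv.productnumber = p.productnumber
    JOIN product_vendors other ON other.productnumber = pv.productnumber
                              AND other.vendorid <> pv.vendorid
"""
# Rows the self-join produces before any de-duplication.
intermediate = conn.execute("SELECT COUNT(*) " + self_join).fetchone()[0]

distinct_rows = conn.execute(
    "SELECT DISTINCT p.productname, pv.vendorid " + self_join +
    " ORDER BY pv.vendorid"
).fetchall()

exists_rows = conn.execute("""
    SELECT p.productname, pv.vendorid
    FROM products p
    JOIN product_vendors pv ON pv.productnumber = p.productnumber
    WHERE EXISTS (
        SELECT 1 FROM product_vendors other
        WHERE other.productnumber = pv.productnumber
          AND other.vendorid <> pv.vendorid
    )
    ORDER BY pv.vendorid
""").fetchall()

print(intermediate)                  # 90
print(len(distinct_rows))            # 10
print(distinct_rows == exists_rows)  # True
```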
SELECT DISTINCT p.name AS product, v.id
FROM products p
INNER JOIN product_vendors pv ON p.id = pv.productid
INNER JOIN product_vendors pv2 ON pv.productid = pv2.productid AND pv.vendorid != pv2.vendorid
INNER JOIN vendors v ON v.id = pv.vendorid
ORDER BY p.name

Using sum with a nested select

I'm using SQL Server. This statement lists my products per menu:
SELECT menuname, productname
FROM [web].[dbo].[tblMenus]
FULL OUTER JOIN [web].[dbo].[tblProductsRelMenus]
ON [tblMenus].Id = [tblProductsRelMenus].MenuId
FULL OUTER JOIN [web].[dbo].[tblProducts]
ON [tblProductsRelMenus].ProductId = [tblProducts].ProductId
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].Id = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
Some products don't have menus and vice versa. I use the following to establish what has been sold of each product.
SELECT [tblProducts].ProductName, SUM([tblOrderDetails].Ammount) as amount
FROM [web].[dbo].[tblProducts]
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].ProductId = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
What I want to do is complement the top table with an amount column. That is, I want a table with the same number of rows as in the first table above but with an amount value if it exists, otherwise null.
I can't figure out how to do this. Any suggestions?
If I am not missing anything, the second query could be simplified, then incorporated into the first query like this:
SELECT
m.menuname,
p.productname,
t.amount
FROM [web].[dbo].[tblMenus] m
FULL JOIN [web].[dbo].[tblProductsRelMenus] pm ON m.Id = pm.MenuId
FULL JOIN [web].[dbo].[tblProducts] p ON pm.ProductId = p.ProductId
LEFT JOIN (
SELECT ProductId, SUM(Amount) as amount
FROM [web].[dbo].[tblOrderDetails]
GROUP BY ProductId
) t ON p.ProductId = t.ProductId
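The idea of aggregating the order details first and joining the result afterwards can be sketched with in-memory SQLite from Python. Assumptions: the sample rows are invented, the [web].[dbo]. prefix is dropped, the Amount column name follows this answer's query, and LEFT JOINs replace the FULL JOINs to keep the sketch portable across SQLite versions. A product that sits on two menus gets its total on both rows, and a product without orders gets NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMenus (Id INTEGER PRIMARY KEY, MenuName TEXT);
    CREATE TABLE tblProducts (ProductId INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE tblProductsRelMenus (MenuId INTEGER, ProductId INTEGER);
    CREATE TABLE tblOrderDetails (ProductId INTEGER, Amount INTEGER);
    INSERT INTO tblMenus VALUES (1, 'Lunch'), (2, 'Dinner');
    INSERT INTO tblProducts VALUES (1, 'Soup'), (2, 'Salad');
    -- Soup is on both menus, Salad only on Lunch.
    INSERT INTO tblProductsRelMenus VALUES (1, 1), (2, 1), (1, 2);
    -- Only Soup has been ordered (3 + 4).
    INSERT INTO tblOrderDetails VALUES (1, 3), (1, 4);
""")

rows = conn.execute("""
    SELECT m.MenuName, p.ProductName, t.amount
    FROM tblMenus m
    LEFT JOIN tblProductsRelMenus pm ON m.Id = pm.MenuId
    LEFT JOIN tblProducts p ON pm.ProductId = p.ProductId
    LEFT JOIN (
        -- One row per product, so the join cannot multiply menu rows.
        SELECT ProductId, SUM(Amount) AS amount
        FROM tblOrderDetails
        GROUP BY ProductId
    ) t ON p.ProductId = t.ProductId
    ORDER BY m.Id, p.ProductId
""").fetchall()

print(rows)  # [('Lunch', 'Soup', 7), ('Lunch', 'Salad', None), ('Dinner', 'Soup', 7)]
```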

Left Join returns more records

I have two tables with related data.
One table holds products and the other holds prices. In the price table, one product may appear several times.
How can I return the products without duplicate rows?
My Query is
select p.Product, sum(p.Qty), max(pr.netprice)
from Products p
left outer join Price pr
on p.Product=pr.Product
where p.brand=''
group by p.Product,pr.Product
but it returns more rows than expected because the right-hand table has multiple records per product.
Please help.
I don't think DISTINCT is the way to go; GROUP BY should give you what you want if done correctly. Also, I don't think you need to group on values from both tables. Be clear about what result you actually want, and give us example data so the question is easier to answer. Try this:
select p.Product, sum(p.Qty), max(pr.netprice)
from Products p
left outer join Price pr on p.Product = pr.Product
where p.brand = ''
group by p.Product -- group only on the product column
Use the DISTINCT keyword. That will remove duplicates. However, if there are different prices for a given product and you remove the MAX(), there will be one record per unique price per product.
select DISTINCT p.Product, sum(p.Qty),max(pr.netprice)
from Products p
left outer join Price pr on p.Product=pr.Product
where p.brand='' group by p.Product,pr.Product
Try putting distinct in select. I am not sure it will work, but try it.
If you want the sum, then use Tomas's answer above. This gives you a unique product list with the total quantity and maximum price for each product:
select p.Product, sum(p.Qty), max(pr.netprice)
from Products p
left outer join Price pr on p.Product = pr.Product
where p.brand = ''
group by p.Product
What about changing it this way:
SELECT p.Product, sum(p.Qty),
(SELECT max(pr.netprice)
FROM Price pr
WHERE p.Product=pr.Product
)
FROM Products p
WHERE p.brand=''
GROUP BY p.Product
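One practical difference between the join form and the correlated-subquery form is worth checking on sample data. The sketch below uses in-memory SQLite from Python with invented rows, and groups both queries by p.Product. When a product has several price rows, the join repeats the product row once per price, so SUM(p.Qty) is inflated; the subquery form never multiplies the Products rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (Product TEXT, Qty INTEGER, brand TEXT);
    CREATE TABLE Price (Product TEXT, netprice REAL);
    INSERT INTO Products VALUES ('A', 10, ''), ('B', 4, '');
    -- Product A has three price rows, product B has none.
    INSERT INTO Price VALUES ('A', 1.5), ('A', 2.0), ('A', 2.5);
""")

# Join first, then aggregate: A's single Qty of 10 is summed three times.
joined = conn.execute("""
    SELECT p.Product, SUM(p.Qty), MAX(pr.netprice)
    FROM Products p
    LEFT OUTER JOIN Price pr ON p.Product = pr.Product
    WHERE p.brand = ''
    GROUP BY p.Product
    ORDER BY p.Product
""").fetchall()

# Correlated subquery: Products rows are never duplicated.
subquery = conn.execute("""
    SELECT p.Product, SUM(p.Qty),
           (SELECT MAX(pr.netprice) FROM Price pr WHERE pr.Product = p.Product)
    FROM Products p
    WHERE p.brand = ''
    GROUP BY p.Product
    ORDER BY p.Product
""").fetchall()

print(joined)    # [('A', 30, 2.5), ('B', 4, None)]
print(subquery)  # [('A', 10, 2.5), ('B', 4, None)]
```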
Try this
SELECT p.Product, p.Qty, MAX(pr.netprice)
FROM Products p
LEFT OUTER JOIN Price pr ON p.Product=pr.Product
WHERE p.brand=''
GROUP BY p.Product, p.Qty