Row is listed twice in simple query - sql

One row is repeated twice and I can't figure out why. I tried using GROUP BY but couldn't get that to work either. I'm using a left outer join to list suppliers who have a discontinued product in the Northwind database.
Select *
From Suppliers s
Left Outer Join products p
On s.SupplierID = p.SupplierID
Where p.Discontinued = 1

You have two discontinued products for a supplier, and a row is created for each row in products that matches the join condition and the Discontinued = 1 predicate. You want something like this:
SELECT * FROM Suppliers s
WHERE EXISTS (SELECT 1
FROM Products p
WHERE p.SupplierID = s.SupplierID
AND p.Discontinued = 1)

You have two rows in one of those tables. You can determine which by querying both tables by themselves on that supplierId.
Now, to your query, by putting p.discontinued in the where, that join effectively becomes an inner join, so you should either flip it to an inner join or move that condition to the join.
To get suppliers with discontinued products, you can do this:
Select * from Suppliers where supplierId in (
select supplierId from products
where discontinued =1)
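The point about the WHERE clause turning the left join into an inner join can be checked on a tiny data set. This is only a sketch using Python's built-in sqlite3 module; the three suppliers and three products are made up for illustration and are not the real Northwind rows.

```python
import sqlite3

# Hypothetical miniature of the Northwind tables; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER,
                            Discontinued INTEGER);
    INSERT INTO Suppliers VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Supplier 1: two discontinued products; supplier 2: one active product;
    -- supplier 3: no products at all.
    INSERT INTO Products VALUES (10, 1, 1), (11, 1, 1), (12, 2, 0);
""")

# Filter in WHERE: rows whose product columns are NULL fail the predicate,
# so the LEFT JOIN returns the same rows an INNER JOIN would.
where_rows = conn.execute("""
    SELECT s.SupplierID, p.ProductID
    FROM Suppliers s
    LEFT OUTER JOIN Products p ON s.SupplierID = p.SupplierID
    WHERE p.Discontinued = 1
    ORDER BY s.SupplierID, p.ProductID
""").fetchall()

# Filter in ON: every supplier is kept; suppliers without a match get NULL.
on_rows = conn.execute("""
    SELECT s.SupplierID, p.ProductID
    FROM Suppliers s
    LEFT OUTER JOIN Products p
        ON s.SupplierID = p.SupplierID AND p.Discontinued = 1
    ORDER BY s.SupplierID, p.ProductID
""").fetchall()

# IN subquery: one row per supplier, however many products match.
in_rows = conn.execute("""
    SELECT SupplierID FROM Suppliers
    WHERE SupplierID IN (SELECT SupplierID FROM Products WHERE Discontinued = 1)
""").fetchall()

print(where_rows)  # supplier 1 appears twice, suppliers 2 and 3 are gone
print(on_rows)     # all suppliers appear, supplier 1 still twice
print(in_rows)     # supplier 1 appears once
```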

There is clearly a supplier that has multiple discontinued products.
If you want suppliers with at least one discontinued product, then use exists:
select s.*
from suppliers s
where exists (select 1
from products p
where p.supplierid = s.supplierid and
p.Discontinued = 1
);
If you want the list of suppliers with the number of discontinued products, use join:
select s.*, p.num_discontinued
from suppliers s join
(select p.supplierid, count(*) as num_discontinued
from products p
where p.Discontinued = 1
group by p.supplierid
) p
on p.SupplierID = s.SupplierID ;
If you want the list of products that are discontinued with their suppliers, then use your query but change the left join to an inner join. An outer join is not necessary.
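To make the difference between the three options concrete, here is a small sketch run through Python's sqlite3 module. The table contents are invented (one supplier with two discontinued products, one with none), and the derived-table alias `d` is my own choice.

```python
import sqlite3

# Invented sample data: supplier 1 has two discontinued products.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER,
                            Discontinued INTEGER);
    INSERT INTO Suppliers VALUES (1, 'A'), (2, 'B');
    INSERT INTO Products VALUES (10, 1, 1), (11, 1, 1), (12, 2, 0);
""")

# Option 1: EXISTS gives one row per qualifying supplier.
exists_rows = conn.execute("""
    SELECT s.SupplierID
    FROM Suppliers s
    WHERE EXISTS (SELECT 1 FROM Products p
                  WHERE p.SupplierID = s.SupplierID AND p.Discontinued = 1)
""").fetchall()

# Option 2: join to a pre-aggregated subquery to get the count per supplier.
count_rows = conn.execute("""
    SELECT s.SupplierID, d.num_discontinued
    FROM Suppliers s
    JOIN (SELECT p.SupplierID, COUNT(*) AS num_discontinued
          FROM Products p
          WHERE p.Discontinued = 1
          GROUP BY p.SupplierID) d
      ON d.SupplierID = s.SupplierID
""").fetchall()

# Option 3: plain inner join gives one row per discontinued product.
product_rows = conn.execute("""
    SELECT s.SupplierID, p.ProductID
    FROM Suppliers s
    JOIN Products p ON p.SupplierID = s.SupplierID
    WHERE p.Discontinued = 1
    ORDER BY p.ProductID
""").fetchall()

print(exists_rows, count_rows, product_rows)
```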

Self join and inner join to remove duplicates

I am stuck on this and I am relatively new to SQL.
Here is the question we were given:
List the productname and vendorid for all products that we have
purchased from more than one vendor (Hint: you’ll need a Self-Join and
an additional INNER JOIN to solve, don't forget to remove any
duplicates!!)
Here is a screenshot of tables we are working with:
Here is what I have.....I know it is wrong. It works to a degree, just not exactly how the prof wants it.
SELECT DISTINCT productname, product_vendors.vendorid
FROM products INNER JOIN Product_Vendors
ON products.PRODUCTNUMBER = PRODUCT_VENDORS.PRODUCTNUMBER
INNER JOIN vendors ON Product_Vendors.VENDORID = vendors.VENDORID
ORDER BY products.PRODUCTNAME;
Expected output provided by the prof:
I agree with @jarlh that additional information would be helpful, i.e. whether there are triplicates in the data or just duplicates, etc.
That said, this should get you started:
SELECT
c.productname AS 'Product'
,a.vendorid AS 'Vendor1'
,b.vendorid AS 'Vendor2'
FROM
product_vendors AS a
JOIN
product_vendors AS b
ON
a.productnumber = b.productnumber
AND a.vendorid <> b.vendorid
JOIN
dbo.products AS c
ON
a.productnumber = c.productnumber
This limits product_vendors to products that have at least two different vendors.
From there you join to products to pull back the product name.
Also- work on coding format, clean code makes the dream work :)
The solution to this problem is usually to count vendors per product with COUNT OVER and only stick with products with more than one. Simply:
select productname, vendorid
from
(
select
p.productname,
pv.vendorid,
count(*) over (partition by productnumber) as cnt
from products p
join product_vendors pv using (productnumber)
) counted
where cnt > 1;
If this shall be done without window functions, then one option is to aggregate product_vendors and use this result:
select p.productname, pv.vendorid
from
(
select productnumber
from product_vendors
group by productnumber
having count(*) > 1
) px
join products p using (productnumber)
join product_vendors pv using (productnumber);
or check whether another vendor exists for the product:
select
p.productname,
pv.vendorid
from products p
join product_vendors pv on pv.productnumber = p.productnumber
where exists
(
select *
from product_vendors other
where other.productnumber = pv.productnumber
and other.vendorid <> pv.vendorid
);
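As a quick sanity check, the aggregate and EXISTS approaches can be run side by side on made-up data. This sketch uses Python's sqlite3 module; the products, vendor ids, and the use of explicit ON conditions instead of USING are my own assumptions, not part of the assignment.

```python
import sqlite3

# Invented data: Helmet has 2 vendors, Gloves has 1, Pump has 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productnumber INTEGER PRIMARY KEY, productname TEXT);
    CREATE TABLE product_vendors (productnumber INTEGER, vendorid INTEGER,
                                  PRIMARY KEY (productnumber, vendorid));
    INSERT INTO products VALUES (1, 'Helmet'), (2, 'Gloves'), (3, 'Pump');
    INSERT INTO product_vendors VALUES (1, 1), (1, 2), (2, 1),
                                       (3, 2), (3, 3), (3, 4);
""")

# Aggregate first: keep only product numbers with more than one vendor row.
having_rows = conn.execute("""
    SELECT p.productname, pv.vendorid
    FROM (SELECT productnumber
          FROM product_vendors
          GROUP BY productnumber
          HAVING COUNT(*) > 1) px
    JOIN products p ON p.productnumber = px.productnumber
    JOIN product_vendors pv ON pv.productnumber = px.productnumber
    ORDER BY p.productname, pv.vendorid
""").fetchall()

# EXISTS: keep a row only if the same product has some other vendor.
exists_rows = conn.execute("""
    SELECT p.productname, pv.vendorid
    FROM products p
    JOIN product_vendors pv ON pv.productnumber = p.productnumber
    WHERE EXISTS (SELECT 1
                  FROM product_vendors other
                  WHERE other.productnumber = pv.productnumber
                    AND other.vendorid <> pv.vendorid)
    ORDER BY p.productname, pv.vendorid
""").fetchall()

print(having_rows)
print(exists_rows)
```

Both queries return the same rows here, and neither needs DISTINCT.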
In none of these approaches do I see the need to eliminate duplicates, as there should be one row per product in products and one row per product and vendor in product_vendors. So I guess what your prof was thinking of is:
select distinct
p.productname,
pv.vendorid
from products p
join product_vendors pv on pv.productnumber = p.productnumber
join product_vendors other on other.productnumber = pv.productnumber
and other.vendorid <> pv.vendorid
This, however, is an approach I don't recommend. You'd combine all vendors for a product (e.g. with 10 vendors for one product, each vendor is matched with the nine others, which already gives 90 rows for that product alone). So you'd create a large intermediate result only to dismiss most of it with DISTINCT later. Don't do that. Remember: SELECT DISTINCT is often an indicator of a poorly written query (i.e. unnecessary joins leading to too many combinations you are not actually interested in).
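The size of that intermediate result is easy to measure. The sketch below (Python's sqlite3, with one invented product and ten invented vendor ids) counts the rows the self-join produces before and after DISTINCT. Because the `<>` condition matches ordered pairs, ten vendors give 10 × 9 = 90 joined rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productnumber INTEGER PRIMARY KEY, productname TEXT);
    CREATE TABLE product_vendors (productnumber INTEGER, vendorid INTEGER);
    INSERT INTO products VALUES (1, 'Helmet');
""")
# One hypothetical product supplied by vendors 1..10.
conn.executemany(
    "INSERT INTO product_vendors VALUES (1, ?)",
    [(vendor_id,) for vendor_id in range(1, 11)],
)

self_join = """
    FROM products p
    JOIN product_vendors pv ON pv.productnumber = p.productnumber
    JOIN product_vendors other ON other.productnumber = pv.productnumber
                              AND other.vendorid <> pv.vendorid
"""

# Rows produced by the self-join before any de-duplication.
joined_count = conn.execute("SELECT COUNT(*) " + self_join).fetchone()[0]

# Rows left once DISTINCT has thrown the repeats away.
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT p.productname, pv.vendorid "
    + self_join + ")"
).fetchone()[0]

print(joined_count, distinct_count)
```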
SELECT DISTINCT p.name AS product, v.id
FROM products p
INNER JOIN product_vendors pv ON p.id = pv.productid
INNER JOIN product_vendors pv2 ON pv.productid = pv2.productid AND pv.vendorid != pv2.vendorid
INNER JOIN vendors v ON v.id = pv.vendorid
ORDER BY p.name

Cartesian products and selects in the from clause

I need to use a select in the from clause but I keep getting a Cartesian product.
select
customer.customer_name
,orders.order_date
,order_line.num_ordered
,order_line.quoted_price
,part.descript
,amt_billed
from (select order_line.num_ordered*part.price as amt_billed
from order_line
join part
on order_line.part_num = part.part_num
) billed
,customer
join orders
on customer.customer_num = orders.customer_num
join order_line
on orders.order_num = order_line.order_num
join part
on order_line.part_num = part.part_num;
Don't bother looking at the rest too hard. I already know that if I remove both the subselect in the from clause and amt_billed in the select clause I don't get the Cartesian product. What am I doing wrong that's causing the Cartesian product?
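The reason for the Cartesian product is that the sub-select is never joined to orders or part. It is only listed before customer with a comma, which is a cross join, so every row of billed is paired with every row of the joined result.
First of all, you don't need that sub-select: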
SELECT customer.customer_name,
orders.order_date,
order_line.num_ordered,
order_line.quoted_price,
part.descript,
order_line.num_ordered * part.price AS amt_billed
FROM customer
JOIN orders
ON customer.customer_num = orders.customer_num
JOIN order_line
ON orders.order_num = order_line.order_num
JOIN part
ON order_line.part_num = part.part_num;
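The row multiplication can be reproduced with a handful of rows. This is a sketch in Python's sqlite3 with invented data (one customer, one order, two order lines); the column lists are trimmed to what the joins need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer   (customer_num INTEGER, customer_name TEXT);
    CREATE TABLE orders     (order_num INTEGER, customer_num INTEGER);
    CREATE TABLE part       (part_num INTEGER, price INTEGER);
    CREATE TABLE order_line (order_num INTEGER, part_num INTEGER,
                             num_ordered INTEGER);
    INSERT INTO customer   VALUES (1, 'Acme');
    INSERT INTO orders     VALUES (500, 1);
    INSERT INTO part       VALUES (100, 5), (200, 4);
    INSERT INTO order_line VALUES (500, 100, 2), (500, 200, 3);
""")

# Original shape: the derived table is comma-joined with no condition, so
# each of its 2 rows is paired with each of the 2 properly joined rows.
cross_count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT order_line.num_ordered * part.price AS amt_billed
          FROM order_line
          JOIN part ON order_line.part_num = part.part_num) billed,
         customer
    JOIN orders     ON customer.customer_num = orders.customer_num
    JOIN order_line ON orders.order_num = order_line.order_num
    JOIN part       ON order_line.part_num = part.part_num
""").fetchone()[0]

# Fixed shape: compute amt_billed inline, one row per order line.
fixed_rows = conn.execute("""
    SELECT order_line.part_num,
           order_line.num_ordered * part.price AS amt_billed
    FROM customer
    JOIN orders     ON customer.customer_num = orders.customer_num
    JOIN order_line ON orders.order_num = order_line.order_num
    JOIN part       ON order_line.part_num = part.part_num
    ORDER BY order_line.part_num
""").fetchall()

print(cross_count)  # 2 x 2 rows
print(fixed_rows)
```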

Complex SQL query to get outputs from multiple tables

I am using information from multiple tables to try and get the output to be, Supplier ID, Supplier Name, Percent of orders with tracking number.
I need information from many tables, as follows:
Suppliers, this contains supplier ID and supplier name
SupplierSubscriptions, the supplier must have subscriptionid = 91, this contains supplier id.
SalesOrders, this contains each order for every supplier, so it contains supplier id.
Shipments, this contains the salesorder id, with each shipment associated to that order.
Packages, this contains the shipmentid and if the shipment is on the packages table, it contains a tracking number.
What I have so far is:
SELECT DISTINCT so.supplierID, count(*) AS NumberOfOrders FROM SalesOrders so
INNER JOIN suppliers s ON s.SupplierID = so.SupplierID
INNER JOIN SupplierSubscriptions ss ON s.SupplierID = ss.SupplierID
INNER JOIN shipments ship ON ship.SalesOrderID = so.SalesOrderID
INNER JOIN Packages p ON p.ShipmentID = ship.ShipmentID
WHERE ss.SubscriptionID = 91 GROUP BY so.SupplierID
this, obviously, is not what I am after, as it only shows the supplier id and a count of orders...
I'm guessing on parts of the schema, but something like this should work for you:
SELECT s.SupplierID,
s.SupplierName,
100.0 * COUNT(DISTINCT CASE WHEN p.trackingNumber IS NOT NULL THEN so.SalesOrderID END)
/ COUNT(DISTINCT so.SalesOrderID) AS PercentOrdersWithTracking
FROM SalesOrders so
INNER JOIN suppliers s ON s.SupplierID = so.SupplierID
INNER JOIN SupplierSubscriptions ss ON s.SupplierID = ss.SupplierID
-- LEFT JOINs keep orders that have no shipment or no package row
LEFT JOIN shipments ship ON ship.SalesOrderID = so.SalesOrderID
LEFT JOIN Packages p ON p.ShipmentID = ship.ShipmentID
WHERE ss.SubscriptionID = 91
GROUP BY s.SupplierID, s.SupplierName
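One thing to watch with a ratio of two counts: in databases that divide integers as integers (SQL Server does, and so does SQLite, which is used here), the ratio is truncated to 0 unless one operand is made a decimal. The sketch below shows this with Python's sqlite3. The schema is a guess trimmed to three tables, and all column names and rows are invented. The shipment without a package row is kept by using a LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Guessed, simplified schema; supplier 7 has four orders.
    CREATE TABLE SalesOrders (SalesOrderID INTEGER, SupplierID INTEGER);
    CREATE TABLE Shipments   (ShipmentID INTEGER, SalesOrderID INTEGER);
    CREATE TABLE Packages    (ShipmentID INTEGER, TrackingNumber TEXT);
    INSERT INTO SalesOrders VALUES (1, 7), (2, 7), (3, 7), (4, 7);
    INSERT INTO Shipments   VALUES (11, 1), (12, 2), (13, 3), (14, 4);
    -- Shipment 14 has no package row, so order 4 has no tracking number.
    INSERT INTO Packages    VALUES (11, 'T1'), (12, 'T2'), (13, 'T3');
""")

row = conn.execute("""
    SELECT so.SupplierID,
           -- integer / integer: truncated
           COUNT(DISTINCT CASE WHEN p.TrackingNumber IS NOT NULL
                               THEN so.SalesOrderID END)
             / COUNT(DISTINCT so.SalesOrderID) AS int_ratio,
           -- multiplying by 100.0 first forces decimal arithmetic
           100.0 * COUNT(DISTINCT CASE WHEN p.TrackingNumber IS NOT NULL
                                       THEN so.SalesOrderID END)
             / COUNT(DISTINCT so.SalesOrderID) AS pct_with_tracking
    FROM SalesOrders so
    JOIN Shipments sh    ON sh.SalesOrderID = so.SalesOrderID
    LEFT JOIN Packages p ON p.ShipmentID = sh.ShipmentID
    GROUP BY so.SupplierID
""").fetchone()

supplier_id, int_ratio, pct_with_tracking = row
print(supplier_id, int_ratio, pct_with_tracking)
```

Three of the four orders have a tracking number, so the decimal version reports 75.0 while the integer version reports 0.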

Oracle: Using Sub-Queries, JOIN and distinct function together

Here is how I constructed the step-by-step:
M1. create a sub-query that will return CustomerId and total invoiced for that customer
M2. A second subquery that will give a list of distinct ProductIDs (along with product SKUs) and CustomerIDs.
M3. The M1 and M2 subqueries will be joined to make association between customer totals and products (for the same CustomerId).
M4. The query M3 will be fed to the final query that will just find the top 5 products.
I'm stuck on creating the distinct ProductID and customerID because they would have to be in aggregate functions in order to make them distinct.
Attached is an image that is the erwin diagram which helps understand the system.
If you can help me with M1-M4, I will greatly appreciate it. I'm not a programmer by trade but a business analyst.
--M1--
select C.CustomerId, COUNT(I.InvoiceId) TotalNumInvoices
from Customer C
JOIN Invoice I ON (I.CustomerId = C.CustomerId)
group by C.CustomerId
--M2: Incomplete--
select P.ProductID, P.SKU, C.CustomerID
from Product P
JOIN InvoiceLine IL ON (IL.ProductId = P.ProductId)
JOIN Invoice I ON (IL.InvoiceId = I.InvoiceId)
JOIN Customer C ON (C.CustomerId = I.CustomerId)
You can also use the DISTINCT keyword in your select clause to get unique values. Try this for M2:
select DISTINCT p.productID, p.sku, i.customerID
from invoice i INNER JOIN invoiceLine il
ON i.invoiceID = il.invoiceID
JOIN product p
ON il.productID = p.productID;
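A small check of what DISTINCT does in M2, sketched with Python's sqlite3. The rows are invented: customer 1 buys the same product on two invoices, so the plain join repeats that product/customer pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product     (productID INTEGER, sku TEXT);
    CREATE TABLE invoice     (invoiceID INTEGER, customerID INTEGER);
    CREATE TABLE invoiceLine (invoiceID INTEGER, productID INTEGER);
    INSERT INTO product     VALUES (5, 'SKU-5');
    -- Customer 1 has two invoices, customer 2 has one.
    INSERT INTO invoice     VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO invoiceLine VALUES (1, 5), (2, 5), (3, 5);
""")

joins = """
    FROM invoice i
    INNER JOIN invoiceLine il ON i.invoiceID = il.invoiceID
    JOIN product p ON il.productID = p.productID
"""

# Without DISTINCT: one row per invoice line.
plain_rows = conn.execute(
    "SELECT p.productID, p.sku, i.customerID " + joins
    + " ORDER BY i.customerID"
).fetchall()

# With DISTINCT: one row per product/customer pair.
distinct_rows = conn.execute(
    "SELECT DISTINCT p.productID, p.sku, i.customerID " + joins
    + " ORDER BY i.customerID"
).fetchall()

print(len(plain_rows), distinct_rows)
```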

Count of Detail record for a Master table

I have a master table and a detail table (like Categories and Products) in SQL Server, and some of the categories do not have products.
I want to count the products of each category, and my WHERE condition is like this: ProductID=100.
In the result I want 0 next to the categories that have no products and the product count next to the others. The result must be only for ProductID=100, and the number of result rows must equal the number of Categories records. I want to create a view and run this query each time:
select * from -ViewName where ProductID=@newProductID
This could be done fairly simply in a query that doesn't use views - it would be something like:
select c.CategoryName, count(p.ProductID) as ProductCount
from Category c
left join Product p
on c.CategoryID = p.CategoryID and p.ProductID = 100
group by c.CategoryName
Note that the condition on ProductID has to be part of the join criteria, not in the where clause, otherwise the query will only return categories that include the specified product.
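The difference between the two placements of the ProductID condition shows up clearly on a few rows. This is a sketch using Python's sqlite3 with three invented categories and two invented products. Grouping by CategoryID as well as CategoryName is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE Product  (ProductID INTEGER PRIMARY KEY, CategoryID INTEGER);
    INSERT INTO Category VALUES (1, 'Beverages'), (2, 'Snacks'), (3, 'Empty');
    INSERT INTO Product  VALUES (100, 1), (101, 2);
""")

# Condition in ON: every category is returned, with 0 where nothing matches.
on_rows = conn.execute("""
    SELECT c.CategoryName, COUNT(p.ProductID)
    FROM Category c
    LEFT JOIN Product p
        ON c.CategoryID = p.CategoryID AND p.ProductID = 100
    GROUP BY c.CategoryID, c.CategoryName
    ORDER BY c.CategoryID
""").fetchall()

# Condition in WHERE: categories without product 100 are filtered out.
where_rows = conn.execute("""
    SELECT c.CategoryName, COUNT(p.ProductID)
    FROM Category c
    LEFT JOIN Product p ON c.CategoryID = p.CategoryID
    WHERE p.ProductID = 100
    GROUP BY c.CategoryID, c.CategoryName
    ORDER BY c.CategoryID
""").fetchall()

print(on_rows)
print(where_rows)
```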
This could be done fairly inefficiently in a view, by using a cross join - something like:
create view vwCategoryProduct as
select c.CategoryName,
p.ProductID,
case when c.CategoryID = p.CategoryID then 1 else 0 end as ProductIncluded
from Category c
cross join Product p
- and then selecting from the view like so:
select * from vwCategoryProduct where ProductID = 100
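Here is the cross-join view in action, sketched with Python's sqlite3 on invented data (three categories, two products). It also shows why the approach is inefficient: the view holds one row for every category/product combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE Product  (ProductID INTEGER PRIMARY KEY, CategoryID INTEGER);
    INSERT INTO Category VALUES (1, 'Beverages'), (2, 'Snacks'), (3, 'Empty');
    INSERT INTO Product  VALUES (100, 1), (101, 2);

    CREATE VIEW vwCategoryProduct AS
    SELECT c.CategoryName,
           p.ProductID,
           CASE WHEN c.CategoryID = p.CategoryID THEN 1 ELSE 0 END
               AS ProductIncluded
    FROM Category c
    CROSS JOIN Product p;
""")

# One row per category for the requested product, with 1 or 0 as the count.
rows_for_100 = conn.execute("""
    SELECT * FROM vwCategoryProduct
    WHERE ProductID = ?
    ORDER BY CategoryName
""", (100,)).fetchall()

# The unfiltered view has categories x products rows.
view_size = conn.execute(
    "SELECT COUNT(*) FROM vwCategoryProduct"
).fetchone()[0]

print(rows_for_100)
print(view_size)
```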
Not sure I get you on all of that, but something like
Select Categories.Category_Name, Count(Products.ProductID)
From Categories
Left Outer Join Products On Products.CategoryID = Categories.CategoryID And Products.ProductID = 100
Group By Categories.Category_Name
should get you on your way...
Try it this way:
select c.CategoryName, count(p.ProductID) as 'Number of Products'
from Categories c
left outer join Products p on c.CategoryID = p.CategoryID and p.ProductID = 100
group by c.CategoryName