SQL make a list of not existing objects - sql

I have the following problem.
There are two tables: a Product table and an Inventory table.
The Product table lists all possible products, and the Inventory table holds the current stock of each store,
e.g.
Inventory Table
ProductID Stock StoreID
1 10 1
2 10 1
3 10 1
1 10 2
Product Table
ProductID Product
1 Bananas
2 Apples
3 Oranges
4 Kiwi
What I want is a list of products that are not in stock for the stores.
Following the example the desired result would be
Store ProductID Product
1 4 Kiwi
2 2 Apples
2 3 Oranges
2 4 Kiwi
I have tried several approaches (LEFT JOIN, NOT IN and NOT EXISTS) but haven't found a solution.
E.g.
SELECT *
FROM Inventory t1
left join Product t2 ON t2.ProductID = t1.ProductID
WHERE t2.ProductID IS NULL
But this returns nothing
Any help please.
Thank you

Solution to your problem:
FOR MySQL as well as MSSQL
SELECT i.storeId,p.ProductId, p.product
FROM Product p
CROSS JOIN (
SELECT Distinct storeId
FROM Inventory) i
LEFT JOIN Inventory iv
ON p.productID = iv.productId AND i.storeId = iv.storeId
WHERE iv.storeId IS NULL;
OUTPUT:
storeId ProductId product
1 4 Kiwi
2 2 Apples
2 3 Oranges
2 4 Kiwi
Follow the link to the demo:
http://sqlfiddle.com/#!9/71a854/9
CROSS JOIN:
A CROSS JOIN produces a result set that is the product of the rows of the two joined tables when no WHERE clause is used.
Each row of the first table is combined with every row of the second table if no condition is introduced with the CROSS JOIN.
This kind of result is called a Cartesian product.
Source:https://www.w3resource.com/mysql/advance-query-in-mysql/mysql-cross-join.php
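The CROSS JOIN plus LEFT JOIN approach above can be checked end to end with a small script. This is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL or MSSQL; the table layouts are inferred from the question's sample data.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Product TEXT);
    CREATE TABLE Inventory (ProductID INTEGER, Stock INTEGER, StoreID INTEGER);
    INSERT INTO Product VALUES (1,'Bananas'),(2,'Apples'),(3,'Oranges'),(4,'Kiwi');
    INSERT INTO Inventory VALUES (1,10,1),(2,10,1),(3,10,1),(1,10,2);
""")

# Step 1: CROSS JOIN builds every (store, product) combination.
# Step 2: LEFT JOIN back to Inventory; combinations with no match have NULLs.
# Step 3: WHERE ... IS NULL keeps only the combinations that are missing.
missing = conn.execute("""
    SELECT s.StoreID, p.ProductID, p.Product
    FROM Product p
    CROSS JOIN (SELECT DISTINCT StoreID FROM Inventory) s
    LEFT JOIN Inventory iv
           ON iv.ProductID = p.ProductID AND iv.StoreID = s.StoreID
    WHERE iv.StoreID IS NULL
    ORDER BY s.StoreID, p.ProductID
""").fetchall()

for row in missing:
    print(row)
```

The rows printed should match the OUTPUT table above. Note that stores are derived from Inventory here, so a store with no inventory rows at all would not appear; a separate store table would be needed for that case.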

Your approach should work, although selecting columns from only the first table makes sense. And the first table should be the product table:
SELECT p.*
FROM Product p LEFT JOIN
Inventory i
ON i.ProductID = p.ProductID
WHERE i.ProductID IS NULL;
Your version returns all inventory products that are not in the Product table. That should not be possible, if your data is correctly formatted.
Another way to write the query uses NOT EXISTS:
select p.*
from product p
where not exists (select 1 from inventory i where i.productid = p.productid);
This version (which should have very good performance if you have an index on inventory(productid)) comes closer to how you expressed the question.
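To see what the NOT EXISTS query returns on the question's sample data, here is a small sketch using SQLite via Python's sqlite3 module (an assumption; the thread does not name an engine for this answer). The second query is a hypothetical per-store variant, not part of the answer above, which adds the store to the correlation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Product TEXT);
    CREATE TABLE Inventory (ProductID INTEGER, Stock INTEGER, StoreID INTEGER);
    INSERT INTO Product VALUES (1,'Bananas'),(2,'Apples'),(3,'Oranges'),(4,'Kiwi');
    INSERT INTO Inventory VALUES (1,10,1),(2,10,1),(3,10,1),(1,10,2);
""")

# Products that do not appear in Inventory for any store.
never_stocked = conn.execute("""
    SELECT p.ProductID, p.Product
    FROM Product p
    WHERE NOT EXISTS (SELECT 1 FROM Inventory i WHERE i.ProductID = p.ProductID)
""").fetchall()

# Hypothetical per-store variant: correlate on both product and store.
per_store = conn.execute("""
    SELECT s.StoreID, p.ProductID, p.Product
    FROM Product p
    CROSS JOIN (SELECT DISTINCT StoreID FROM Inventory) s
    WHERE NOT EXISTS (SELECT 1
                      FROM Inventory i
                      WHERE i.ProductID = p.ProductID
                        AND i.StoreID = s.StoreID)
    ORDER BY s.StoreID, p.ProductID
""").fetchall()

print(never_stocked)
print(per_store)
```

On this data the first query finds only Kiwi, because every other product is stocked somewhere; the per-store list asked for in the question needs the store in the comparison as well.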

Related

Postgresql SUM with filter group column

I have this relational data model
product_table
ID type configurable_product_id
1 CONFIGURABLE null
2 SIMPLE 1
3 SIMPLE 1
4 SIMPLE 1
product_source
ID product_id quantity
1 2 50
2 3 20
3 4 10
A table that contains the list of products, with two types: configurable and simple. The simple products are linked to a configurable product.
A product_source table that contains the quantities of the different products.
I would like to make a query to retrieve all the quantities of the products, in this way:
If configurable product, then the quantity is the sum of the quantities of the simple products
If simple product, then it is the quantity of the simple product only.
Here is the expected result with the above data:
result
product_id quantity
1 80
2 50
3 20
4 10
Do you have any idea how to proceed ?
For the moment, I have thought about this type of request, but I don't know how to complete the 'CONFIGURABLE' case
SELECT
pr.id,
pr.type,
pr.configurable_product_id,
CASE
WHEN (pr.type = 'SIMPLE') THEN ps.quantity
WHEN (pr.type = 'CONFIGURABLE') THEN (????)
END AS quantity
FROM public."product" as pr
LEFT JOIN public."product_source" as ps
ON ps.product_id = pr.id
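You can use a window function to calculate the sum of the quantities (much like a GROUP BY). Note that SUM(...) OVER () with an empty window sums over every row of the result, so this gives the right total only while there is a single configurable product: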
SELECT
pr.id AS product_id,
CASE
WHEN (pr.type = 'SIMPLE') THEN ps.quantity
WHEN (pr.type = 'CONFIGURABLE') THEN SUM(ps.quantity) over ()
END AS quantity
FROM public."product" as pr
LEFT JOIN public."product_source" as ps
ON ps.product_id = pr.id
ORDER BY pr.id
Your (...) is:
(
SELECT SUM(ps2.quantity)
FROM product as pr2
LEFT JOIN product_source as ps2
ON ps2.product_id = pr2.id
WHERE pr2.configurable_product_id = pr.id
)
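The correlated subquery can be dropped into the CASE expression from the question and tried on the sample data. This is a sketch that assumes SQLite (via Python's sqlite3 module) in place of PostgreSQL, and table names product and product_source as used in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, type TEXT,
                          configurable_product_id INTEGER);
    CREATE TABLE product_source (id INTEGER PRIMARY KEY, product_id INTEGER,
                                 quantity INTEGER);
    INSERT INTO product VALUES
        (1,'CONFIGURABLE',NULL),(2,'SIMPLE',1),(3,'SIMPLE',1),(4,'SIMPLE',1);
    INSERT INTO product_source VALUES (1,2,50),(2,3,20),(3,4,10);
""")

# SIMPLE rows take their own quantity; CONFIGURABLE rows sum the
# quantities of the simple products that point at them.
quantities = conn.execute("""
    SELECT pr.id AS product_id,
           CASE
               WHEN pr.type = 'SIMPLE' THEN ps.quantity
               WHEN pr.type = 'CONFIGURABLE' THEN (
                   SELECT SUM(ps2.quantity)
                   FROM product AS pr2
                   LEFT JOIN product_source AS ps2 ON ps2.product_id = pr2.id
                   WHERE pr2.configurable_product_id = pr.id)
           END AS quantity
    FROM product AS pr
    LEFT JOIN product_source AS ps ON ps.product_id = pr.id
    ORDER BY pr.id
""").fetchall()

for row in quantities:
    print(row)
```

Because the subquery is correlated on pr.id, each configurable product only sums its own children, which is what distinguishes it from a sum over the whole result set.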

SQL Server: query multiple tables to create a pre-formated "template" csv for export into Excel

I am trying to create the SQL necessary to accomplish the following. As an aside, I am using server side Report Builder against a hosted SQL Server database so am limited in what I can do. There are 3 tables. A salesperson table (sales), an items table (items), and a transaction table (tx).
Here is an example:
Sales Person table (sales)
Person A
Person B
Items ID table (items)
10
20
30
100
200
300
1000
2000
3000
Transaction table (tx)
100 (person A)
300 (person B)
300 (person A)
200 (person B)
Desired result in Report Builder:
Person A
Item 100: 1
Item 200: 0 (NULL)
Item 300: 1
-- NEW PAGE --
Person B:
Item 100: 0 (NULL)
Item 200: 1
Item 300: 1
My problem: here is the SQL I came up with. I need to be able to generate a consistent result set, regardless of whether an item was sold or not by a particular salesperson for easier import into Excel. In addition, I am only looking for items whose code is between 100 and 300 and within a specified date range. My SQL is ignoring the date range and item code range. I originally had these instructions in a WHERE clause but it returned only those lines that were in both tables and I lost the placeholder for any itemcode where the value was null (acting as an INNER join). Within Report Builder I will be counting how many of each items were sold by salesperson.
SELECT
tx.date, sales.salesperson, items.itemcode
FROM
tx
LEFT OUTER JOIN
itemcode ON (tx.itemcode = items.itemcode)
AND (date BETWEEN "10/1/2017" AND "12/31/2017")
AND (itemcode BETWEEN "100" AND "300")
INNER JOIN
sales ON (tx.salesID = sales.salesID)
ORDER BY
itemcode ASC
Many thanks for any and all insight into my challenge!
If you want all sales people and all items, then you can generate the rows using a cross join. You can bring in the available data using left join or exists:
select s.person, i.itemcode,
(case when exists (select 1
from tx
where tx.salesid = s.salesid and tx.itemcode = i.itemcode
)
then 1 else 0
end) as has_sold
from sales s cross join
items i
where i.itemcode between 100 and 300
order by s.salesid, i.itemcode;
If you want the count of items, use left join and group by:
select s.person, i.itemcode, count(tx.salesid) as num_sold
from sales s cross join
items i left join
tx
on tx.salesid = s.salesid and tx.itemcode = i.itemcode
where i.itemcode between 100 and 300
group by s.salesid, s.person, i.itemcode
order by s.salesid, i.itemcode;
Here's an example that uses both cross join (to get all combinations of sales and items) and left join (to get the transactions for given dates)
SELECT
tx.date, sales.salesperson, items.itemcode
FROM
Items
CROSS JOIN
sales
LEFT OUTER JOIN
tx ON (items.itemcode = tx.itemcode)
AND date BETWEEN '10/1/2017' AND '12/31/2017'
AND (tx.salesID = sales.salesID)
WHERE
(items.itemcode BETWEEN '100' AND '300')
ORDER BY
itemcode ASC
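The pattern in these answers (CROSS JOIN for the full grid, LEFT JOIN for the transactions, filters on tx kept in the ON clause) can be sketched as follows. It assumes SQLite via Python's sqlite3 module instead of SQL Server; the column layouts, the ISO-formatted transaction dates and the extra January 2018 row are hypothetical, added only to show the date filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (salesid INTEGER PRIMARY KEY, person TEXT);
    CREATE TABLE items (itemcode INTEGER PRIMARY KEY);
    CREATE TABLE tx (salesid INTEGER, itemcode INTEGER, date TEXT);
    INSERT INTO sales VALUES (1,'Person A'),(2,'Person B');
    INSERT INTO items VALUES (10),(20),(30),(100),(200),(300),(1000),(2000),(3000);
    -- Dates are made up; the last row falls outside the reporting window.
    INSERT INTO tx VALUES
        (1,100,'2017-10-05'),(2,300,'2017-11-01'),
        (1,300,'2017-11-15'),(2,200,'2017-12-20'),
        (1,200,'2018-01-03');
""")

# The date range lives in the ON clause, so unmatched (person, item)
# pairs survive as rows with a count of 0 instead of being filtered out.
report = conn.execute("""
    SELECT s.person, i.itemcode, COUNT(tx.salesid) AS num_sold
    FROM sales s
    CROSS JOIN items i
    LEFT JOIN tx
           ON tx.salesid = s.salesid
          AND tx.itemcode = i.itemcode
          AND tx.date BETWEEN '2017-10-01' AND '2017-12-31'
    WHERE i.itemcode BETWEEN 100 AND 300
    GROUP BY s.salesid, s.person, i.itemcode
    ORDER BY s.salesid, i.itemcode
""").fetchall()

for row in report:
    print(row)
```

Moving the date condition into a WHERE clause would discard the rows where tx is NULL, which is the INNER JOIN behaviour described in the question.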

Optimized query for a subquery in sql

I made a query to get the inventory of products as follows:
Select
b.ProductID, c.ProductName,
(Select
Case
When SUM(Qty) IS NULL
then 0
else SUM(Qty)
end
from
InvoiceDetails
where
ProductID = b.ProductID) as Sold,
(Select
Case
When SUM(QtyReceive) IS NULL
then 0
else SUM(QtyReceive)
end
from
PurchaseOrderDetails
where
ProductID = b.ProductID) as Stocks,
((Select
Case
When SUM(QtyReceive) IS NULL
then 0
else SUM(QtyReceive)
end
from
PurchaseOrderDetails
where
ProductID = b.ProductID) -
(Select
Case
When SUM(Qty) IS NULL
then 0
else SUM(Qty)
end
from
InvoiceDetails
where
ProductID = b.ProductID)) as RemainingStock
from
InvoiceDetails a
Right join
PurchaseOrderDetails b on a.ProductID = b.ProductID
Inner join
Products c on b.ProductID = c.ProductID
Group By
b.ProductID, c.ProductName
This query returns the data that I want, and it runs fine in my desktop, but when I deploy the application that runs this query on a lower specs laptop, it is really slow and causes the laptop to hang. I need some help on how to optimize the query or maybe change it to make it more efficient... thanks in advance
This is the data in my InvoiceDetails table
Data From my PurchaseOrderDetails table
Data from Products table
So I've taken out your subqueries in the select, I don't think these were necessary at all. I've also moved around your joins and given better aliases to the tables;
SELECT
pod.ProductID,
pr.ProductName,
ISNULL(SUM(id.Qty),0) as Sold,
ISNULL(SUM(pod.QtyReceive),0) as Stocks,
ISNULL(SUM(pod.QtyReceive),0) - ISNULL(SUM(id.Qty),0) as RemainingStock
FROM PurchaseOrderDetails pod
INNER JOIN Products pr
ON pr.ProductID = pod.ProductID
LEFT JOIN InvoiceDetails id
ON id.ProductID = pod.ProductID
GROUP BY
pod.ProductID, pr.ProductName
You were already joining those two tables so you don't need subqueries in the select at all. I've also wrapped the SUM in ISNULL to ensure there are no NULL errors.
I'd suggest using the SET STATISTICS TIME,IO ON at the beginning of your code (with an OFF command at the end). Then copy all of the text from your 'messages' tab into statisticsparser.com. Do this for both queries and compare, check the total CPU time and the logical reads, you want these both lower for better performance. I'm betting your logical reads will drop significantly with this new query.
EDIT
OK, I've put together a new query based upon your sample data. I've only used the fields that we actually need for this query so that it's simpler for this example.
Sample Data
CREATE TABLE #InvoiceDetails (ProductID int, Qty int)
INSERT INTO #InvoiceDetails (ProductID,Qty)
VALUES (3,50),(1,0),(2,1),(1,12),(2,1),(3,1),(1,1),(2,1),(1,1),(2,1)
CREATE TABLE #PurchaseOrderDetails (ProductID int, Qty int)
INSERT INTO #PurchaseOrderDetails (ProductID, Qty)
VALUES (1,100),(2,20),(4,10),(1,12),(5,12),(4,12),(3,12),(2,20),(3,20),(4,20),(5,20)
CREATE TABLE #Products (ProductID int, ProductName varchar(20))
INSERT INTO #Products (ProductID, ProductName)
VALUES (1,'Sample Product'),(2,'DYE INK CYAN'),(3,'test Product 1'),(4,'test Product 2'),(5,'test Product 3'),(1004,'TESTING PRODUCT')
For this, here is the output of your original query
ProductID ProductName Sold Stocks RemainingStock
1 Sample Product 14 112 98
2 DYE INK CYAN 4 40 36
3 test Product 1 51 32 -19
4 test Product 2 0 42 42
5 test Product 3 0 32 32
This is the re-written query that I've used. Note, there are no subqueries within the SELECT statement, they're within the joins as they should be. Also see that as we're aggregating in the subqueries we don't need to do this in the outer query too.
SELECT
pod.ProductID,
pr.ProductName,
ISNULL(id.Qty,0) as Sold,
ISNULL(pod.Qty,0) as Stocks,
ISNULL(pod.Qty,0) - ISNULL(id.Qty,0) as RemainingStock
FROM #Products pr
INNER JOIN (SELECT ProductID, SUM(Qty) Qty FROM #PurchaseOrderDetails GROUP BY ProductID) pod
ON pr.ProductID = pod.ProductID
LEFT JOIN (SELECT ProductID, SUM(Qty) Qty FROM #InvoiceDetails GROUP BY ProductID) id
ON id.ProductID = pr.ProductID
And this is the new output
ProductID ProductName Sold Stocks RemainingStock
1 Sample Product 14 112 98
2 DYE INK CYAN 4 40 36
3 test Product 1 51 32 -19
4 test Product 2 0 42 42
5 test Product 3 0 32 32
Which matches your original query.
I'd suggest trying this query on your machines and seeing which performs better, try the STATISTICS TIME,IO command I mentioned previously.
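The reason for aggregating inside the joined subqueries can be seen by running both shapes on the sample data. This is a sketch under assumptions: SQLite via Python's sqlite3 module stands in for SQL Server, ordinary tables replace the # temp tables, and COALESCE replaces ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InvoiceDetails (ProductID INTEGER, Qty INTEGER);
    INSERT INTO InvoiceDetails VALUES
        (3,50),(1,0),(2,1),(1,12),(2,1),(3,1),(1,1),(2,1),(1,1),(2,1);
    CREATE TABLE PurchaseOrderDetails (ProductID INTEGER, Qty INTEGER);
    INSERT INTO PurchaseOrderDetails VALUES
        (1,100),(2,20),(4,10),(1,12),(5,12),(4,12),(3,12),(2,20),(3,20),(4,20),(5,20);
    CREATE TABLE Products (ProductID INTEGER, ProductName TEXT);
    INSERT INTO Products VALUES
        (1,'Sample Product'),(2,'DYE INK CYAN'),(3,'test Product 1'),
        (4,'test Product 2'),(5,'test Product 3'),(1004,'TESTING PRODUCT');
""")

# Joining the two detail tables directly multiplies rows per product,
# so both sums are inflated.
naive = conn.execute("""
    SELECT pod.ProductID, COALESCE(SUM(inv.Qty),0), COALESCE(SUM(pod.Qty),0)
    FROM PurchaseOrderDetails pod
    LEFT JOIN InvoiceDetails inv ON inv.ProductID = pod.ProductID
    GROUP BY pod.ProductID
    ORDER BY pod.ProductID
""").fetchall()

# Aggregating each detail table first leaves one row per product to join.
stock = conn.execute("""
    SELECT pr.ProductID, pr.ProductName,
           COALESCE(inv.Qty,0) AS Sold,
           COALESCE(pod.Qty,0) AS Stocks,
           COALESCE(pod.Qty,0) - COALESCE(inv.Qty,0) AS RemainingStock
    FROM Products pr
    INNER JOIN (SELECT ProductID, SUM(Qty) AS Qty
                FROM PurchaseOrderDetails GROUP BY ProductID) pod
            ON pod.ProductID = pr.ProductID
    LEFT JOIN (SELECT ProductID, SUM(Qty) AS Qty
               FROM InvoiceDetails GROUP BY ProductID) inv
           ON inv.ProductID = pr.ProductID
    ORDER BY pr.ProductID
""").fetchall()

print(naive[0])
for row in stock:
    print(row)
```

For product 1 the naive join pairs 2 purchase rows with 4 invoice rows, so its sums are counted several times over, while the pre-aggregated query reproduces the output table shown above.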
Since you already group by b.ProductID and c.ProductName, you can use aggregate functions directly to calculate the totals.
Also create indexes on your tables to improve performance.
Select
b.ProductID, c.ProductName,
SUM(isnull(a.Qty,0)) as Sold,
SUM(b.QtyReceive) as Stocks,
SUM(b.QtyReceive) - SUM(isnull(a.Qty,0)) as RemainingStock
from
PurchaseOrderDetails b
LEFT JOIN InvoiceDetails a on a.ProductID = b.ProductID
INNER JOIN Products c on b.ProductID = c.ProductID
Group By
b.ProductID, c.ProductName
Can you try this? (I wrote without testing, as you didn't post sample data nor create table). Please check it and use as a starting point. Compare results from your query and this and compare execution plan. Analysis of performances requires "some" knowledge of Sql and ability to consider several things (eg. how many rows, are there indexes, using of execution plan and statistics, etc.)
SELECT C.PRODUCTID
,C.PRODUCTNAME
,COALESCE(D.QTY_SOLD,0) AS QTY_SOLD
,COALESCE(E.QTY_STOCKS,0) AS QTY_STOCKS
,COALESCE(E.QTY_STOCKS,0)-COALESCE(D.QTY_SOLD,0) AS REMAININGSTOCK
FROM PRODUCTS C
LEFT JOIN (SELECT PRODUCTID, SUM(QTY) AS QTY_SOLD
FROM INVOICEDETAILS
GROUP BY PRODUCTID
) D ON C.PRODUCTID = D.PRODUCTID
LEFT JOIN (SELECT PRODUCTID,SUM(QTYRECEIVE) AS QTY_STOCKS
FROM PURCHASEORDERDETAILS
GROUP BY PRODUCTID
) E ON C.PRODUCTID = E.PRODUCTID
Looking to your query, I think this could be equivalent (or at least I hope it is):
Select
b.ProductID
, c.ProductName
, Case When SUM(a.Qty) IS NULL then 0 else SUM(a.Qty) end as sold
, Case When SUM(b.QtyReceive) IS NULL then 0 else SUM(b.QtyReceive) end as Stock
, Case When SUM(isnull(b.QtyReceive,0) - isnull(a.Qty,0)) IS NULL
then 0
else SUM(isnull(b.QtyReceive,0) - isnull(a.Qty,0)) end as RemainingStock
from Products c
left join InvoiceDetails a on c.ProductID = a.ProductID
left join PurchaseOrderDetails b on c.ProductID = b.ProductID
Group By b.ProductID,c.ProductName

SQL - Max Vs Inner Join

I have a question on which is a better method in terms of speed.
I have a database with 2 tables that looks like this:
Table2
UniqueID Price
1 100
2 200
3 300
4 400
5 500
Table1
UniqueID User
1 Tom
2 Tom
3 Jerry
4 Jerry
5 Jerry
I would like to get the max price for each user, and I am now faced with 2 choices:
Use MAX, or use the INNER JOIN approach suggested in the following post: Getting max value from rows and joining to another table
Which method is more efficient?
The answer to your question is to try both methods, and see which performs faster on your data in your environment. Unless you have a large amount of data, the difference is probably not important.
In this case, the traditional method of group by is probably better:
select u.user, max(p.price)
from table1 u join
table2 p
on u.uniqueid = p.uniqueid
group by u.user;
For such a query, you want an index on table2(uniqueid, price), and perhaps on table1(uniqueid, user) as well. This depends on the database engine.
Instead of a join, I would suggest not exists:
select u.user, p.price
from table1 u join
table2 p
on u.uniqueid = p.uniqueid
where not exists (select 1
from table1 u2 join
table2 p2
on u2.uniqueid = p2.uniqueid
where u2.user = u.user and p2.price > p.price
);
Do note that these do not do exactly the same things. The first will return one row per user, no matter what. This version can return multiple rows, if there are multiple rows with the same price. On the other hand, it can return other columns from the rows with the maximum price, which is convenient.
Because your data structure requires a join in the subquery, I think you should stick with the group by approach.
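Both approaches can be compared on the sample tables. A minimal sketch, assuming SQLite via Python's sqlite3 module; the NOT EXISTS form here correlates the subquery on the user, which a per-user maximum requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (UniqueID INTEGER PRIMARY KEY, User TEXT);
    CREATE TABLE Table2 (UniqueID INTEGER PRIMARY KEY, Price INTEGER);
    INSERT INTO Table1 VALUES (1,'Tom'),(2,'Tom'),(3,'Jerry'),(4,'Jerry'),(5,'Jerry');
    INSERT INTO Table2 VALUES (1,100),(2,200),(3,300),(4,400),(5,500);
""")

# GROUP BY: exactly one row per user.
by_group = conn.execute("""
    SELECT u.User, MAX(p.Price)
    FROM Table1 u
    JOIN Table2 p ON u.UniqueID = p.UniqueID
    GROUP BY u.User
    ORDER BY u.User
""").fetchall()

# NOT EXISTS: keep a row when no row of the same user has a higher price.
# Ties would produce more than one row per user.
by_not_exists = conn.execute("""
    SELECT u.User, p.Price
    FROM Table1 u
    JOIN Table2 p ON u.UniqueID = p.UniqueID
    WHERE NOT EXISTS (SELECT 1
                      FROM Table1 u2
                      JOIN Table2 p2 ON u2.UniqueID = p2.UniqueID
                      WHERE u2.User = u.User
                        AND p2.Price > p.Price)
    ORDER BY u.User
""").fetchall()

print(by_group)
print(by_not_exists)
```

On this data the two queries agree; which one is faster depends on the engine, the indexes and the data volume, so timing both on real data remains the reliable test.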

SQL query on two tables - return rows in one table that don't have entries in the other

I have two database tables, Categories and SuperCategories, for an inventory control system I'm working on:
Categories: ID_Category, CategoryName
SuperCategories: ID_SuperCategory, CategoryID, SuperCategoryID
I'm putting category-subcategory relationships into the SuperCategories table. I'm putting all categories into the Categories table.
Here is an example:
Categories:
ID_Category CategoryName
1 Box
2 Red Box
3 Blue Box
4 Blue Plastic Box
5 Can
6 Tin Can
SuperCategories:
ID_Super CategoryID SuperCategoryID
1 2 1
2 3 1
3 4 3
4 6 5
CategoryID and SuperCategoryID relate back to the primary key ID_Category in the Categories table.
What I would like is a query that returns all of the category names that are not parents of any other categories:
Red Box
Blue Plastic Box
Tin Can
This amounts to finding all values of ID_Category that do not show up in the SuperCategoryID column (2, 4, and 6), but I'm having trouble writing the SQL.
I'm using VB6 to query an Access 2000 database.
Any help is appreciated. Thanks!
EDIT: I voted up everyone's answer that gave me something that worked. I accepted the answer that I felt was the most instructive. Thanks again for your help!
Mike Pone's answer works, because he joins the "Categories" table with the "SuperCategories" table as a "LEFT OUTER JOIN" - this will take all entries from "Categories" and add columns from "SuperCategories" to those where the link exists - where it does not exist (e.g. where there is no entry in "SuperCategories"), you'll get NULLs for the SuperCategories columns - and that's exactly what Mike's query then checks for.
If you would write the query like so:
SELECT c.CategoryName, s.ID_Super
FROM Categories c
LEFT OUTER JOIN SuperCategories s ON c.ID_Category = s.SuperCategoryID
you would get something like this:
CategoryName ID_Super
Box 1
Box 2
Red Box NULL
Blue Box 3
Blue Plastic Box NULL
Can 4
Tin Can NULL
So this basically gives you your answer - all the rows where the ID_Super on the LEFT OUTER JOIN is NULL are those who don't have any entries in the SuperCategories table. All clear? :-)
Marc
SELECT
CAT.ID_Category,
CAT.CategoryName
FROM
Categories CAT
WHERE
NOT EXISTS
(
SELECT
*
FROM
SuperCategories SC
WHERE
SC.SuperCategoryID = CAT.ID_Category
)
Or
SELECT
CAT.ID_Category,
CAT.CategoryName
FROM
Categories CAT
LEFT OUTER JOIN SuperCategories SC ON
SC.SuperCategoryID = CAT.ID_Category
WHERE
SC.ID_Super IS NULL
I'll also make the suggestion that your naming standards could probably use some work. They seem all over the place and difficult to work with.
Include only those categories that are not super categories. A simple outer join does this:
select CategoryName from Categories LEFT OUTER JOIN
SuperCategories ON Categories.ID_Category =SuperCategories.SuperCategoryID
WHERE SuperCategories.SuperCategoryID is null
Not sure if the syntax will work for Access, but something like this would work:
select CategoryName from Categories
where ID_Category not in (
select SuperCategoryID
from SuperCategories
)
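The three formulations suggested in these answers (LEFT OUTER JOIN with an IS NULL check, NOT EXISTS, and NOT IN) can be run side by side on the example data. This sketch assumes SQLite via Python's sqlite3 module, not Access 2000, so syntax details may differ there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (ID_Category INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE SuperCategories (ID_Super INTEGER PRIMARY KEY,
                                  CategoryID INTEGER, SuperCategoryID INTEGER);
    INSERT INTO Categories VALUES
        (1,'Box'),(2,'Red Box'),(3,'Blue Box'),
        (4,'Blue Plastic Box'),(5,'Can'),(6,'Tin Can');
    INSERT INTO SuperCategories VALUES (1,2,1),(2,3,1),(3,4,3),(4,6,5);
""")

def names(sql):
    """Run a query and return the category names as a plain list."""
    return [row[0] for row in conn.execute(sql)]

via_left_join = names("""
    SELECT c.CategoryName
    FROM Categories c
    LEFT OUTER JOIN SuperCategories s ON s.SuperCategoryID = c.ID_Category
    WHERE s.ID_Super IS NULL
    ORDER BY c.ID_Category
""")

via_not_exists = names("""
    SELECT c.CategoryName
    FROM Categories c
    WHERE NOT EXISTS (SELECT 1 FROM SuperCategories s
                      WHERE s.SuperCategoryID = c.ID_Category)
    ORDER BY c.ID_Category
""")

via_not_in = names("""
    SELECT c.CategoryName
    FROM Categories c
    WHERE c.ID_Category NOT IN (SELECT SuperCategoryID FROM SuperCategories)
    ORDER BY c.ID_Category
""")

print(via_left_join)
```

All three return the same names here. One difference worth knowing: if SuperCategoryID could contain a NULL, the NOT IN form would return no rows at all, while the other two would be unaffected.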
I always take the outer join approach as marc_s suggests. There is a lot of power when using OUTER JOINS. Often times I'll have to do a FULL OUTER JOIN to check data on both sides of the query.
You should also look at the ISNULL function, if you are doing a query where data can be in either table A or table B then I will use the ISNULL function to return a value from either column.
Here's an example
SELECT
isNull(a.[date_time],b.[date_time]) as [Time Stamp]
,isnull(a.[ip],b.[ip]) as [Device Address]
,isnull(a.[total_messages],0) as [Local Messages]
,isnull(b.[total_messages],0) as [Remote Messages]
FROM [Local_FW_Logs] a
FULL OUTER JOIN [Remote_FW_Logs] b
on b.ip = a.ip
I have two tables, interface_category and interface_subcategory.
Interface_subcategory contains SubcategoryID, CategoryID, Name (SubcategoryName)
Interface_category contains CategoryID, Name (CategoryName)
Now I want to output the CategoryID and Name (subcategory name).
The query I wrote is below, and it works for me:
select ic.CategoryID, ic.Name CategoryName, ISC.SubCategoryID, ISC.Name SubCategoryName from Interface_Category IC
inner join Interface_SubCategory ISC
on ISC.CategoryID = ic.CategoryID
order by ic.CategoryID, isc.SubCategoryID