PostgreSQL SUM with filter group column - sql

I have this relational data model
product_table
ID  type          configurable_product_id
1   CONFIGURABLE  null
2   SIMPLE        1
3   SIMPLE        1
4   SIMPLE        1
product_source
ID  product_id  quantity
1   2           50
2   3           20
3   4           10
A table that contains the list of products, with two types: configurable and simple. The simple products are linked to a configurable product.
A product_source table that contains the quantities of the different products.
I would like to make a query to retrieve all the quantities of the products, in this way:
If configurable product, then the quantity is the sum of the quantities of the simple products
If simple product, then it is the quantity of the simple product only.
Here is the expected result with the above data:
result
product_id  quantity
1           80
2           50
3           20
4           10
Do you have any idea how to proceed?
So far I have come up with the following query, but I don't know how to complete the 'CONFIGURABLE' case:
SELECT
pr.id,
pr.type,
pr.configurable_product_id,
CASE
WHEN (pr.type = 'SIMPLE') THEN ps.quantity
WHEN (pr.type = 'CONFIGURABLE') THEN (????)
END AS quantity
FROM public."product" as pr
LEFT JOIN public."product_source" as ps
ON ps.product_id = pr.id

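You can use a window function with PARTITION BY to calculate the sum of quantity per configurable product (it works like GROUP BY, but keeps every row). Partitioning by COALESCE(pr.configurable_product_id, pr.id) puts each configurable product in the same partition as its simple products; an empty OVER () would sum every row of the result, which is only correct while there is a single configurable product.
SELECT
    pr.id AS product_id,
    CASE
        WHEN (pr.type = 'SIMPLE') THEN ps.quantity
        WHEN (pr.type = 'CONFIGURABLE') THEN SUM(ps.quantity) OVER (PARTITION BY COALESCE(pr.configurable_product_id, pr.id))
    END AS quantity
FROM public."product" as pr
LEFT JOIN public."product_source" as ps
    ON ps.product_id = pr.id
ORDER BY pr.id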
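To check how the window-function approach behaves with more than one configurable product, here is a minimal sketch using Python's sqlite3 module as a stand-in for PostgreSQL. Products 5 and 6 and the last product_source row are hypothetical additions to the question's data, and the PARTITION BY expression is one possible way to group each configurable product with its simple products. It assumes the bundled SQLite supports window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, type TEXT, configurable_product_id INTEGER);
CREATE TABLE product_source (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
-- Rows 1-4 come from the question; rows 5-6 are hypothetical extras.
INSERT INTO product VALUES
  (1, 'CONFIGURABLE', NULL), (2, 'SIMPLE', 1), (3, 'SIMPLE', 1), (4, 'SIMPLE', 1),
  (5, 'CONFIGURABLE', NULL), (6, 'SIMPLE', 5);
INSERT INTO product_source VALUES (1, 2, 50), (2, 3, 20), (3, 4, 10), (4, 6, 7);
""")

# Each configurable product shares a partition with its own simple products,
# so the windowed SUM only adds up the quantities that belong to it.
window_rows = conn.execute("""
SELECT pr.id AS product_id,
       CASE
         WHEN pr.type = 'SIMPLE' THEN ps.quantity
         WHEN pr.type = 'CONFIGURABLE' THEN
           SUM(ps.quantity) OVER (PARTITION BY COALESCE(pr.configurable_product_id, pr.id))
       END AS quantity
FROM product AS pr
LEFT JOIN product_source AS ps ON ps.product_id = pr.id
ORDER BY pr.id
""").fetchall()

print(window_rows)
```

With an empty OVER () instead, both configurable products would report the total of all four quantities.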

Your (????) is:
(
SELECT SUM(ps2.quantity)
FROM product as pr2
LEFT JOIN product_source as ps2
ON ps2.product_id = pr2.id
WHERE pr2.configurable_product_id = pr.id
)
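Plugged into the query from the question, the correlated subquery can be tried out as follows. This is a small sketch that uses Python's sqlite3 module in place of PostgreSQL and the sample rows from the question; the table is named product here, as in the question's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, type TEXT, configurable_product_id INTEGER);
CREATE TABLE product_source (id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
INSERT INTO product VALUES
  (1, 'CONFIGURABLE', NULL), (2, 'SIMPLE', 1), (3, 'SIMPLE', 1), (4, 'SIMPLE', 1);
INSERT INTO product_source VALUES (1, 2, 50), (2, 3, 20), (3, 4, 10);
""")

# For a configurable product, the subquery sums the quantities of the
# simple products whose configurable_product_id points at it.
rows = conn.execute("""
SELECT pr.id AS product_id,
       CASE
         WHEN pr.type = 'SIMPLE' THEN ps.quantity
         WHEN pr.type = 'CONFIGURABLE' THEN (
           SELECT SUM(ps2.quantity)
           FROM product AS pr2
           LEFT JOIN product_source AS ps2 ON ps2.product_id = pr2.id
           WHERE pr2.configurable_product_id = pr.id
         )
       END AS quantity
FROM product AS pr
LEFT JOIN product_source AS ps ON ps.product_id = pr.id
ORDER BY pr.id
""").fetchall()

print(rows)
```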


SQL make a list of not existing objects

I have following problem.
Two tables. Product Table and Inventory Table.
In Product table are a list of all possible products and in Inventory the current stock of each store
e.g.
Inventory Table
ProductID Stock StoreID
1 10 1
2 10 1
3 10 1
1 10 2
Product Table
ProductID Product
1 Bananas
2 Apples
3 Oranges
4 Kiwi
What I want is a list of products that are not in stock for the stores.
Following the example the desired result would be
Store ProductID Product
1 4 Kiwi
2 2 Apples
2 3 Oranges
2 4 Kiwi
I have tried several approaches (LEFT JOIN, NOT IN, and NOT EXISTS) but haven't found a solution.
E.g.
SELECT *
FROM Inventory t1
left join Product t2 ON t2.ProductID = t1.ProductID
WHERE t2.ProductID IS NULL
But this returns nothing
Any help please.
Thank you
Solution to your problem (works in MySQL as well as MS SQL Server):
SELECT i.storeId,p.ProductId, p.product
FROM Product p
CROSS JOIN (
SELECT Distinct storeId
FROM Inventory) i
LEFT JOIN Inventory iv
ON p.productID = iv.productId AND i.storeId = iv.storeId
WHERE iv.storeId IS NULL;
OUTPUT:
storeId ProductId product
1 4 Kiwi
2 2 Apples
2 3 Oranges
2 4 Kiwi
Follow the link to the demo:
http://sqlfiddle.com/#!9/71a854/9
CROSS JOIN:
A CROSS JOIN produces a result set that pairs each row of the first table with every row of the second table, as long as no join condition or WHERE clause restricts it.
This kind of result is called a Cartesian product.
Source:https://www.w3resource.com/mysql/advance-query-in-mysql/mysql-cross-join.php
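To see the Cartesian product and the anti-join working together, here is a minimal sketch with Python's sqlite3 module and the sample rows from the question. SQLite stands in for MySQL / SQL Server, and the alias names are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Product TEXT);
CREATE TABLE Inventory (ProductID INTEGER, Stock INTEGER, StoreID INTEGER);
INSERT INTO Product VALUES (1, 'Bananas'), (2, 'Apples'), (3, 'Oranges'), (4, 'Kiwi');
INSERT INTO Inventory VALUES (1, 10, 1), (2, 10, 1), (3, 10, 1), (1, 10, 2);
""")

# Step 1: every (store, product) combination: 2 stores x 4 products.
pairs = conn.execute("""
SELECT s.StoreID, p.ProductID
FROM Product AS p
CROSS JOIN (SELECT DISTINCT StoreID FROM Inventory) AS s
""").fetchall()

# Step 2: keep only the combinations that have no matching Inventory row.
missing = conn.execute("""
SELECT s.StoreID, p.ProductID, p.Product
FROM Product AS p
CROSS JOIN (SELECT DISTINCT StoreID FROM Inventory) AS s
LEFT JOIN Inventory AS iv
  ON iv.ProductID = p.ProductID AND iv.StoreID = s.StoreID
WHERE iv.StoreID IS NULL
ORDER BY s.StoreID, p.ProductID
""").fetchall()

print(len(pairs), missing)
```

Note that a store with no Inventory rows at all would not appear in the result, because the list of stores is derived from Inventory itself.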
Your approach should work, although selecting columns from only the first table makes sense. And the first table should be the product table:
SELECT p.*
FROM Product p LEFT JOIN
Inventory i
ON i.ProductID = p.ProductID
WHERE i.ProductID IS NULL;
Your version returns all inventory products that are not in the Product table. That should not be possible, if your data is correctly formatted.
Another way to write the query uses NOT EXISTS:
select p.*
from product p
where not exists (select 1 from inventory i where i.productid = p.productid);
This version (which should have very good performance if you have an index on inventory(productid)) comes closer to how you expressed the question.

Optimized query for a subquery in sql

I made a query to get the inventory of products as follows:
Select
b.ProductID, c.ProductName,
(Select
Case
When SUM(Qty) IS NULL
then 0
else SUM(Qty)
end
from
InvoiceDetails
where
ProductID = b.ProductID) as Sold,
(Select
Case
When SUM(QtyReceive) IS NULL
then 0
else SUM(QtyReceive)
end
from
PurchaseOrderDetails
where
ProductID = b.ProductID) as Stocks,
((Select
Case
When SUM(QtyReceive) IS NULL
then 0
else SUM(QtyReceive)
end
from
PurchaseOrderDetails
where
ProductID = b.ProductID) -
(Select
Case
When SUM(Qty) IS NULL
then 0
else SUM(Qty)
end
from
InvoiceDetails
where
ProductID = b.ProductID)) as RemainingStock
from
InvoiceDetails a
Right join
PurchaseOrderDetails b on a.ProductID = b.ProductID
Inner join
Products c on b.ProductID = c.ProductID
Group By
b.ProductID, c.ProductName
This query returns the data that I want, and it runs fine on my desktop, but when I deploy the application that runs this query on a lower-spec laptop, it is really slow and causes the laptop to hang. I need some help optimizing the query, or changing it to make it more efficient. Thanks in advance.
(Screenshots of the data in my InvoiceDetails, PurchaseOrderDetails, and Products tables were attached here.)
I've taken the subqueries out of your SELECT list, as I don't think they were necessary at all. I've also rearranged your joins and given the tables more descriptive aliases:
SELECT
pod.ProductID,
pr.ProductName,
ISNULL(SUM(id.Qty),0) as Sold,
ISNULL(SUM(pod.QtyReceive),0) as Stocks,
ISNULL(SUM(pod.QtyReceive),0) - ISNULL(SUM(id.Qty),0) as RemainingStock
FROM PurchaseOrderDetails pod
INNER JOIN Products pr
ON pr.ProductID = pod.ProductID
LEFT JOIN InvoiceDetails id
ON id.ProductID = pod.ProductID
GROUP BY
pod.ProductID, pr.ProductName
You were already joining those two tables so you don't need subqueries in the select at all. I've also wrapped the SUM in ISNULL to ensure there are no NULL errors.
I'd suggest using the SET STATISTICS TIME,IO ON at the beginning of your code (with an OFF command at the end). Then copy all of the text from your 'messages' tab into statisticsparser.com. Do this for both queries and compare, check the total CPU time and the logical reads, you want these both lower for better performance. I'm betting your logical reads will drop significantly with this new query.
EDIT
OK, I've put together a new query based upon your sample data. I've only used the fields that we actually need for this query so that it's simpler for this example.
Sample Data
CREATE TABLE #InvoiceDetails (ProductID int, Qty int)
INSERT INTO #InvoiceDetails (ProductID,Qty)
VALUES (3,50),(1,0),(2,1),(1,12),(2,1),(3,1),(1,1),(2,1),(1,1),(2,1)
CREATE TABLE #PurchaseOrderDetails (ProductID int, Qty int)
INSERT INTO #PurchaseOrderDetails (ProductID, Qty)
VALUES (1,100),(2,20),(4,10),(1,12),(5,12),(4,12),(3,12),(2,20),(3,20),(4,20),(5,20)
CREATE TABLE #Products (ProductID int, ProductName varchar(20))
INSERT INTO #Products (ProductID, ProductName)
VALUES (1,'Sample Product'),(2,'DYE INK CYAN'),(3,'test Product 1'),(4,'test Product 2'),(5,'test Product 3'),(1004,'TESTING PRODUCT')
For this, here is the output of your original query
ProductID ProductName Sold Stocks RemainingStock
1 Sample Product 14 112 98
2 DYE INK CYAN 4 40 36
3 test Product 1 51 32 -19
4 test Product 2 0 42 42
5 test Product 3 0 32 32
This is the re-written query that I've used. Note, there are no subqueries within the SELECT statement, they're within the joins as they should be. Also see that as we're aggregating in the subqueries we don't need to do this in the outer query too.
SELECT
pod.ProductID,
pr.ProductName,
ISNULL(id.Qty,0) as Sold,
ISNULL(pod.Qty,0) as Stocks,
ISNULL(pod.Qty,0) - ISNULL(id.Qty,0) as RemainingStock
FROM #Products pr
INNER JOIN (SELECT ProductID, SUM(Qty) Qty FROM #PurchaseOrderDetails GROUP BY ProductID) pod
ON pr.ProductID = pod.ProductID
LEFT JOIN (SELECT ProductID, SUM(Qty) Qty FROM #InvoiceDetails GROUP BY ProductID) id
ON id.ProductID = pr.ProductID
And this is the new output
ProductID ProductName Sold Stocks RemainingStock
1 Sample Product 14 112 98
2 DYE INK CYAN 4 40 36
3 test Product 1 51 32 -19
4 test Product 2 0 42 42
5 test Product 3 0 32 32
Which matches your original query.
I'd suggest trying this query on your machines and seeing which performs better, try the STATISTICS TIME,IO command I mentioned previously.
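To make the difference between joining the raw detail tables and joining pre-aggregated subqueries concrete, here is a small sketch using Python's sqlite3 module with the sample data above. SQLite stands in for SQL Server, so COALESCE replaces ISNULL and ordinary tables replace the # temp tables; the ORDER BY clauses are added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE InvoiceDetails (ProductID INTEGER, Qty INTEGER);
CREATE TABLE PurchaseOrderDetails (ProductID INTEGER, Qty INTEGER);
CREATE TABLE Products (ProductID INTEGER, ProductName TEXT);
INSERT INTO InvoiceDetails VALUES
  (3,50),(1,0),(2,1),(1,12),(2,1),(3,1),(1,1),(2,1),(1,1),(2,1);
INSERT INTO PurchaseOrderDetails VALUES
  (1,100),(2,20),(4,10),(1,12),(5,12),(4,12),(3,12),(2,20),(3,20),(4,20),(5,20);
INSERT INTO Products VALUES
  (1,'Sample Product'),(2,'DYE INK CYAN'),(3,'test Product 1'),
  (4,'test Product 2'),(5,'test Product 3'),(1004,'TESTING PRODUCT');
""")

# Joining the raw detail rows first pairs every purchase row with every
# invoice row of the same product, so both SUMs are inflated.
naive = conn.execute("""
SELECT pod.ProductID,
       COALESCE(SUM(inv.Qty), 0) AS Sold,
       COALESCE(SUM(pod.Qty), 0) AS Stocks
FROM PurchaseOrderDetails AS pod
LEFT JOIN InvoiceDetails AS inv ON inv.ProductID = pod.ProductID
GROUP BY pod.ProductID
ORDER BY pod.ProductID
""").fetchall()

# Aggregating inside each derived table leaves at most one row per product
# on each side of the join, so nothing is multiplied.
pre_aggregated = conn.execute("""
SELECT pr.ProductID,
       pr.ProductName,
       COALESCE(inv.Qty, 0) AS Sold,
       COALESCE(pod.Qty, 0) AS Stocks,
       COALESCE(pod.Qty, 0) - COALESCE(inv.Qty, 0) AS RemainingStock
FROM Products AS pr
INNER JOIN (SELECT ProductID, SUM(Qty) AS Qty
            FROM PurchaseOrderDetails GROUP BY ProductID) AS pod
  ON pod.ProductID = pr.ProductID
LEFT JOIN (SELECT ProductID, SUM(Qty) AS Qty
           FROM InvoiceDetails GROUP BY ProductID) AS inv
  ON inv.ProductID = pr.ProductID
ORDER BY pr.ProductID
""").fetchall()

print(naive[0])           # product 1 from the raw join
print(pre_aggregated[0])  # product 1 from the pre-aggregated join
```

For product 1 the raw join reports 28 sold and 448 in stock, because its 2 purchase rows are paired with its 4 invoice rows, while the pre-aggregated version reports 14 and 112.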
Since you already group by b.ProductID and c.ProductName, you can use aggregate functions directly instead of subqueries.
Also create indexes on the ProductID columns of your tables to improve performance.
Select
b.ProductID, c.ProductName,
SUM(isnull(a.Qty,0)) as Sold,
SUM(b.QtyReceive) as Stocks,
SUM(b.QtyReceive) - SUM(isnull(a.Qty,0)) as RemainingStock
from
PurchaseOrderDetails b
LEFT JOIN InvoiceDetails a on a.ProductID = b.ProductID
INNER JOIN Products c on b.ProductID = c.ProductID
Group By
b.ProductID, c.ProductName
Can you try this? (I wrote it without testing, as you didn't post sample data or CREATE TABLE statements.) Please check it and use it as a starting point. Compare the results and the execution plans of your query and this one. Analysing performance requires some knowledge of SQL and the ability to consider several things (e.g. how many rows there are, whether there are indexes, what the execution plan and statistics show, etc.)
SELECT C.PRODUCTID
,C.PRODUCTNAME
,COALESCE(D.QTY_SOLD,0) AS QTY_SOLD
,COALESCE(E.QTY_STOCKS,0) AS QTY_STOCKS
,COALESCE(E.QTY_STOCKS,0)-COALESCE(D.QTY_SOLD,0) AS REMAININGSTOCK
FROM PRODUCTS C
LEFT JOIN (SELECT PRODUCTID, SUM(QTY) AS QTY_SOLD
FROM INVOICEDETAILS
GROUP BY PRODUCTID
) D ON C.PRODUCTID = D.PRODUCTID
LEFT JOIN (SELECT PRODUCTID,SUM(QTYRECEIVE) AS QTY_STOCKS
FROM PURCHASEORDERDETAILS
GROUP BY PRODUCTID
) E ON C.PRODUCTID = E.PRODUCTID
Looking at your query, I think this could be equivalent (or at least I hope it is):
Select
c.ProductID
, c.ProductName
, Case When SUM(a.Qty) IS NULL then 0 else SUM(a.Qty) end as sold
, Case When SUM(b.QtyReceive) IS NULL then 0 else SUM(b.QtyReceive) end as Stock
, Case When SUM(isnull(b.QtyReceive,0) - isnull(a.Qty,0)) IS NULL
then 0
else SUM(isnull(b.QtyReceive,0) - isnull(a.Qty,0)) end as RemainingStock
from Products c
left join InvoiceDetails a on c.ProductID = a.ProductID
left join PurchaseOrderDetails b on c.ProductID = b.ProductID
Group By c.ProductID,c.ProductName

SUM of multiple products

I'm really struggling with this one. So I have 2 tables:
Products
PendingCartItems
Here's a screenshot of structure for both tables:
I need to get the SUM for all 3 products WHERE pending_cart_id = 18.
SELECT SUM(price) as TotalCartPrice FROM products WHERE id = '274'
How can I write it so it sums all 3 id's (274+251+49)?
Would something like this not work?
Select sum(b.price*a.quantity)
from pending_cart_items a
join products b
on a.product_id=b.id
where a.pending_cart_id =18
Edit: Just realized I'd omitted the quantity from the cart computation :)
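As a quick check of the price times quantity idea, here is a minimal sketch with Python's sqlite3 module. The screenshot of the table structure is not available, so the column names (price, quantity, product_id, pending_cart_id), the table name pending_cart_items, and all prices and quantities below are assumptions made for illustration only. Prices are stored as integer cents to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and values; only the ids 274, 251, 49 and cart 18
-- come from the question.
CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE pending_cart_items (pending_cart_id INTEGER, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES (274, 1000), (251, 500), (49, 250), (7, 9999);
INSERT INTO pending_cart_items VALUES
  (18, 274, 1), (18, 251, 2), (18, 49, 4),
  (19, 7, 1);  -- a different cart, which must not be counted
""")

# Join each cart line to its product, then sum price * quantity for cart 18.
(total,) = conn.execute("""
SELECT SUM(b.price * a.quantity) AS TotalCartPrice
FROM pending_cart_items AS a
JOIN products AS b ON a.product_id = b.id
WHERE a.pending_cart_id = ?
""", (18,)).fetchone()

print(total)
```

Passing the cart id as a bound parameter rather than concatenating it into the SQL string is a design choice that avoids quoting problems and SQL injection.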
If the model is relational, you can try this
SELECT
SUM(price) as TotalCartPrice
FROM products
INNER JOIN PendingCartItems ON products.id = PendingCartItems.product_id
WHERE PendingCartItems.pending_cart_id = 18
GROUP BY PendingCartItems.pending_cart_id
You'll need to join the two tables:-
select
sum(p.price)
from Products p
inner join PendingCartItems pci on p.id= pci.product_id
where pci.pending_cart_id = 18

Rails/SQL: finding invoices by checking two sums

I have an Invoice model that has_many lines and has_many payments.
Invoice:
id
ref
Line:
invoice_id:
total (decimal)
Payment:
invoice_id:
total(decimal)
I need to find all paid invoices. So I'm doing the following:
Invoice.joins(:lines, :payments).having('sum(lines.total) = sum(payments.total)').group('invoices.id')
Which queries:
SELECT *
FROM "invoices"
INNER JOIN "lines" ON "lines"."invoice_id" = "invoices"."id"
INNER JOIN "payments" ON "payments"."invoice_id" = "invoices"."id"
GROUP BY invoices.id
HAVING sum(lines.total) = sum(payments.total)
But it always returns an empty array, even if there are fully paid invoices.
Is something wrong with my code?
If you join to more than one table with a 1:n relationship, the joined rows can multiply each other.
This related answer has more detailed explanation for the problem:
Two SQL LEFT JOINS produce incorrect result
To avoid that, sum the totals before you join. This way you join to exactly 1 (or 0) rows, and nothing is multiplied. Not only correct, also considerably faster.
SELECT i.*, l.sum_total
FROM invoices i
JOIN (
SELECT invoice_id, sum(total) AS sum_total
FROM lines
GROUP BY 1
) l ON l.invoice_id = i.id
JOIN (
SELECT invoice_id, sum(total) AS sum_total
FROM payments
GROUP BY 1
) p ON p.invoice_id = i.id
WHERE l.sum_total = p.sum_total;
Using [INNER] JOIN, not LEFT [OUTER] JOIN on purpose. Invoices that do not have any lines or payments are not of interest to begin with. Since we want "paid" invoices. For lack of definition and by the looks of the provided query, I am assuming that means invoices with actual lines and payments, both totaling the same.
Suppose one invoice has a single line and is fully paid by two payments, like this:
lines:
id total invoice_id
1 30 1
payments:
id total invoice_id
1 10 1
2 20 1
Joining lines and payments to the invoice on invoice_id then produces 2 rows:
payment_id payment_total line_id line_total invoice_id
1 10 1 30 1
2 20 1 30 1
The line is counted twice, so the sum of line_total (60) does not equal the sum of payment_total (30).
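The row multiplication described above, and the fix of summing before joining, can be reproduced with a short sketch in Python's sqlite3 module. Invoice 1 uses the rows from the example; invoice 2 (a line of 50 with a single payment of 20) is a hypothetical unpaid invoice added for contrast, and SQLite stands in for whatever database the Rails app uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (id INTEGER PRIMARY KEY, ref TEXT);
CREATE TABLE lines (id INTEGER PRIMARY KEY, total INTEGER, invoice_id INTEGER);
CREATE TABLE payments (id INTEGER PRIMARY KEY, total INTEGER, invoice_id INTEGER);
INSERT INTO invoices VALUES (1, 'paid'), (2, 'unpaid');   -- invoice 2 is hypothetical
INSERT INTO lines VALUES (1, 30, 1), (2, 50, 2);
INSERT INTO payments VALUES (1, 10, 1), (2, 20, 1), (3, 20, 2);
""")

# Joining both 1:n tables at once repeats the single line of invoice 1
# for each of its two payments, so the line total is counted twice.
naive_sums = conn.execute("""
SELECT i.id, SUM(l.total), SUM(p.total)
FROM invoices AS i
JOIN lines AS l ON l.invoice_id = i.id
JOIN payments AS p ON p.invoice_id = i.id
GROUP BY i.id
ORDER BY i.id
""").fetchall()

# Summing each table before the join gives one row per invoice on each side.
paid = conn.execute("""
SELECT i.id
FROM invoices AS i
JOIN (SELECT invoice_id, SUM(total) AS sum_total
      FROM lines GROUP BY invoice_id) AS l ON l.invoice_id = i.id
JOIN (SELECT invoice_id, SUM(total) AS sum_total
      FROM payments GROUP BY invoice_id) AS p ON p.invoice_id = i.id
WHERE l.sum_total = p.sum_total
ORDER BY i.id
""").fetchall()

print(naive_sums, paid)
```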
To get all paid invoices, you could use EXISTS instead of joins:
Invoice.where(
"exists
(select 1 from
(select l.invoice_id
from (select invoice_id,sum(total) as line_total
from lines
group by invoice_id) as l
inner join (select invoice_id,sum(total) as payment_total
from payments
group by invoice_id) as p
on l.invoice_id = p.invoice_id
where payment_total = line_total) as paid
where invoices.id = paid.invoice_id) ")
The subquery paid returns the invoice_id of every paid invoice.

Selecting records in SQL that have the minimum value for that record based on another field

I have a set of data, and while the number of fields and tables it joins with is quite complex, I believe I can distill my problem down using the required fields/tables here for illustration regarding this particular problem.
I have three tables: ClientData, Sources, Prices
Here is what my current query looks like before selecting the minimum value:
select c.RecordID, c.Description, s.Source, p.Price, p.Type, p.Weight
from ClientData c
inner join Sources s ON c.RecordID = s.RecordID
inner join Prices p ON s.SourceID = p.SourceID
This produces the following result:
RecordID Description Source Price Type Weight
=============================================================
001002003 ABC Common Stock Vendor 1 104.5 Close 1
001002003 ABC Common Stock Vendor 1 103 Bid 2
001002003 ABC Common Stock Vendor 2 106 Close 1
001002003 ABC Common Stock Vendor 2 100 Unknwn 0
111222333 DEF Preferred Stk Vendor 3 80 Bid 2
111222333 DEF Preferred Stk Vendor 3 82 Mid 3
111222333 DEF Preferred Stk Vendor 2 81 Ask 4
What I am trying to do is display prices that belong to the same record which have the minimum non-zero weight for that record (so the weight must be greater than 0, but it has to be the minimum from amongst the remaining weights). So in the above example, for record 001002003 I would want to show the close prices from Vendor 1 and Vendor 2 because they both have a weight of 1 (the minimum weight for that record). But for 111222333 I would want to show just the bid price from Vendor 3 because its weight of 2 is the minimum non-zero weight for that record. The result that I'm after would look like:
RecordID Description Source Price Type Weight
=============================================================
001002003 ABC Common Stock Vendor 1 104.5 Close 1
001002003 ABC Common Stock Vendor 2 106 Close 1
111222333 DEF Preferred Stk Vendor 3 80 Bid 2
Any ideas on how to achieve this?
EDIT: This is for SQL Server Compact Edition.
I was able to come up with the solution so I thought I would share it:
SELECT x.RecordID, s.Source, p.Price
FROM ClientData x
INNER JOIN Sources s ON x.RecordID = s.RecordID
INNER JOIN Prices p ON s.SourceID = p.SourceID
INNER JOIN (SELECT c.RecordID, MIN(Weight) min_weight
FROM ClientData c
INNER JOIN Sources s ON c.RecordID = s.RecordID
INNER JOIN Prices p ON s.SourceID = p.SourceID
WHERE Weight != 0
GROUP BY c.RecordID) w ON x.RecordID = w.RecordID
WHERE p.Weight = w.min_weight
This allows the minimum weight to be populated on a RecordID level in the derived table, so there is 1 weight per RecordID.
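The derived-table approach can be tried against the sample rows with a small sketch in Python's sqlite3 module. The question does not show the table definitions, so the schema below (which columns live in which table, the SourceID values, and storing RecordID as text) is an assumption reconstructed from the joins, and SQLite stands in for SQL Server Compact Edition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema, inferred from the join conditions in the question.
CREATE TABLE ClientData (RecordID TEXT PRIMARY KEY, Description TEXT);
CREATE TABLE Sources (SourceID INTEGER PRIMARY KEY, RecordID TEXT, Source TEXT);
CREATE TABLE Prices (SourceID INTEGER, Price REAL, Type TEXT, Weight INTEGER);
INSERT INTO ClientData VALUES
  ('001002003', 'ABC Common Stock'), ('111222333', 'DEF Preferred Stk');
INSERT INTO Sources VALUES
  (1, '001002003', 'Vendor 1'), (2, '001002003', 'Vendor 2'),
  (3, '111222333', 'Vendor 3'), (4, '111222333', 'Vendor 2');
INSERT INTO Prices VALUES
  (1, 104.5, 'Close', 1), (1, 103, 'Bid', 2),
  (2, 106, 'Close', 1),   (2, 100, 'Unknwn', 0),
  (3, 80, 'Bid', 2),      (3, 82, 'Mid', 3),
  (4, 81, 'Ask', 4);
""")

# The derived table w holds one minimum non-zero weight per RecordID;
# the outer query keeps only the price rows that match that weight.
result = conn.execute("""
SELECT x.RecordID, s.Source, p.Price, p.Type, p.Weight
FROM ClientData AS x
INNER JOIN Sources AS s ON x.RecordID = s.RecordID
INNER JOIN Prices AS p ON s.SourceID = p.SourceID
INNER JOIN (SELECT c.RecordID, MIN(p2.Weight) AS min_weight
            FROM ClientData AS c
            INNER JOIN Sources AS s2 ON c.RecordID = s2.RecordID
            INNER JOIN Prices AS p2 ON s2.SourceID = p2.SourceID
            WHERE p2.Weight != 0
            GROUP BY c.RecordID) AS w ON x.RecordID = w.RecordID
WHERE p.Weight = w.min_weight
ORDER BY x.RecordID, s.Source
""").fetchall()

for row in result:
    print(row)
```

Ties are kept automatically: both weight-1 close prices for record 001002003 survive the filter.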
For all those who gave answers, thank you; I appreciate the help and any guidance that was offered.
You can use RANK() with a partition over RecordID, ordered by increasing weight, to rank each row (after excluding zero weights entirely), and then simply keep the top-ranked rows. The CTE is used just to keep the second query simple and clear.
;WITH MyRecords AS
(
-- Your source query goes here
select c.RecordID, c.Description, s.Source, p.Price, p.Type, p.Weight
from ClientData c
inner join Sources s ON c.RecordID = s.RecordID
inner join Prices p ON s.SourceID = p.SourceID
)
SELECT RecordID, [Description], [Source], [Price], [Type], [Weight]
FROM
(
SELECT RecordID, [Description], [Source], [Price], [Type], [Weight],
-- With ranking, the lower the weight the better
Rnk = RANK() OVER (PARTITION BY RecordId ORDER BY [Weight] ASC)
FROM MyRecords
-- But exclude Weight 0 entirely
WHERE [Weight] > 0
) RankedRecords
-- We just want the top ranked records, with ties
WHERE Rnk = 1
Edit: the CE constraint was added after this answer was posted. See How would I duplicate the Rank function in a Sql Server Compact Edition SELECT statement? on how to simulate RANK() OVER in CE.
I think you need to change your structure a little to make this work the way you would like. As you have it, a price record is set up against a Source rather than against the item, which seems to live in the ClientData table. By removing the RecordID column from the Sources table and putting it into the Prices table, you should get the correct one (ClientData) to many (Prices) and one (ClientData) to many (Sources) relationships that I think you need.
select c.RecordID, c.Description, s.Source, p.Price, p.Type, p.Weight
from ClientData c
inner join Prices p ON c.RecordID = p.RecordID
inner join Sources s ON s.SourceID = p.SourceID
AND p.Weight> 0
LEFT OUTER JOIN Prices p2 ON c.RecordID = p2.RecordID
AND p2.PriceID <> p.priceID
AND p2.Weight > 0
AND p2.Weight < p.Weight
WHERE p2.SourceID IS NULL
If you make the change specified above, this query returns exactly the data that you are looking for.