Calculate total amount PGSQL - sql

query which calculates the total amount in dollars of stolen goods for each month for restricted and neutral items.
I have 2 tables
first
| UPC | item | in_stock | price | ship_day | class |
1 | 101 | 'generator' | 16 | 5999 | '12-1-2065'| 'restricted'
2 | 102 | 'blank tape' | 30 | 3000 | '12-1-2065'| 'neutral'
second
| UPC | unit_stolen |
1 | 101 | 4 |
1 | 401 | 2 |

If I understand correctly, this is basically a join and group by:
select date_trunc('mon', f.ship_day) as yyyymm,
sum(f.price * s.unit_stolen) filter (where f.class = 'restricted'),
sum(f.price * s.unit_stolen) filter (where f.class = 'neutral')
from first f join
second s
on f.upc = s.upc
group by date_trunc('mon', f.ship_day)

Related

Complex nested aggregations to get order totals

I have a system to track orders and related expenditures. This is a Rails app running on PostgreSQL. 99% of my app gets by with plain old Rails Active Record call etc. This one is ugly.
The expenditures table look like this:
+----+----------+-----------+------------------------+
| id | category | parent_id | note |
+----+----------+-----------+------------------------+
| 1 | order | nil | order with no invoices |
+----+----------+-----------+------------------------+
| 2 | order | nil | order with invoices |
+----+----------+-----------+------------------------+
| 3 | invoice | 2 | invoice for order 2 |
+----+----------+-----------+------------------------+
| 4 | invoice | 2 | invoice for order 2 |
+----+----------+-----------+------------------------+
Each expenditure has many expenditure_items and can the orders can be parents to the invoices. That table looks like this:
+----+----------------+-------------+-------+---------+
| id | expenditure_id | cbs_item_id | total | note |
+----+----------------+-------------+-------+---------+
| 1 | 1 | 1 | 5 | Fuit |
+----+----------------+-------------+-------+---------+
| 2 | 1 | 2 | 15 | Veggies |
+----+----------------+-------------+-------+---------+
| 3 | 2 | 1 | 123 | Fuit |
+----+----------------+-------------+-------+---------+
| 4 | 2 | 2 | 456 | Veggies |
+----+----------------+-------------+-------+---------+
| 5 | 3 | 1 | 34 | Fuit |
+----+----------------+-------------+-------+---------+
| 6 | 3 | 2 | 76 | Veggies |
+----+----------------+-------------+-------+---------+
| 7 | 4 | 1 | 26 | Fuit |
+----+----------------+-------------+-------+---------+
| 8 | 4 | 2 | 98 | Veggies |
+----+----------------+-------------+-------+---------+
I need to track a few things:
amounts left to be invoiced on orders (thats easy)
above but rolled up for each cbs_item_id (this is the ugly part)
The cbs_item_id is basically an accounting code to categorize the money spent etc. I have visualized what my end result would look like:
+-------------+----------------+-------------+---------------------------+-----------+
| cbs_item_id | expenditure_id | order_total | invoice_total | remaining |
+-------------+----------------+-------------+---------------------------+-----------+
| 1 | 1 | 5 | 0 | 5 |
+-------------+----------------+-------------+---------------------------+-----------+
| 1 | 2 | 123 | 60 | 63 |
+-------------+----------------+-------------+---------------------------+-----------+
| | | | Rollup for cbs_item_id: 1 | 68 |
+-------------+----------------+-------------+---------------------------+-----------+
| 2 | 1 | 15 | 0 | 15 |
+-------------+----------------+-------------+---------------------------+-----------+
| 2 | 2 | 456 | 174 | 282 |
+-------------+----------------+-------------+---------------------------+-----------+
| | | | Rollup for cbs_item_id: 2 | 297 |
+-------------+----------------+-------------+---------------------------+-----------+
order_total is the sum of total for all the expenditure_items of the given order ( category = 'order'). invoice_total is the sum of total for all the expenditure_items with parent_id = expenditures.id. Remaining is calculated as the difference (but not greater than 0). In real terms the idea here is you place and order for $1000 and $750 of invoices come in. I need to calculate that $250 left on the order (remaining) - broken down into each category (cbs_item_id). Then I need the roll-up of all the remaining values grouped by the cbs_item_id.
So for each cbs_item_id I need group by each order, find the total for the order, find the total invoiced against the order then subtract the two (also can't be negative). It has to be on a per order basis - the overall aggregate difference will not return the expected results.
In the end looking for a result something like this:
+-------------+-----------+
| cbs_item_id | remaining |
+-------------+-----------+
| 1 | 68 |
+-------------+-----------+
| 2 | 297 |
+-------------+-----------+
I am guessing this might be a combination of GROUP BY and perhaps a sub query or even CTE (voodoo to me). My SQL skills are not that great and this is WAY above my pay grade.
Here is a fiddle for the data above:
http://sqlfiddle.com/#!17/2fe3a
Alternate fiddle:
https://dbfiddle.uk/?rdbms=postgres_11&fiddle=e9528042874206477efbe0f0e86326fb
This query produces the result you are looking for:
SELECT cbs_item_id, sum(order_total - invoice_total) AS remaining
FROM (
SELECT cbs_item_id
, COALESCE(e.parent_id, e.id) AS expenditure_id -- ①
, COALESCE(sum(total) FILTER (WHERE e.category = 'order' ), 0) AS order_total -- ②
, COALESCE(sum(total) FILTER (WHERE e.category = 'invoice'), 0) AS invoice_total
FROM expenditures e
JOIN expenditure_items i ON i.expenditure_id = e.id
GROUP BY 1, 2 -- ③
) sub
GROUP BY 1
ORDER BY 1;
db<>fiddle here
① Note how I assume a saner table definition with expenditures.parent_id being integer, and true NULL instead of the string 'nil'. This allows the simple use of COALESCE.
② About the aggregate FILTER clause:
Aggregate columns with additional (distinct) filters
③ Using short syntax with ordinal numbers of an SELECT list items. Example:
Select first row in each GROUP BY group?
can I get the total of all the remaining for all rows or do I need to wrap that into another sub select?
There is a very concise option with GROUPING SETS:
...
GROUP BY GROUPING SETS ((1), ()) -- that's all :)
db<>fiddle here
Related:
Converting rows to columns

Duplicate records upon joining table

I am still very new to SQL and Tableau however I am trying to work myself towards achieving a personal project of mine.
Table A; shows a table which contains the defect quantity per product category and when it was raised
+--------+-------------+--------------+-----------------+
| Issue# | Date_Raised | Category_ID# | Defect_Quantity |
+--------+-------------+--------------+-----------------+
| PCR12 | 11-Jan-2019 | Product#1 | 14 |
| PCR13 | 12-Jan-2019 | Product#1 | 54 |
| PCR14 | 5-Feb-2019 | Product#1 | 5 |
| PCR15 | 5-Feb-2019 | Product#2 | 7 |
| PCR16 | 20-Mar-2019 | Product#1 | 76 |
| PCR17 | 22-Mar-2019 | Product#2 | 5 |
| PCR18 | 25-Mar-2019 | Product#1 | 89 |
+--------+-------------+--------------+-----------------+
Table B; shows the consumption quantity of each product by month
+-------------+--------------+-------------------+
| Date_Raised | Category_ID# | Consumed_Quantity |
+-------------+--------------+-------------------+
| 5-Jan-2019 | Product#1 | 100 |
| 17-Jan-2019 | Product#1 | 200 |
| 5-Feb-2019 | Product#1 | 100 |
| 8-Feb-2019 | Product#2 | 50 |
| 10-Mar-2019 | Product#1 | 100 |
| 12-Mar-2019 | Product#2 | 50 |
+-------------+--------------+-------------------+
END RESULT
I would like to create a table/bar chart in tableau that shows that Defect_Quantity/Consumed_Quantity per month, per Category_ID#, so something like this below;
+----------+-----------+-----------+
| Month | Product#1 | Product#2 |
+----------+-----------+-----------+
| Jan-2019 | 23% | |
| Feb-2019 | 5% | 14% |
| Mar-2019 | 89% | 10% |
+----------+-----------+-----------+
WHAT I HAVE TRIED SO FAR
Unfortunately i have not really done anything, i am struggling to understand how do i get rid of the duplicates upon joining the tables based on Category_ID#.
Appreciate all the help I can receive here.
I can think of doing left joins on both product1 and 2.
select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')
, (p2.product1 - sum(case when category_id='Product#1' then Defect_Quantity else 0 end))/p2.product1 * 100
, (p2.product2 - sum(case when category_id='Product#2' then Defect_Quantity else 0 end))/p2.product2 * 100
from tableA t1
left join
(select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy') Date_Raised
, sum(Comsumed_Quantity) as product1 tableB
where category_id = 'Product#1'
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')) p1
on p1.Date_Raised = t1.Date_Raised
left join
(select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy') Date_Raised
, sum(Comsumed_Quantity) as product2 tableB
where category_id = 'Product#2'
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')) p2
on p2.Date_Raised = t1.Date_Raised
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')
By using ROW_NUMBER() OVER (PARTITION BY ORDER BY ) as RN, you can remove duplicate rows. As of your end result you should extract month from date and use pivot to achieve.
I would do this as:
select to_char(date_raised, 'YYYY-MM'),
(sum(case when product = 'Product#1' then defect_quantity end) /
sum(case when product = 'Product#1' then consumed_quantity end)
) as product1,
(sum(case when product = 'Product#2' then defect_quantity end) /
sum(case when product = 'Product#2' then consumed_quantity end)
) as product2
from ((select date_raised, product, defect_quantity, 0 as consumed_quantity
from a
) union all
(select date_raised, product, 0 as defect_quantity, consumed_quantity
from b
)
) ab
group by to_char(date_raised, 'YYYY-MM')
order by min(date_raised);
(I changed the date format because I much prefer YYYY-MM, but that is irrelevant to the logic.)
Why do I prefer this method? This will include all months where there is a row in either table. I don't have to worry that some months are inadvertently filtered out, because there are missing production or defects in one month.

Want to JOIN fourth table in query

I have four tables:
mls_category
points_matrix
mls_entry
bonus_points
My first table (mls_category) is like below:
*--------------------------------*
| cat_no | store_id | cat_value |
*--------------------------------*
| 10 | 101 | 1 |
| 11 | 101 | 4 |
*--------------------------------*
My second table (points_matrix) is like below:
*----------------------------------------------------*
| pm_no | store_id | value_per_point | maxpoint |
*----------------------------------------------------*
| 1 | 101 | 1 | 10 |
| 2 | 101 | 2 | 50 |
| 3 | 101 | 3 | 80 |
*----------------------------------------------------*
My third table (mls_entry) is like below:
*-------------------------------------------*
| user_id | category | distance | status |
*-------------------------------------------*
| 1 | 10 | 20 | approved |
| 1 | 10 | 30 | approved |
| 1 | 11 | 40 | approved |
*-------------------------------------------*
My fourth table (bonus_points) is like below:
*--------------------------------------------*
| user_id | store_id | bonus_points | type |
*--------------------------------------------*
| 1 | 101 | 200 | fixed |
| 2 | 102 | 300 | fixed |
| 1 | 103 | 4 | per |
*--------------------------------------------*
Now, I want to add bonus points value into the sum of total distance according to the store_id, user_id and type.
I am using the following code to get total distance:
SELECT MIN(b.value_per_point) * d.total_distance FROM points_matrix b
JOIN
(
SELECT store_id, sum(t1.totald/c.cat_value) as total_distance FROM mls_category c
JOIN
(
SELECT SUM(distance) totald, user_id, category FROM mls_entry
WHERE user_id= 1 AND status = 'approved' GROUP BY user_id, category
) t1 ON c.cat_no = t1.category
) d ON b.store_id = d.store_id AND b.maxpoint >= d.total_distance
The above code is correct to calculate value, now I want to JOIN my fourth table.
This gives me sum (60*3 = 180) as total value. Now, I want (60+200)*3 = 780 for user 1 and store id 101 and value is fixed.
i think your query will be like below
SELECT Max(b.value_per_point)*( max(d.total_distance)+max(bonus_points)) FROM mls_point_matrix b
JOIN
(
SELECT store_id, sum(t1.totald/c.cat_value) as total_distance FROM mls_category c
JOIN
(
SELECT SUM(distance) totald, user_id, category FROM mls_entry
WHERE user_id= 1 AND status = 'approved' GROUP BY user_id, category
) t1 ON c.cat_no = t1.category group by store_id
) d ON b.store_id = d.store_id inner join bonus_points bp on bp.store_id=d.store_id
DEMO fiddle

Using CASE in WHERE clause

I have 2 tables that needs to be joined based on couple of parameters. One of the parameters is year. One table contains the current year but another year doesn't contain current year, so it must use the latest year and matched with other parameters.
Example
Product
-------------------------------------------------------------------------------
| product_id | category_id | sub_category_id | product_year | amount |
-------------------------------------------------------------------------------
| 504 | I | U | 2020 | 400 |
| 510 | I | U | 2019 | 100 |
| 528 | I | U | 2019 | 150 |
| 540 | I | U | 2018 | 1000 |
Discount
-----------------------------------------------------------------------------
| discount_year | category_id | sub_category_id | discount |
-----------------------------------------------------------------------------
| 2018 | I | U | 0.15 |
| 2017 | I | U | 0.35 |
| 2016 | I | U | 0.50 |
Output
-----------------------------------------------------------------------------
| product_id | category_id | sub_category_id | product_year | discount_year |
-----------------------------------------------------------------------------
| 504 | I | U | 2020 | 2018 |
| 510 | I | U | 2019 | 2018 |
| 528 | I | U | 2019 | 2018 |
| 540 | I | U | 2018 | 2017 |
The discount is always gotten from one year behind but if those rates aren't available, then it would keep going back a year until available.
I have tried the following:
SELECT
product_year, a.product_id, a.category_id, a.sub_category_id,
discount_year, amount, discount
FROM
Product a
INNER JOIN
Discount b ON a.category_id = b.category_id
AND a.sub_category_id = b.sub_category_id
AND product_ year = CASE
WHEN discount_year + 1 = product_year
THEN discount_year + 1
WHEN discount_year + 2 = product_year
THEN discount_year + 2
WHEN discount_year + 3 = product_year
THEN discount_year + 3
END
WHERE
product = 540
This return the following output:
--------------------------------------------------------------------------------------------------
| product_year | product_id | category_id | sub_category_id | discount_year | amount | discount |
--------------------------------------------------------------------------------------------------
| 2016 | 540 | I | U | 2017 | 1000 | 0.50 |
| 2017 | 540 | I | U | 2017 | 1000 | 0.35 |
Any help will be appreciated.
You can use OUTER APPLY and a subquery. In the subquery select the row with the maximum discount_year, that is less the product_year using TOP and ORDER BY.
SELECT p.product_year,
p.product_id,
p.category_id,
p.sub_category_id,
d.discount_year,
p.amount,
d.discount
FROM product p
OUTER APPLY (SELECT TOP 1
*
FROM discount d
WHERE d.category_id = p.category_id
AND d.sub_category_id = p.sub_category_id
AND d.discount_year < p.product_year
ORDER BY d.discount_year DESC) d;
instead of a CASE expression you can use a sub-query to select the TOP 1 related
discount_year that is less than your product_year, ORDER BY discount_year ASC.
Create a product to discount mapping using a CTE first. This contains the discount year pulled from discount table for every product year in the product table and corresponding product_id. Following this, you can easily join with relevant tables to get results and eliminate any nulls as needed
Simplified query.
;WITH disc_prod_mapper
AS
(
SELECT product_id, product_year,(SELECT MAX(discount_year) FROM #Discount b WHERE discount_year < product_year AND a.category_id = b.category_id AND a.sub_category_id = b.sub_category_id ) AS discount_year
FROM Product a
)
SELECT a.product_year, c.discount_year, a.amount, c.discount
FROM Product a
LEFT JOIN disc_prod_mapper b ON a.product_id = b.product_id
LEFT JOIN Discount c ON b.discount_year = c.discount_year

Joining 3 tables on a date criteria

I need help with this.
Tbl: WarehouseInventory
Date | DelRec | ProductId | Quantity
2015-09-10 | 110 | 1 | 100
2015-09-12 | 111 | 1 | 100
2015-09-12 | 111 | 2 | 200
2015-09-12 | 111 | 3 | 300
Tbl: Withdrawals
Date | ID | ProductId | Quantity | CustomerId
2015-09-11 | 1 | 1 | 400 | 2
2015-09-12 | 1 | 1 | 100 | 1
2015-09-12 | 2 | 2 | 200 | 1
2015-09-12 | 3 | 3 | 300 | 1
Tbl: Customers
Customer Id | Name
1 | Somebody
2 | Someone
The output should be like this
DelRec | Date Added | ProductId | Stocked | Withdrawn | Customer
110 | 2015-09-10 | 1 | 100 | 0 | NULL
0 | 2015-09-11 | 1 | 0 | 400 | Someone
111 | 2015-09-12 | 1 | 100 | 100 | Somebody
111 | 2015-09-12 | 2 | 200 | 200 | Somebody
111 | 2015-09-12 | 3 | 300 | 300 | Somebody
This is what I have come up so far and it's giving me a wrong output
select wi.DateAdded as 'Date Added', max(wi.DeliveryReceipt) as 'Delivery Receipt', wi.ProductId as 'Product',
max(isnull(wi.Quantity, 0)) as 'Stocked', max(isnull(w.Quantity, 0)) as 'Withdrawn', e.Customers as 'Customer'
from WarehouseInventory wi
cross join Withdrawals w
cross join Customer e
group by wi.DateAdded, wi.ProductId, e.Customers, wi.DeliveryReceipt, w.ProductId
Basically, I need to join the two tables on the date and product and if there is a null value in one of the tables, just make it 0. I appreciate your help.
You can use a FULL OUTER JOIN:
SELECT DelRec,
COALESCE(wi.[Date], wd.[Date]) AS Date_Added,
COALESCE(wi.ProductId, wd.ProductId) AS ProductId,
COALESCE(wi.Quantity, 0) AS Stocked,
COALESCE(wd.Quantity, 0) AS Withdrawn,
c.Name AS Customer
FROM WarehouseInventory AS wi
FULL OUTER JOIN Withdrawals AS wd
ON wi.[Date] = wd.[Date] AND wi.ProductId = wd.ProductId
LEFT JOIN Customers AS c ON c.[Customer Id] = wd.CustomerId
ORDER BY Date_Added
You have a few inconsistencies between your example table and your query, but here's the basic gist:
You want to FULL OUTER JOIN your Warehouse delivery (A) and Withdrawal (B) tables on both product and date
Make sure you coalesce(A, B) for both date and product
Sum the quantities from each table, then coalesce outside each aggregate to get zeros (since one column can be all nulls).
Here:
select
coalesce(wi.DateAdded, w.date) as 'Date Added',
max(wi.DeliveryReceipt) as 'Delivery Receipt',
coalesce(wi.ProductId, w.productId) as 'Product',
coalesce(sum(wi.Quantity), 0) as 'Stocked',
coalesce(sum(w.Quantity), 0) as 'Withdrawn',
e.name as 'Customer'
from WarehouseInventory wi
full outer join Withdrawals w on w.date = wi.dateadded and w.productId = wi.productId
left join Customer e on e.customerId = w.customerId
group by
coalesce(wi.DateAdded, w.date),
coalesce(wi.ProductId, w.productId),
e.name