PostgreSQL: joining tables without losing records

Let's say I have the following tables:
1 - StartingStock:
vendor | starting_stock
------------------------
adidas | 13
Reebok | 5
2 - Restock:
vendor | restocks
-----------------
adidas | 2
nike | 3
3 - Sales:
vendor | quantity_sold
----------------------
adidas | 10
nike | 1
I want my resulting table to be the sell-through grouped by vendor. In this scenario, sell-through is calculated as quantity_sold / (starting_stock + restocks). My only problem is that the StartingStock and Restock tables may not contain all of the vendors that are present in the Sales table. In the scenario above, StartingStock has no record for nike, so the sell-through for nike would be just 1/3, i.e. 1/(3+0). Therefore, my resulting table would be:
vendor | sell_through
---------------------
adidas | 1.5
nike | 0.33
Reebok | 0
So I'd want all of the vendors present in the result table (if it has no sales, value is 0 like Reebok shown above).
I tried working with the different types of joins but I couldn't get it. Any help would be great. Thanks.

We can try a full outer join approach here:
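SELECT
COALESCE(ss.vendor, r.vendor, s.vendor) AS vendor,
-- * 1.0 avoids integer division if the columns are integers;
-- NULLIF returns NULL instead of failing when a vendor has no stock at all
COALESCE(s.quantity_sold, 0) * 1.0 /
NULLIF(COALESCE(ss.starting_stock, 0) + COALESCE(r.restocks, 0), 0) AS sell_through
FROM StartingStock ss
FULL OUTER JOIN Restock r ON ss.vendor = r.vendor
FULL OUTER JOIN Sales s ON s.vendor = COALESCE(ss.vendor, r.vendor)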
Note that I am coming up with 2/3 for the sell through for Adidas, since the quantity sold is 10, and the sum of stocks is 15.

I would use union all and aggregation:
select vendor,
sum(starting_stock), sum(restock), sum(quantity_sold),
(sum(quantity_sold) * 1.0 / nullif(sum(starting_stock) + sum(restock), 0)) as sell_through
from ((select vendor, starting_stock, 0 as restock, 0 as quantity_sold
from startingstock
) union all
(select vendor, 0 as starting_stock, restocks as restock, 0 as quantity_sold
from restock
) union all
(select vendor, 0 as starting_stock, 0 as restock, quantity_sold
from sales
)
) v
group by vendor;
In particular, this version includes each number in the calculation only once. A JOIN approach will produce inaccurate results if a vendor has multiple rows in any of the tables.
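As a rough, runnable check of the UNION ALL plus aggregation idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained). It loads the sample rows from the question; the NULLIF guard and the variable names are illustrative additions, not part of the original answer.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE startingstock (vendor TEXT, starting_stock INTEGER);
    CREATE TABLE restock (vendor TEXT, restocks INTEGER);
    CREATE TABLE sales (vendor TEXT, quantity_sold INTEGER);
    INSERT INTO startingstock VALUES ('adidas', 13), ('Reebok', 5);
    INSERT INTO restock VALUES ('adidas', 2), ('nike', 3);
    INSERT INTO sales VALUES ('adidas', 10), ('nike', 1);
""")

query = """
    SELECT vendor,
           -- NULLIF yields NULL for a zero denominator
           -- (PostgreSQL would otherwise raise a division-by-zero error)
           SUM(quantity_sold) * 1.0
               / NULLIF(SUM(starting_stock) + SUM(restock), 0) AS sell_through
    FROM (SELECT vendor, starting_stock, 0 AS restock, 0 AS quantity_sold
          FROM startingstock
          UNION ALL
          SELECT vendor, 0, restocks, 0 FROM restock
          UNION ALL
          SELECT vendor, 0, 0, quantity_sold FROM sales) v
    GROUP BY vendor
"""

# Map each vendor to its sell-through ratio.
sell_through = dict(conn.execute(query).fetchall())
for vendor, ratio in sorted(sell_through.items()):
    print(vendor, round(ratio, 2))
```

With the sample data every vendor appears once: adidas at 10/15, nike at 1/3, and Reebok at 0.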

Related

Count all of a column where value is 2 and sum this value with price

I'm working with the Northwind database, using the Products table. I need to find all of the rows where Category_Id is 2 and sum their prices.
Here's the example of a table shortly:
Category_ID | Unit Price
1 | 2,90
2 | 3,70
3 | 4,90
2 | 1,90
5 | 0,90
2 | 2,90
There are 3 rows where Category_Id is 2. How do I sum the Unit Price of those 3 rows?
3,70 + 1,90 + 2,90 = 8,50
So the answer I need is 8,50 but I have no idea how to get that amount with a SQL query.
Does someone know?
You can get the aggregated values for all IDs using
select Category_Id, sum([Unit Price]) Total, count(*) Qty
from Products
group by Category_Id
or just a specific total such as
select sum([Unit Price]) total
from products
where category_Id=2
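To sanity-check the filtered total against the numbers in the question, here is a small sketch that runs the same kind of query through Python's sqlite3 module. SQLite is an assumption here (Northwind normally lives in SQL Server or Access), so the column is quoted with double quotes rather than square brackets, and prices are written with a decimal point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Products (Category_ID INTEGER, "Unit Price" REAL)')
# Sample rows from the question.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [(1, 2.90), (2, 3.70), (3, 4.90), (2, 1.90), (5, 0.90), (2, 2.90)],
)

# Count the matching rows and add up their prices in a single pass.
qty, total = conn.execute(
    'SELECT COUNT(*), SUM("Unit Price") FROM Products WHERE Category_ID = 2'
).fetchone()
print(qty, round(total, 2))
```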

How to retrieve unpaid amount of each invoice for finance charges?

I am working on implementing finance charges, but don't know how to retrieve the unpaid amount of each invoice. I am not new to SQL, however this problem has me stumped. Forgive me if I just didn't search properly.
So, let us say I have an invoice table:
id | amount
----+--------------
1 | 50.00
2 | 50.00
3 | 50.00
4 | 50.00
5 | 50.00
And a payments table:
amount
--------------
50.00
25.00
The result set should be this:
invoice_id | unpaid_amount
------------+--------------
2 | 25.00
3 | 50.00
4 | 50.00
5 | 50.00
Of course, there is quite a bit more to add to implement finance charges, but I think I can get the rest.
Edit: Sorry, an oversight of mine. The id's are not related. Removed payment id column.
Edit 2: These are fictitious numbers; the real-life numbers can be anything, so no matches can be made on the amounts.
Edit 3: And here I have created a SQL Fiddle to show what I have so far, based on @GordonLinoff's second answer. I would appreciate a cleaner approach than the kludgy SQL I concocted.
I think you want a join:
select i.id as invoice_id,
(i.amount - coalesce(p.amount, 0)) as net_amount
from invoice i left join
payment p
on i.id = p.id;
EDIT:
Or, you may want:
select i.*,
(case when sum(i.amount) over (order by i.id) < p.amount
then i.amount
else greatest(p.amount - sum(i.amount) over (order by i.id) + i.amount, 0)
end) as amount_paid
from invoice i cross join
(select sum(amount) as amount
from payment
) p;
Here is a db<>fiddle.
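For readers who want to try the running-total idea locally, here is a hedged sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). It is a variant rather than the exact query above: the paid portion is clamped with SQLite's two-argument MIN/MAX, which stand in for LEAST/GREATEST, and the CTE names `running` and `pool` are made up for illustration. It assumes payments are pooled and applied to invoices in id order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER, amount REAL);
    CREATE TABLE payment (amount REAL);
    INSERT INTO invoice VALUES (1, 50), (2, 50), (3, 50), (4, 50), (5, 50);
    INSERT INTO payment VALUES (50), (25);
""")

query = """
    WITH running AS (
        -- cumulative amount billed up to and including each invoice
        SELECT id, amount, SUM(amount) OVER (ORDER BY id) AS billed_so_far
        FROM invoice
    ),
    pool AS (
        SELECT COALESCE(SUM(amount), 0) AS total FROM payment
    )
    SELECT invoice_id, unpaid_amount
    FROM (
        SELECT id AS invoice_id,
               -- paid = whatever is left of the pool after earlier invoices,
               -- clamped to the range [0, amount]
               amount - MAX(MIN(amount, total - (billed_so_far - amount)), 0)
                   AS unpaid_amount
        FROM running CROSS JOIN pool
    ) AS allocated
    WHERE unpaid_amount > 0
    ORDER BY invoice_id
"""

unpaid = conn.execute(query).fetchall()
for invoice_id, amount in unpaid:
    print(invoice_id, amount)
```

With the sample data, invoice 1 is fully covered and drops out, invoice 2 still owes 25, and invoices 3 to 5 owe the full 50.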
You can do this with a LEFT JOIN, subtracting the payment amount from the invoice amount. COALESCE is there to prevent you from getting NULL when there is no payment.
select
i.id as invoice_id
,i.amount - coalesce(p.amount,0) as amount
from invoice i
left join payments p
on p.id = i.id
If your payments table can contain more than one payment per invoice, you should group the payments before joining them to invoice.
select
i.id as invoice_id
,i.amount - coalesce(p.amount,0) as amount
from invoice i
left join (select id, sum(amount) as amount from payments group by id) p
on p.id = i.id

SELECT with LEFT JOIN performing math operation twice?

The general idea of what I'm trying to do is this:
Select all planned prices for an order, then subtract from that total all actual prices on that order.
The planned price and actual price are on different tables. When I have a single planned price and a single actual price, this works fine. However, when I have multiple planned prices or multiple actual prices it is giving me odd results as if the algebra is happening multiple times.
Query:
SELECT PL.orderid, (SUM(PL.lineprice) - NVL(SUM(AC.lineprice),0)) AS
Difference FROM plans PL
LEFT JOIN actuals AC ON PL.orderid = AC.orderid
WHERE PL.customer IN (SELECT customer FROM ...)
GROUP BY PL.orderid
ORDER BY PL.orderid;
The results of the query:
Orderid Difference
X-1224 100
X-1226 80
X-1345 70000
X-1351 125000
X-1352 10000
Y-2403 190000
My Plan table looks like this:
Orderid Planned_Price
X-1224 100
X-1226 100
X-1345 105000
X-1351 100000
X-1352 10000
X-1352 50000
Y-2403 25000
Y-2403 100000
And my Actual table this:
Orderid Actual_Price
X-1226 20
X-1345 35000
X-1351 25000
X-1351 50000
X-1352 25000
Y-2403 25000
Y-2403 5000
So it seems to work when I have only a single row in each table, or a single row in plans and no rows in actuals i.e., X-1224, X-1226 and X-1345.
However the results are too high or too low when I have multiple rows, with the same OrderID, in either table i.e., all the rest
I'm stumped as to why this is the case. Any insights are appreciated.
edit: Results I'd like, taking Y-2403 as example: (25000 + 100000) - (25000 + 5000) = 95000. What I'm getting is double that at 190000.
Why is this the case?
Because that is how join works. If you have data like this:
a
1
1
2
2
And b:
b
1
1
1
2
Then the result of a join will have six "1"s and two "2"s.
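The row multiplication is easy to confirm. The sketch below loads the two small lists into an in-memory SQLite database through Python's sqlite3 module and counts the joined rows per value; the column name `v` is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (v INTEGER);
    CREATE TABLE b (v INTEGER);
    INSERT INTO a VALUES (1), (1), (2), (2);
    INSERT INTO b VALUES (1), (1), (1), (2);
""")

# Every row of a pairs with every matching row of b:
# 2 x 3 rows for value 1 and 2 x 1 rows for value 2.
match_counts = dict(conn.execute(
    "SELECT a.v, COUNT(*) FROM a JOIN b ON a.v = b.v GROUP BY a.v"
).fetchall())
print(match_counts)  # {1: 6, 2: 2}
```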
Your question doesn't say what you want for results, but a typical approach is to aggregate before doing the joins.
EDIT:
You seem to want:
select p.orderid,
(p.lineprice - coalesce(a.lineprice, 0)) as Difference
from (select orderid, sum(lineprice) as lineprice
from plans
where customer in (SELECT customer FROM ...)
group by orderid
) p left join
(select orderid, sum(lineprice) as lineprice
from actuals
group by orderid
) a
on p.orderid = a.orderid
order by p.orderid;
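Here is a minimal sketch of aggregating before joining, run against the sample rows from the question with Python's sqlite3 module standing in for Oracle (an assumption, hence COALESCE rather than NVL). The customer filter is left out because the question elides it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plans (orderid TEXT, lineprice INTEGER);
    CREATE TABLE actuals (orderid TEXT, lineprice INTEGER);
    INSERT INTO plans VALUES
        ('X-1224', 100), ('X-1226', 100), ('X-1345', 105000),
        ('X-1351', 100000), ('X-1352', 10000), ('X-1352', 50000),
        ('Y-2403', 25000), ('Y-2403', 100000);
    INSERT INTO actuals VALUES
        ('X-1226', 20), ('X-1345', 35000), ('X-1351', 25000),
        ('X-1351', 50000), ('X-1352', 25000), ('Y-2403', 25000),
        ('Y-2403', 5000);
""")

query = """
    SELECT p.orderid, p.lineprice - COALESCE(a.lineprice, 0) AS difference
    FROM (SELECT orderid, SUM(lineprice) AS lineprice
          FROM plans GROUP BY orderid) p
    LEFT JOIN (SELECT orderid, SUM(lineprice) AS lineprice
               FROM actuals GROUP BY orderid) a
           ON p.orderid = a.orderid
    ORDER BY p.orderid
"""

# Each side is collapsed to one row per order before the join,
# so no row is counted more than once.
differences = dict(conn.execute(query).fetchall())
for orderid, diff in differences.items():
    print(orderid, diff)
```

Y-2403 now comes out at 95000 instead of 190000, and X-1224, which has no actuals, keeps its full planned price.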
I suppose you are looking to compare the summed-up prices by order id in the plan table with the summed-up prices by order id in the actual table?
If so, the following can be done to ensure there are no duplicate entries per order:
select a.orderid
,NVL(max(b.summed_up),0) - sum(a.actual_price) as difference
from actual_table a
left join (select pt.orderid
,sum(pt.planned_price) as summed_up
from planned_table pt
group by pt.orderid
)b
on a.orderid=b.orderid
group by a.orderid
+---------+------------+
| ORDERID | DIFFERENCE |
+---------+------------+
| X-1226 | 80 |
| Y-2403 | 95000 |
| X-1351 | 25000 |
| X-1345 | 70000 |
| X-1352 | 35000 |
+---------+------------+
Here is the dbfiddle link with the data
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=3cacffd19b39ecaf7ad752dff262ac47

SQL: How do I count the number of clients that have already bought the same product?

I have a table like the one below. It is a record of daily featured products and the customers that purchased them (similar to a daily deal site). A given client can only purchase a product one time per feature, but they may purchase the same product if it is featured multiple times.
FeatureID | ClientID | FeatureDate | ProductID
1 1002 2011-05-01 500
1 2333 2011-05-01 500
1 4458 2011-05-01 500
2 8888 2011-05-10 700
2 2333 2011-05-10 700
2 1111 2011-05-10 700
3 1002 2011-05-20 500
3 4444 2011-05-20 500
4 4444 2011-05-30 500
4 2333 2011-05-30 500
4 1002 2011-05-30 500
I want to count by FeatureID the number of clients that purchased FeatureID X AND who purchased the same productID during a previous feature.
For the table above the expected result would be:
FeatureID | CountofReturningClients
1 0
2 0
3 1
4 3
Ideally I would like to do this with SQL, but am also open to doing some manipulation in Excel/PowerPivot. Thanks!!
If you join your table to itself, you can find the data you're looking for. Be careful, because this query can take a long time if the table has a lot of data and is not indexed well.
SELECT t_current.FEATUREID, COUNT(DISTINCT t_prior.CLIENTID)
FROM table_name t_current
LEFT JOIN table_name t_prior
ON t_current.FEATUREDATE > t_prior.FEATUREDATE
AND t_current.CLIENTID = t_prior.CLIENTID
AND t_current.PRODUCTID = t_prior.PRODUCTID
GROUP BY t_current.FEATUREID
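To check the self-join against the expected output in the question, here is a small sketch using Python's sqlite3 module. The table name `purchases` is an assumption (the question does not name its table), and dates are stored as ISO strings so that plain comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases "
    "(FeatureID INTEGER, ClientID INTEGER, FeatureDate TEXT, ProductID INTEGER)"
)
# Sample rows from the question.
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", [
    (1, 1002, "2011-05-01", 500), (1, 2333, "2011-05-01", 500),
    (1, 4458, "2011-05-01", 500), (2, 8888, "2011-05-10", 700),
    (2, 2333, "2011-05-10", 700), (2, 1111, "2011-05-10", 700),
    (3, 1002, "2011-05-20", 500), (3, 4444, "2011-05-20", 500),
    (4, 4444, "2011-05-30", 500), (4, 2333, "2011-05-30", 500),
    (4, 1002, "2011-05-30", 500),
])

query = """
    SELECT cur.FeatureID, COUNT(DISTINCT prior.ClientID) AS returning_clients
    FROM purchases cur
    LEFT JOIN purchases prior
           ON cur.FeatureDate > prior.FeatureDate
          AND cur.ClientID = prior.ClientID
          AND cur.ProductID = prior.ProductID
    GROUP BY cur.FeatureID
"""

# The LEFT JOIN keeps features with no returning clients;
# COUNT(DISTINCT ...) ignores the NULLs they produce.
returning = dict(conn.execute(query).fetchall())
print(returning)
```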
"Per feature, count the clients who match for any earlier Features with the same product"
SELECT
Curr.FeatureID,
COUNT(DISTINCT Prev.ClientID) AS CountofReturningClients --edit thanks to feedback
FROM
MyTable Curr
LEFT JOIN
MyTable Prev ON Curr.FeatureID > Prev.FeatureID
AND Curr.ClientID = Prev.ClientID
AND Curr.ProductID = Prev.ProductID
GROUP BY
Curr.FeatureID
Assumptions: You have a table called Features that is:
FeatureID, FeatureDate, ProductID
If not then you could always create one on the fly with a temporary table, cte or view.
Then:
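SELECT
f.FeatureID
, (
SELECT COUNT(DISTINCT prev.ClientID)
FROM Purchases prev
-- only count clients who also bought during the current feature
JOIN Purchases curr ON curr.ClientID = prev.ClientID AND curr.FeatureID = f.FeatureID
WHERE prev.FeatureDate < f.FeatureDate AND prev.ProductID = f.ProductID
) as CountOfReturningClients
FROM Features f
ORDER BY f.FeatureID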
New to this, but wouldn't the following work?
SELECT FeatureID, (CASE WHEN COUNT(clientid) > 1 THEN COUNT(clientid) ELSE 0 END)
FROM table
GROUP BY featureID

SQL summary by ID with period to period comparison

I am a beginner in SQL, hope someone can help me on this:
I have a Items Category Table:
ItemID | ItemName | ItemCategory | Active/Inactive
100 Carrot Veg Yes
101 Apple Fruit Yes
102 Beef Meat No
103 Pineapple Fruit Yes
And I have a sales table:
Date | ItemID | Sales
01/01/2010 100 50
05/01/2010 101 200
06/01/2010 101 250
06/01/2010 102 300
07/01/2010 103 50
08/01/2010 100 100
10/01/2010 102 250
How Can I achieve a sales summary table by Item By Period as below (with only active item)
ItemID | ItemName | ItemCategory | (01/01/2010 – 07/01/2010) | (08/01/2010 – 14/01/2010)
100 Carrot Veg 50 100
101 Apple Fruit 450 0
103 Pineapple Fruit 0 0
A very dirty solution
SELECT s.ItemId,
(SELECT ItemName FROM Items WHERE ItemId = s.ItemId) ItemName,
ISNULL((SELECT Sum(Sales)FROM sales
WHERE [Date] BETWEEN '2010/01/01' AND '2010/01/07'
AND itemid = s.itemid
GROUP BY ItemId),0) as firstdaterange,
ISNULL((SELECT Sum(Sales)FROM sales
WHERE [Date] BETWEEN '2010/01/08' AND '2010/01/14'
AND itemid = s.itemid
GROUP BY ItemId), 0) seconddaterange
FROM Sales s
INNER JOIN Items i ON s.ItemId = i.ItemId
WHERE i.IsActive = 'Yes'
GROUP BY s.ItemId
Again a dirty solution, also the dates are hardcoded. You can probably turn this into a stored procedure taking in the dates as parameters.
I'm not too clued up on PIVOT command but maybe that will be worth a google.
You can pivot the data using the SQL PIVOT operator. Unfortunately, that operator has limited scope due to the requirement to pre-specify the output columns.
You normally achieve this by grouping on a calculated column (in this case, one that computes the week number or first day of the week in which each row falls). You can then either generate SQL on-the-fly with columns derived using SELECT DISTINCT week FROM result, or just drop the result into Excel and use its pivot table facility.
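As a rough illustration of grouping on a calculated column, the sketch below buckets each sale into a 7-day period counted from 2010-01-01 and sums per active item and period. It uses Python's sqlite3 module, so `julianday` is SQLite-specific; the table and column names (`items`, `sales`, `Active`, `SaleDate`) and the ISO date format are assumptions made for the example. The result is in long form (one row per item and period), ready to be pivoted in SQL or in Excel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (ItemID INTEGER, ItemName TEXT, ItemCategory TEXT, Active TEXT);
    CREATE TABLE sales (SaleDate TEXT, ItemID INTEGER, Sales INTEGER);
    INSERT INTO items VALUES
        (100, 'Carrot', 'Veg', 'Yes'), (101, 'Apple', 'Fruit', 'Yes'),
        (102, 'Beef', 'Meat', 'No'), (103, 'Pineapple', 'Fruit', 'Yes');
    INSERT INTO sales VALUES
        ('2010-01-01', 100, 50), ('2010-01-05', 101, 200),
        ('2010-01-06', 101, 250), ('2010-01-06', 102, 300),
        ('2010-01-07', 103, 50), ('2010-01-08', 100, 100),
        ('2010-01-10', 102, 250);
""")

query = """
    SELECT i.ItemID, i.ItemName,
           -- whole weeks elapsed since the first period's start date
           CAST((julianday(s.SaleDate) - julianday('2010-01-01')) / 7 AS INTEGER)
               AS period,
           SUM(s.Sales) AS total
    FROM items i
    JOIN sales s ON s.ItemID = i.ItemID
    WHERE i.Active = 'Yes'
    GROUP BY i.ItemID, i.ItemName, period
    ORDER BY i.ItemID, period
"""

summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

Period 0 covers 01/01/2010 to 07/01/2010 and period 1 covers 08/01/2010 to 14/01/2010. Because this is an inner join, an item only gets a row for periods in which it actually has sales, so the zeros must be filled in when pivoting.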