How can I sum values based on a date range and group by the MAX date? - sql

I've got a data set like the following, with quantity and sales ($) aggregated by week and product:
Week Product Quantity Sales
---- ------- -------- -----
1 12a 6 600
2 12a 4 400
3 12a 3 300
4 12a 1 100
5 12a 3 300
6 12a 1 100
7 12a 4 400
8 12a 6 600
9 12a 2 200
For every week, I need to sum quantity and sales for that week plus the following 3 weeks.
The desired result would be:
Week Product Quantity Sales
---- ------- -------- -----
1 12a 14 1400 --> Week 1 + Week 2 + Week 3 + Week 4 but row labeled Week 1
2 12a 11 1100
I feel like I need a loop to evaluate each week.

Use window functions:
select t.week,
       t.product,
       sum(quantity) over (partition by product
                           order by week
                           rows between current row and 3 following
                          ) as quantity,
       sum(sales) over (partition by product
                        order by week
                        rows between current row and 3 following
                       ) as sales
from t;
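To check the forward-looking frame against the sample data, here is a minimal sketch that runs the same kind of query in an in-memory SQLite database from Python. It assumes SQLite 3.25 or later (for window functions); the named WINDOW clause is only a shorthand for the repeated frame, and sales is taken as quantity * 100, as in the question's data.

```python
import sqlite3

# In-memory stand-in for the question's table; the name "t" follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (week INTEGER, product TEXT, quantity INTEGER, sales INTEGER)")
weekly_quantities = [6, 4, 3, 1, 3, 1, 4, 6, 2]  # weeks 1..9 from the question
conn.executemany(
    "INSERT INTO t VALUES (?, '12a', ?, ?)",
    [(week, qty, qty * 100) for week, qty in enumerate(weekly_quantities, start=1)],
)

rows = conn.execute(
    """
    SELECT week,
           product,
           SUM(quantity) OVER w AS quantity,
           SUM(sales)    OVER w AS sales
    FROM t
    WINDOW w AS (PARTITION BY product
                 ORDER BY week
                 ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING)
    ORDER BY week
    """
).fetchall()

for row in rows:
    print(row)
```

The first two rows come out as (1, '12a', 14, 1400) and (2, '12a', 11, 1100), matching the desired result. The last three weeks sum fewer than four rows, because the frame is cut off at the end of the partition.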

Related

SQL Running total use case

I have the dataframe below. For each id, I need the date on which the running sum of qty reaches x% of that id's total QTY, and I then need to populate that date against the id. Can someone please help me with the SQL query for this?
Table A
ID QTY
1 10
2 20
3 30
4 40
---
Table B
ID DATE qty
1 01-01-2020 1
1 01-02-2020 2
1 01-03-2020 4
1 01-04-2020 3
The expected output for ID 1 is 01-03-2020, because on that date the running sum of qty exceeds 60% of the QTY in Table A (1 + 2 + 4 = 7, which is more than 60% of the total QTY of 10).
Expected output
ID date_where_qty_>60%
1 01-03-2020
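One possible approach, sketched here with an in-memory SQLite database driven from Python, is a running SUM per id joined to the totals, keeping the earliest date on which the running sum passes the threshold. The table names table_a and table_b and the variable threshold are assumptions for illustration. The sketch also relies on the date strings sorting chronologically as text, which holds for the sample rows but would call for a real date type in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# table_a / table_b are assumed names for "Table A" and "Table B".
conn.executescript(
    """
    CREATE TABLE table_a (id INTEGER, qty INTEGER);
    CREATE TABLE table_b (id INTEGER, date TEXT, qty INTEGER);
    INSERT INTO table_a VALUES (1, 10), (2, 20), (3, 30), (4, 40);
    INSERT INTO table_b VALUES
        (1, '01-01-2020', 1), (1, '01-02-2020', 2),
        (1, '01-03-2020', 4), (1, '01-04-2020', 3);
    """
)

threshold = 0.6  # the "x%" from the question, here 60%
hits = conn.execute(
    """
    SELECT id, MIN(date) AS first_date_over_threshold
    FROM (
        SELECT b.id,
               b.date,
               a.qty AS total_qty,
               -- running sum of qty per id, in date order
               SUM(b.qty) OVER (PARTITION BY b.id ORDER BY b.date) AS running_qty
        FROM table_b AS b
        JOIN table_a AS a ON a.id = b.id
    )
    WHERE running_qty > ? * total_qty
    GROUP BY id
    """,
    (threshold,),
).fetchall()

print(hits)
```

For the sample rows this yields a single row for id 1 with the date 01-03-2020, since the running sums are 1, 3, 7, 10 and 7 is the first to exceed 6.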

How to calculate tiered pricing using PostgreSQL

I'm trying to calculate tiered rates for a stay at some lodging. Let's say we have a weekly, half-week, and daily rate for a property.
period_name | nights | rate
-------------------------------------
WEEK | 7 | 100
HALFWEEK | 3 | 50
DAY | 1 | 25
How would I query this with a total number of nights and get a breakdown of which periods qualify, going from longest to shortest? Some example results:
10 nights
We break 10 into (7 days) + (3 days). The 7 days will be at the WEEK rate (100). The 3 days will be at the HALFWEEK rate (50). Here it qualifies for (1 WEEK # 100) + (1 HALFWEEK # 50)
period_name | nights | rate | num | subtotal
----------------------------------------------
WEEK | 7 | 100 | 1 | 100
HALFWEEK | 3 | 50 | 1 | 50
4 nights
We break 4 into (3 days) + (1 day). The 3 days will be at the HALFWEEK rate (50). The 1 day will be at the DAY rate (25). Here it qualifies for (1 HALFWEEK # 50) + (1 DAY # 25)
period_name | nights | rate | num | subtotal
----------------------------------------------
HALFWEEK | 3 | 50 | 1 | 50
DAY | 1 | 25 | 1 | 25
16 nights
We break 16 into (14 days) + (2 days). The 14 days will be at the WEEK rate (multiplied by 2), (100 * 2). The 2 days will be at the DAY rate (2 x 25). Here it qualifies for (2 WEEK # 100) + (2 DAY # 25)
period_name | nights | rate | num | subtotal
----------------------------------------------
WEEK | 7 | 100 | 2 | 200
DAY | 1 | 25 | 2 | 50
I thought about using the lag window function, but I'm not sure how I'd keep track of the days already applied by the previous period.
You can do this with a recursive CTE (WITH RECURSIVE) query.
http://sqlfiddle.com/#!17/0ac709/1
Tier table (which can be dynamically expanded):
id name days rate
-- --------- ---- ----
1 WEEK 7 100
2 DAYS 1 25
3 HALF_WEEK 3 50
4 MONTH 30 200
Days data:
id num
-- ---
1 10
2 31
3 30
4 19
5 14
6 108
7 3
8 5
9 1
10 2
11 7
Result:
num_id num days total_price
------ --- ----------------------------------------------- -----------
1 10 {"MONTH: 0","WEEK: 1","HALF_WEEK: 1","DAYS: 0"} 150
2 31 {"MONTH: 1","WEEK: 0","HALF_WEEK: 0","DAYS: 1"} 225
3 30 {"MONTH: 1","WEEK: 0","HALF_WEEK: 0","DAYS: 0"} 200
4 19 {"MONTH: 0","WEEK: 2","HALF_WEEK: 1","DAYS: 2"} 300
5 14 {"MONTH: 0","WEEK: 2","HALF_WEEK: 0","DAYS: 0"} 200
6 108 {"MONTH: 3","WEEK: 2","HALF_WEEK: 1","DAYS: 1"} 875
7 3 {"MONTH: 0","WEEK: 0","HALF_WEEK: 1","DAYS: 0"} 50
8 5 {"MONTH: 0","WEEK: 0","HALF_WEEK: 1","DAYS: 2"} 100
9 1 {"MONTH: 0","WEEK: 0","HALF_WEEK: 0","DAYS: 1"} 25
10 2 {"MONTH: 0","WEEK: 0","HALF_WEEK: 0","DAYS: 2"} 50
11 7 {"MONTH: 0","WEEK: 1","HALF_WEEK: 0","DAYS: 0"} 100
The idea:
First I wrote this query to calculate your result for a single value (19):
SELECT
    days / 7 as WEEKS,
    days % 7 / 3 as HALF_WEEKS,
    days % 7 % 3 / 1 as DAYS
FROM
    (SELECT 19 as days) s
Here you can see the recursive structure: a chain of modulo operations, each terminated by an integer division. Because a more generic version seemed necessary, I thought about a recursive one. PostgreSQL's WITH RECURSIVE clause makes this possible:
https://www.postgresql.org/docs/current/static/queries-with.html
So that's the final query:
WITH RECURSIVE days_per_tier(row_no, name, days, rate, counts, mods, num_id, num) AS (
    SELECT
        row_no,
        name,
        days,
        rate,
        num.num / days,
        num.num % days,
        num.id,
        num.num
    FROM (
        SELECT
            *,
            row_number() over (order by days DESC) as row_no -- C
        FROM
            testdata.tiers) tiers, -- A
        (SELECT id, num FROM testdata.numbers) num -- B
    WHERE row_no = 1
    UNION
    SELECT
        days_per_tier.row_no + 1,
        tiers.name,
        tiers.days,
        tiers.rate,
        mods / tiers.days, -- D
        mods % tiers.days, -- E
        days_per_tier.num_id,
        days_per_tier.num
    FROM
        days_per_tier,
        (SELECT
            *,
            row_number() over (order by days DESC) as row_no
        FROM testdata.tiers) tiers
    WHERE days_per_tier.row_no + 1 = tiers.row_no
)
SELECT
    num_id,
    num,
    array_agg(name || ': ' || counts ORDER BY days DESC) as days,
    sum(total_rate_per_tier) as total_price -- G
FROM (
    SELECT
        *,
        rate * counts as total_rate_per_tier -- F
    FROM days_per_tier) s
GROUP BY num_id, num
ORDER BY num_id
The WITH RECURSIVE contains the starting point of the recursion, a UNION, and the recursion part. The starting point simply gets the tiers (A) and numbers (B). To order the tiers by their days I add a row count (C); this is only necessary if the tier ids are not already in the right order, as in my example, which could happen if you add another tier.
The recursion part takes the previous SELECT result (which is stored in days_per_tier) and calculates the next integer division and remainder (D, E). All other columns only carry the original values along (except the increasing row counter, which drives the recursion itself).
After the recursion, the counts and rates are multiplied (F) and then grouped by the original number id, which yields the total sum (G).
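The divide-and-carry-the-remainder chain that the recursion performs can also be sketched outside SQL. The snippet below is only an illustration in Python, not part of the query: the function name price_stay is made up, and the tier rows are copied from the tier table above.

```python
from typing import Dict, List, Tuple

# (name, days, rate) rows copied from the answer's tier table.
TIERS: List[Tuple[str, int, int]] = [
    ("WEEK", 7, 100),
    ("DAYS", 1, 25),
    ("HALF_WEEK", 3, 50),
    ("MONTH", 30, 200),
]


def price_stay(nights: int) -> Tuple[Dict[str, int], int]:
    """Return (count per tier, total price) for a stay of `nights` nights.

    Hypothetical helper that mirrors the recursive query step by step.
    """
    counts: Dict[str, int] = {}
    total = 0
    remaining = nights
    # Longest tier first, mirroring row_number() over (order by days DESC).
    for name, days, rate in sorted(TIERS, key=lambda tier: tier[1], reverse=True):
        # Integer division gives this tier's count (D); the remainder (E)
        # is carried over to the next, shorter tier.
        count, remaining = divmod(remaining, days)
        counts[name] = count
        total += count * rate  # steps F and G folded together
    return counts, total


print(price_stay(19))
```

For 19 nights this gives 2 weeks, 1 half week and 2 days for a total of 300, the same as row 4 of the result table.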
Edit:
Added the rate function and the sqlfiddle link.
Here, what you need to do is first run an SQL query to retrieve all the conditions, and then write a function for your business logic.
For example, I will run the query below against the database:
Select * from table_name order by nights desc
In the result, I get the data sorted by nights in descending order, which means 7 comes first, then 3, then 1.
Then I write a function for the business logic.
Let's suppose I need to price 11 days.
I fetch the first record, which is 7, and check it against 11.
days = 11;
while(days >= 7){// repeat while the remaining days cover this period; same for 3 & 1
    days = days - 7;
    price += price_from_db;
    package += package_from_db;
}
// then fetch the next record and apply the same check to it
Note: I wrote this down as an algorithm rather than language-specific code.

How to calculate a running total that is a distinct sum of values

Consider this dataset:
id site_id type_id value date
------- ------- ------- ------- -------------------
1 1 1 50 2017-08-09 06:49:47
2 1 2 48 2017-08-10 08:19:49
3 1 1 52 2017-08-11 06:15:00
4 1 1 45 2017-08-12 10:39:47
5 1 2 40 2017-08-14 10:33:00
6 2 1 30 2017-08-09 07:25:32
7 2 2 32 2017-08-12 04:11:05
8 3 1 80 2017-08-09 19:55:12
9 3 2 75 2017-08-13 02:54:47
10 2 1 25 2017-08-15 10:00:05
I would like to construct a query that returns a running total for each date by type. I can get close with a window function, but I only want the latest value for each site to be summed for the running total (a simple window function will not work because it sums all values up to a date--not just the last values for each site). So I guess it could be better described as a running distinct total?
The result I'm looking for would be like this:
type_id date sum
------- ------------------- -------
1 2017-08-09 06:49:47 50
1 2017-08-09 07:25:32 80
1 2017-08-09 19:55:12 160
1 2017-08-11 06:15:00 162
1 2017-08-12 10:39:47 155
1 2017-08-15 10:00:05 150
2 2017-08-10 08:19:49 48
2 2017-08-12 04:11:05 80
2 2017-08-13 02:54:47 155
2 2017-08-14 10:33:00 147
The key here is that the sum is not a running sum. It should only be the sum of the most recent values for each site, by type, at each date. I think I can help explain it by walking through the result set I've provided above. For my explanation, I'll walk through the original data chronologically and try to explain the expected result.
The first row of the result starts us off, at 2017-08-09 06:49:47, where chronologically, there is only one record of type 1 and it is 50, so that is our sum for 2017-08-09 06:49:47.
The second row of the result is at 2017-08-09 07:25:32, at this point in time we have 2 unique sites with values for type_id = 1. They have values of 50 and 30, so the sum is 80.
The third row of the result occurs at 2017-08-09 19:55:12, where now we have 3 sites with values for type_id = 1. 50 + 30 + 80 = 160.
The fourth row is where it gets interesting. At 2017-08-11 06:15:00 there are 4 records with a type_id = 1, but 2 of them are for the same site. I'm only interested in the most recent value for each site so the values I'd like to sum are: 30 + 80 + 52 resulting in 162.
The 5th row is similar to the 4th, since the value for site_id:1, type_id:1 has changed again and is now 45. The latest values for type_id:1 at 2017-08-12 10:39:47 are therefore: 30 + 80 + 45 = 155.
Reviewing the 6th row is also interesting when we consider that at 2017-08-15 10:00:05, site 2 has a new value for type_id 1, which gives us: 80 + 45 + 25 = 150 for 2017-08-15 10:00:05.
You can get a cumulative total (running total) by including an ORDER BY clause in your window definition.
select
    type_id,
    date,
    sum(value) over (partition by type_id order by date) as sum
from your_table;
The ORDER BY works because of the default window frame:
The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
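SELECT type_id,
       date,
       SUM(delta) OVER (PARTITION BY type_id ORDER BY date) AS sum
FROM (
    -- delta = change against the same site's previous value; a site's first value counts in full
    SELECT type_id,
           date,
           value - COALESCE(LAG(value) OVER (PARTITION BY type_id, site_id ORDER BY date), 0) AS delta
    FROM your_table
) d
ORDER BY type_id,
         date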
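One way to get the "running distinct total" is to turn each row into a delta against the same site's previous value (via LAG) and then take an ordinary running sum of those deltas per type. The sketch below tries this on the question's data in an in-memory SQLite database from Python. It assumes SQLite 3.25 or later, and the column alias running_total is just an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table "
    "(id INTEGER, site_id INTEGER, type_id INTEGER, value INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 50, "2017-08-09 06:49:47"),
        (2, 1, 2, 48, "2017-08-10 08:19:49"),
        (3, 1, 1, 52, "2017-08-11 06:15:00"),
        (4, 1, 1, 45, "2017-08-12 10:39:47"),
        (5, 1, 2, 40, "2017-08-14 10:33:00"),
        (6, 2, 1, 30, "2017-08-09 07:25:32"),
        (7, 2, 2, 32, "2017-08-12 04:11:05"),
        (8, 3, 1, 80, "2017-08-09 19:55:12"),
        (9, 3, 2, 75, "2017-08-13 02:54:47"),
        (10, 2, 1, 25, "2017-08-15 10:00:05"),
    ],
)

rows = conn.execute(
    """
    SELECT type_id,
           date,
           SUM(delta) OVER (PARTITION BY type_id ORDER BY date) AS running_total
    FROM (
        -- A site's first value counts in full; later values only add the change.
        SELECT type_id,
               date,
               value - COALESCE(LAG(value) OVER (PARTITION BY type_id, site_id
                                                 ORDER BY date), 0) AS delta
        FROM your_table
    )
    ORDER BY type_id, date
    """
).fetchall()

for row in rows:
    print(row)
```

The totals come out as 50, 80, 160, 162, 155, 150 for type 1 and 48, 80, 155, 147 for type 2, which is the result set the question asks for.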

Subtract nonconsecutive values in same row in t-SQL

I have a data table that has annual data points and quarterly data points. I want to subtract the quarterly data points from the corresponding prior annual entry, e.g. Annual 2014 - Q3 2014, using t-SQL. I have an id variable for each entry, plus a reconcile id variable that shows which quarterly entry corresponds to which annual entry. See below:
CurrentDate  PreviousDate  Value  Entry Id  Reconcile Id  Annual/Quarterly
9/30/2012    9/30/2011     112    2         3             Annual
9/30/2013    9/30/2012     123    1         2             Annual
9/30/2014    9/30/2013     123.5  9         1             Annual
12/31/2013   9/30/2014     124    4         1             Quarterly
3/31/2014    12/31/2013    124.5  5         1             Quarterly
6/30/2014    3/31/2014     125    6         1             Quarterly
9/30/2014    6/30/2014     125.5  7         1             Quarterly
12/31/2014   9/30/2014     126    10        9             Quarterly
3/31/2015    12/31/2014    126.5  11        9             Quarterly
6/30/2015    3/31/2015     127    12        9             Quarterly
For example, Reconcile ID 9 for the quarterly entries corresponds to Entry ID 9, which is an annual entry.
I have code to just subtract the prior entry from the current entry, but I cannot figure out how to subtract quarterly entries from annual entries where the Entry ID and Reconcile ID are the same.
Here is the code I am using, which is resulting in the right calculation, but increasing the number of results by many rows. I have also tried this as an inner join. I only want the original 10 rows, plus a new difference column:
SELECT DISTINCT T1.[EntryID]
, [T1].[RECONCILEID]
, [T1].[CurrentDate]
, [T1].[Annual_Quarterly]
, [T1].[Value]
, [T1].[Value]-T2.[Value] AS Difference
FROM Table T1
LEFT JOIN Table T2 ON T2.EntryID = T1.RECONCILEID;
Your code should be fine; here are the results I'm getting:
EntryId  Annual_Quarterly  CurrentDate  ReconcileId  Value  recVal  diff
2        Annual            9/30/2012    3            112
1        Annual            9/30/2013    2            123    112     11
9        Annual            9/30/2014    1            123.5  123     0.5
4        Quarterly         12/31/2013   1            124    123     1
5        Quarterly         3/31/2014    1            124.5  123     1.5
6        Quarterly         6/30/2014    1            125    123     2
7        Quarterly         9/30/2014    1            125.5  123     2.5
10       Quarterly         12/31/2014   9            126    123.5   2.5
11       Quarterly         3/31/2015    9            126.5  123.5   3
12       Quarterly         6/30/2015    9            127    123.5   3.5
with your data and this SQL:
SELECT
    tr.EntryId,
    tr.Annual_Quarterly,
    tr.CurrentDate,
    tr.ReconcileId,
    tr.Value,
    te.Value AS recVal,
    tr.[VALUE]-te.[VALUE] AS diff
FROM
    t AS tr LEFT JOIN
    t AS te ON
        tr.ReconcileId = te.EntryId
ORDER BY
    tr.Annual_Quarterly,
    tr.CurrentDate;
Your question is a bit vague as far as how you're wanting to subtract these values, but this should give you some idea.
Select T1.*, T1.Value - Coalesce(T2.Value, 0) As Difference
From Table T1
Left Join Table T2 On T2.[Entry Id] = T1.[Reconcile Id]
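As a quick check that the self-join keeps one output row per input row, here is a small sketch using an in-memory SQLite database from Python. The table name entries and its column names are placeholders, not the real schema. With a unique entry id, the LEFT JOIN returns exactly the original 10 rows; extra rows would only appear if an entry id occurred more than once in the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "entries" and its column names are placeholders for the question's table.
conn.execute(
    "CREATE TABLE entries (cur_date TEXT, value REAL, entry_id INTEGER, "
    "reconcile_id INTEGER, period TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?, ?)",
    [
        ("9/30/2012", 112, 2, 3, "Annual"),
        ("9/30/2013", 123, 1, 2, "Annual"),
        ("9/30/2014", 123.5, 9, 1, "Annual"),
        ("12/31/2013", 124, 4, 1, "Quarterly"),
        ("3/31/2014", 124.5, 5, 1, "Quarterly"),
        ("6/30/2014", 125, 6, 1, "Quarterly"),
        ("9/30/2014", 125.5, 7, 1, "Quarterly"),
        ("12/31/2014", 126, 10, 9, "Quarterly"),
        ("3/31/2015", 126.5, 11, 9, "Quarterly"),
        ("6/30/2015", 127, 12, 9, "Quarterly"),
    ],
)

rows = conn.execute(
    """
    SELECT cur.entry_id,
           cur.value,
           cur.value - rec.value AS difference
    FROM entries AS cur
    -- rec is the entry that cur's reconcile_id points at (NULL if none exists)
    LEFT JOIN entries AS rec ON rec.entry_id = cur.reconcile_id
    ORDER BY cur.entry_id
    """
).fetchall()

for row in rows:
    print(row)
```

Entry 2 points at reconcile id 3, which does not exist in the sample, so its difference stays NULL (None in Python) unless it is wrapped in COALESCE as in the query above.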

sql Query on effective date

I would like a report of the drinks purchased over a whole month. The price of a drink can change at any time during the month, and the report should reflect those price changes.
I have two tables:
SELECT [ID]
,[DrinkID]
,[UserID]
,[qty]
,[DateTaken]
FROM [Snacks].[dbo].[DrinkHistory]
[DrinkHistory]:
ID DrinkID UserID qty DateTaken
----------------------------------------------------------------------
1 1 1 1 2014-05-10
2 1 1 2 2014-05-15
3 2 1 1 2014-06-01
4 2 1 4 2014-06-01
5 1 1 3 2014-05-20
6 1 1 4 2014-05-30
[DrinkPricesEffect]:
PriceID DrinkID DrinkPrice PriceEffectiveDate IsCurrent
-----------------------------------------------------------------------------------
1 1 10.00 2014-05-01 1
2 1 20.00 2014-05-20 1
3 2 9.00 2014-06-01 1
4 2 8.00 2014-01-01 1
5 1 30.00 2014-05-25 1
6 1 40.00 2014-05-28 1
I would like the result below, for DateTaken between 2014-05-01 and 2014-05-31:
DrinkId Qty Price DateTaken PriceEffectiveDate
-----------------------------------------------------------------------
1 1 10 2014-05-10 2014-05-01
1 2 10 2014-05-15 2014-05-01
1 3 20 2014-05-20 2014-05-20
1 4 40 2014-05-30 2014-05-28
Is there anyone who can give me some idea or write the query for me?
If your drink price can change at any time in a month, you could additionally save the price with each purchase. I would add a column [PricePaid] to the table [DrinkHistory].
When adding a record to [DrinkHistory], the price of the drink at that moment is known, but it might change later, so you save the current price to the history.
Then for your result you could just display the whole [DrinkHistory]:
SELECT * FROM DrinkHistory;
This should work:
Select
    DH.DrinkId,
    DH.Qty,
    DPE.DrinkPrice AS Price,
    DH.DateTaken,
    DPE.PriceEffectiveDate
FROM DrinkHistory DH
JOIN DrinkPricesEffect DPE ON DPE.PriceID =
(
    Select Top 1 PriceID FROM
    (
        Select PriceID, RANK() OVER(ORDER BY PriceEffectiveDate DESC) AS rnk
        FROM DrinkPricesEffect
        WHERE DH.DrinkId = DrinkId AND
              DH.DateTaken >= PriceEffectiveDate
    ) SubQ WHERE rnk = 1
)
WHERE DH.DateTaken Between '2014-05-01' AND '2014-05-31'
Here you can find the SQL Fiddle link: http://sqlfiddle.com/#!6/5f8fb/26/0
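The same effective-date lookup can be tried locally with an in-memory SQLite database from Python. This is a sketch under a few assumptions: SQLite has no TOP, so the RANK/TOP 1 subquery is swapped for a correlated MAX(PriceEffectiveDate) subquery, and the dates are stored as ISO text, which compares correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DrinkHistory
        (ID INTEGER, DrinkID INTEGER, UserID INTEGER, qty INTEGER, DateTaken TEXT);
    CREATE TABLE DrinkPricesEffect
        (PriceID INTEGER, DrinkID INTEGER, DrinkPrice REAL,
         PriceEffectiveDate TEXT, IsCurrent INTEGER);
    INSERT INTO DrinkHistory VALUES
        (1, 1, 1, 1, '2014-05-10'), (2, 1, 1, 2, '2014-05-15'),
        (3, 2, 1, 1, '2014-06-01'), (4, 2, 1, 4, '2014-06-01'),
        (5, 1, 1, 3, '2014-05-20'), (6, 1, 1, 4, '2014-05-30');
    INSERT INTO DrinkPricesEffect VALUES
        (1, 1, 10.0, '2014-05-01', 1), (2, 1, 20.0, '2014-05-20', 1),
        (3, 2,  9.0, '2014-06-01', 1), (4, 2,  8.0, '2014-01-01', 1),
        (5, 1, 30.0, '2014-05-25', 1), (6, 1, 40.0, '2014-05-28', 1);
    """
)

rows = conn.execute(
    """
    SELECT dh.DrinkID, dh.qty, dpe.DrinkPrice, dh.DateTaken, dpe.PriceEffectiveDate
    FROM DrinkHistory AS dh
    JOIN DrinkPricesEffect AS dpe ON dpe.DrinkID = dh.DrinkID
    WHERE dh.DateTaken BETWEEN '2014-05-01' AND '2014-05-31'
      -- keep only the latest price that was already in effect on DateTaken
      AND dpe.PriceEffectiveDate = (
            SELECT MAX(p.PriceEffectiveDate)
            FROM DrinkPricesEffect AS p
            WHERE p.DrinkID = dh.DrinkID
              AND p.PriceEffectiveDate <= dh.DateTaken
          )
    ORDER BY dh.DateTaken
    """
).fetchall()

for row in rows:
    print(row)
```

On the sample data this returns the four May purchases priced at 10, 10, 20 and 40, matching the desired result in the question. If two prices for the same drink shared an effective date, this version would return both rows, so a tie-breaker on PriceID would be needed in that case.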