Calculating progressive pricing in PostgreSQL

I need to calculate revenue based on how many items a user has.
For example: the first 10 items are free, items up to 100 cost 0.50 each, up to 200 cost 0.25, and up to 500 cost 0.15.
I have no idea where to start with this, can I get some direction please?
EG. If a user has 365 items, this would be (10 * 0) + (90 * 0.5) + (100 * 0.25) + (165 * 0.15)
Ideally I'd be doing this in python or something, but the dashboarding tool doesn't have that capability...
EDIT:
I should have mentioned that the number of items isn't actually the number they have, it's the limit they have chosen. The limit is saved as a single number in a subscription event. So for each user I will have an integer representing their max items eg. 365

First, number the items using the window function row_number(); then use a CASE expression to assign the proper price to each item.
Simple example: http://sqlfiddle.com/#!17/32e4a/9
SELECT user_id,
       SUM(CASE
             WHEN rn <= 10  THEN 0
             WHEN rn <= 100 THEN 0.5
             WHEN rn <= 200 THEN 0.25
             WHEN rn <= 500 THEN 0.15
             ELSE 0.05
           END) AS revenue
FROM (
    SELECT *,
           row_number() OVER (PARTITION BY user_id ORDER BY item_no) AS rn
    FROM mytable
) x
GROUP BY user_id
I should have mentioned that the number of items isn't actually the
number they have, it's the limit they have chosen. The limit is saved
as a single number in a subscription event. So for each user I will
have an integer representing their max items eg. 365
In this case the query below probably fits your needs:
Demo: http://sqlfiddle.com/#!17/e7a6a/2
SELECT *,
       (SELECT SUM(CASE
                     WHEN rn <= 10  THEN 0
                     WHEN rn <= 100 THEN 0.5
                     WHEN rn <= 200 THEN 0.25
                     WHEN rn <= 500 THEN 0.15
                     ELSE 0.05
                   END)
        FROM generate_series(1, t.user_limit) rn
       ) AS revenue
FROM mytab t;
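The tier arithmetic is easier to sanity-check outside SQL. Here is a minimal Python sketch of the same progressive pricing; the tier table and the function name `revenue_for_limit` are illustrative, not from the original post:

```python
# Hypothetical tier table mirroring the question: (upper bound, price per item).
# First 10 items free; 11-100 cost 0.50; 101-200 cost 0.25; 201-500 cost 0.15;
# anything above 500 uses the catch-all rate 0.05.
TIERS = [(10, 0.0), (100, 0.50), (200, 0.25), (500, 0.15), (float("inf"), 0.05)]

def revenue_for_limit(limit):
    """Sum the per-item price over every item number from 1 to `limit`."""
    total = 0.0
    prev_bound = 0
    for upper, price in TIERS:
        # Number of items that fall inside this tier.
        in_tier = max(0, min(limit, upper) - prev_bound)
        total += in_tier * price
        prev_bound = upper
    return total
```

For the question's example of 365 items this yields 10*0 + 90*0.5 + 100*0.25 + 165*0.15, matching the SELECT above.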

Related

How would I select multiple summed columns each with their own condition in Postgres?

Basically:
I have a number representing an amount of time in minutes, we'll call my_minutes
I have an aircraft type that any record within the table must first match on to be qualified (WHERE)
There are 12 monthly minutes columns, one per month: month_01_minutes, month_02_minutes, month_03_minutes...
If a particular month's value is within +/- 10% of the provided number of minutes (my_minutes), add it to the sum
If the value isn't within +/- 10% of my_minutes, that select column should contribute 0
At the end, I'd like to sum up each of the selected values for a grand total of everything
select 
   sum( if(month_01_minutes <= 0.9 * my_minutes and month_01_minutes >= 1.1 * my_minutes, month_01_minutes, 0 ) ),
sum( if(month_02_minutes <= 0.9 * my_minutes and month_02_minutes >= 1.1 * my_minutes, month_02_minutes, 0 ) ),
where tableName.aircraft_type = providedAircraftType
Table with all of the minute data
I've tried it with just one column, but the query below returns zero even though there is a row whose value is within +/- 10% of 225:
select
sum(case when c.month_01_minutes <= 0.9 * 225 and c.month_01_minutes >= 1.1 * 225 then c.month_01_minutes else 0 end)
from fumes_schema.consumption c
where c.aircraft_type = 'xyz';
The predicate is wrong: a number cannot be below 90% of the target and above 110% of it at the same time, so the condition always evaluates to false. For "within +/- 10%" the comparisons must point the other way:
select
    sum(case when c.month_01_minutes >= 0.9 * 225 and c.month_01_minutes <= 1.1 * 225
             then c.month_01_minutes else 0 end)
from fumes_schema.consumption c
where c.aircraft_type = 'xyz';
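For reference, the intended "within +/- 10%" test is just a two-sided bound with the comparisons pointing outward. A small Python sketch of the same check (the function names are illustrative):

```python
def within_band(value, target, tolerance=0.10):
    """True when value lies within +/- tolerance of target
    (lower bound <= value <= upper bound, not the inverted comparison)."""
    return (1 - tolerance) * target <= value <= (1 + tolerance) * target

def banded_sum(values, target):
    """Sum only the values inside the band, mirroring the CASE expression."""
    return sum(v for v in values if within_band(v, target))
```

With target 225, only values between 202.5 and 247.5 contribute to the sum.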

SQL - Calculate percentage by group, for multiple groups

I have a table in GBQ in the following format :
UserId Orders Month
XDT 23 1
XDT 0 4
FKR 3 6
GHR 23 4
... ... ...
It shows the number of orders per user and month.
I want to calculate the percentage of users who have orders, I did it as following :
SELECT HasOrders,
       ROUND(COUNT(*) * 100 / CAST(SUM(COUNT(*)) OVER () AS float64), 2) AS Parts
FROM (
    SELECT *,
           CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
    FROM `Table`
)
GROUP BY HasOrders
ORDER BY Parts
It gives me the following result:
HasOrders Parts
0 35
1 65
I need to calculate the percentage of users who have orders, by month, in a way that every month = 100%
Currently to do this I execute the query once per month, which is not practical :
SELECT HasOrders,
       ROUND(COUNT(*) * 100 / CAST(SUM(COUNT(*)) OVER () AS float64), 2) AS Parts
FROM (
    SELECT *,
           CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
    FROM `Table`
)
WHERE Month = 1
GROUP BY HasOrders
ORDER BY Parts
Is there a way execute a query once and have this result ?
HasOrders Parts Month
0 25 1
1 75 1
0 45 2
1 55 2
... ... ...
SELECT SIGN(Orders) AS HasOrders,
       ROUND(COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month), 2) AS Parts,
       Month
FROM T
GROUP BY Month, SIGN(Orders)
ORDER BY Month, SIGN(Orders)
Demo on Postgres:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4cd2d1455673469c2dfc060eccea8020
You've stated that it's important for each month's total to be exactly 100%. When a percentage falls precisely on an odd multiple of 0.5%, you might therefore round down for "no orders" and up for "has orders". Rounding toward even, or rounding the smaller share down, are other options:
WITH DATA AS (
    SELECT SIGN(Orders) AS HasOrders, Month,
           COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month) AS PartsPercent
    FROM T
    GROUP BY Month, SIGN(Orders)
)
SELECT HasOrders, Month, PartsPercent,
       PartsPercent - TRUNC(PartsPercent) AS Fraction,
       CASE WHEN HasOrders = 0
            THEN FLOOR(PartsPercent) ELSE CEILING(PartsPercent)
       END AS PartsRound0Down,
       CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5
                 AND MOD(TRUNC(PartsPercent), 2) = 0
            THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
       END AS PartsRoundTowardEven,
       CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5 AND PartsPercent < 50
            THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
       END AS PartsSmallestTowardZero
FROM DATA
ORDER BY Month, HasOrders
It's usually not advisable to test floating-point values for equality, and I don't know how BigQuery's float64 behaves in the comparison against 0.5; one half is, however, exactly representable in binary. The fiddle below shows these variants for a case where the breakout is 101 vs 99. I don't have immediate access to BigQuery, so be aware that Postgres's rounding behavior is different:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=c8237e272427a0d1114c3d8056a01a09
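To see the "round down for no-orders, up for has-orders" strategy concretely, here is a small Python sketch that uses exact fractions to sidestep the float-equality concern entirely (the function names are illustrative):

```python
from fractions import Fraction
import math

def parts(count, total):
    """Exact percentage share as a Fraction, avoiding float equality tests."""
    return Fraction(count * 100, total)

def round_pair(no_orders, has_orders):
    """Round the two shares so they always total exactly 100:
    floor for the 'no orders' share, ceiling for the 'has orders' share."""
    total = no_orders + has_orders
    p0, p1 = parts(no_orders, total), parts(has_orders, total)
    return math.floor(p0), math.ceil(p1)
```

For the 99 vs 101 breakout the exact shares are 49.5% and 50.5%, which this scheme rounds to 49 and 51, still summing to 100.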
Consider the approach below:
select hasOrders, round(100 * parts, 2) as parts, month
from (
    select month,
           countif(orders = 0) / count(*) as `0`,
           countif(orders > 0) / count(*) as `1`
    from your_table
    group by month
)
unpivot (parts for hasOrders in (`0`, `1`))
with output like below
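The per-month normalisation itself can be sketched in plain Python to check the expected numbers (the sample rows below are made up for illustration):

```python
from collections import defaultdict

# Hypothetical rows mimicking the question's table: (user_id, orders, month).
rows = [("XDT", 23, 1), ("ABC", 0, 1), ("FKR", 3, 2), ("GHR", 0, 2), ("JKL", 5, 2)]

def parts_by_month(rows):
    """Percentage of users without/with orders, normalised so each month sums to 100."""
    counts = defaultdict(lambda: [0, 0])  # month -> [no-orders count, has-orders count]
    for _, orders, month in rows:
        counts[month][1 if orders > 0 else 0] += 1
    return {
        month: (round(100 * none / (none + some), 2), round(100 * some / (none + some), 2))
        for month, (none, some) in counts.items()
    }
```

This is exactly what `SUM(COUNT(*)) OVER (PARTITION BY Month)` provides as the denominator in the SQL version.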

How to find value in a range of following rows - SQL Teradata

I have a table with the following columns:
account, validity_date,validity_month,amount.
For each row I want to check whether the value in the "amount" column also appears among the rows of the following month: if yes, indicator = 1, else 0.
account validity_date validity_month amount **required_column**
------- ------------- --------------- ------- ----------------
123 15oct2019 201910 400 0
123 20oct2019 201910 500 1
123 15nov2019 201911 1000 0
123 20nov2019 201911 500 0
123 20nov2019 201911 2000 1
123 15dec2019 201912 400
123 15dec2019 201912 2000
Can anyone help?
Thanks
validity_month/100*12 + validity_month MOD 100 calculates a continuous month number (so months compare correctly across years, e.g. January follows the previous December), and the inner ROW_NUMBER reduces multiple rows with the same amount in a month to a single row (a kind of DISTINCT):
SELECT dt.*
,CASE -- next row is from next month
WHEN Lead(nextMonth IGNORE NULLS)
Over (PARTITION BY account, amount
ORDER BY validity_date)
= (validity_month/100*12 + validity_month MOD 100) +1
THEN 1
ELSE 0
END
FROM
(
SELECT t.*
,CASE -- one row per account/month/amount
WHEN Row_Number()
Over (PARTITION BY account, amount, validity_month
ORDER BY validity_date ) = 1
THEN validity_month/100*12 + validity_month MOD 100
END AS nextMonth
FROM tab AS t
) AS dt
Edit:
The previous is for exact matching amounts, for a range match the query is probably very hard to write with OLAP-functions, but easy with a Correlated Subquery:
SELECT t.*
,CASE
WHEN
( -- check if there's a row in the next month matching the current amount +/- 10 percent
SELECT Count(*)
FROM tab AS t2
WHERE t2.account = t.account
AND (t2.validity_month/100*12 + t2.validity_month MOD 100)
= ( t.validity_month/100*12 + t.validity_month MOD 100) +1
AND t2.amount BETWEEN t.amount * 0.9 AND t.amount * 1.1
) > 0
THEN 1
ELSE 0
END
FROM tab AS t
But then performance might be really bad...
Assuming the values are unique within a month and you have a value for each month for each account, you can simplify this to:
select t.*,
(case when lead(seqnum) over (partition by account, amount order by validity_month) = seqnum + 1
then 1 else 0
end)
from (select t.*,
dense_rank() over (partition by account order by validity_month) as seqnum
from t
) t;
Note: This puts 0 for the last month rather than NULL, but that can easily be adjusted.
You can do this without the subquery by using month arithmetic. It is not clear what the data type of validity_month is. If I assume a number:
select t.*,
(case when lead(floor(validity_month / 100) * 12 + (validity_month mod 100)
      ) over (partition by account, amount order by validity_month) =
      floor(validity_month / 100) * 12 + (validity_month mod 100) + 1
      then 1 else 0
 end)
from t;
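The month arithmetic that both answers rely on is easy to verify in isolation. A small Python sketch, assuming validity_month is a YYYYMM integer:

```python
def month_index(yyyymm):
    """Map a YYYYMM integer to a continuous month count so that
    consecutive calendar months differ by exactly 1, even across years."""
    return (yyyymm // 100) * 12 + (yyyymm % 100)

def is_next_month(current, candidate):
    """True when candidate is the calendar month right after current."""
    return month_index(candidate) == month_index(current) + 1
```

The year boundary is the case worth checking: 201912 and 202001 differ by 89 as raw integers but by exactly 1 as month indexes.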
Just to add another way to do this using Standard SQL. This query will return 1 when the condition is met, 0 when it is not, and null when there isn't a next month to evaluate (as implied in your result column).
It is assumed that we're partitioning on the account field. Also includes a 10% range match on the amount field based on the comment made. Note that if you have an id field, you should include it (if two rows have the same account, validity_date, validity_month, amount there will only be one resulting row, due to DISTINCT).
Performance-wise, this should be similar to the answer from @dnoeth.
SELECT DISTINCT
t1.account,
t1.validity_date,
t1.validity_month,
t1.amount,
CASE
WHEN t2.amount IS NOT NULL THEN 1
WHEN MAX(t1.validity_month) OVER (PARTITION BY t1.account) > t1.validity_month THEN 0
ELSE NULL
END AS flag
FROM `project.dataset.table` t1
LEFT JOIN `project.dataset.table` t2
ON
t2.account = t1.account AND
DATE_DIFF(
PARSE_DATE("%Y%m", CAST(t2.validity_month AS STRING)),
PARSE_DATE("%Y%m", CAST(t1.validity_month AS STRING)),
MONTH
) = 1 AND
t2.amount BETWEEN t1.amount * 0.9 AND t1.amount * 1.1;

sql (beginner) - use value calculated from above cell

EDIT
the values in the table can be negative numbers (sorry for the oversight when asking the question)
Having exhausted all search efforts, I am very stuck with the following:
I would like to calculate a running total based on the initial value. For instance:
My table would look like:
Year Percent Constant
==== ===== ========
2000 1.40 100
2001 -1.08 100
2002 1.30 100
And the desired results would be:
Year Percent Constant RunningTotal
==== ====== ======== ============
2000 1.40 100 140
2001 -1.08 100 128.8
2002 1.30 100 167.44
Taking the calculated value of 1.40 * 100 and then applying the next line's percent, -1.08, and so on.
I am using Sql Server 2012. I've looked into using a common table expression, but can't seem to get the correct syntax sadly.
In SQL Server 2012+, you would use a cumulative sum:
select t.*,
(const * sum(1 + percent / 100) over (order by year)) as rolling_sum
from t
order by t.year;
EDIT:
Oops, I notice you really seem to want a cumulative product. Assuming 1 + percent / 100 is always positive, you can just use logs:
select t.*,
(const * exp(sum(log(1 + percent / 100)) over (order by year))) as rolling_product
from t
order by t.year;
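The exp-of-summed-logs trick works because exp(log a + log b) = a * b, so a running sum of logs becomes a running product. A Python sketch using factors that reproduce the question's desired output (1.40, 0.92 and 1.30 follow the question's convention, where the negative percent -1.08 shrinks the total by 8%):

```python
import math

def running_product(constant, factors):
    """Cumulative product implemented as exp of a running log-sum --
    the same trick as const * EXP(SUM(LOG(f)) OVER (ORDER BY year)).
    Every factor must be positive for log() to be defined."""
    log_sum = 0.0
    out = []
    for f in factors:
        log_sum += math.log(f)
        out.append(constant * math.exp(log_sum))
    return out
```

Starting from the constant 100, the factors 1.40, 0.92, 1.30 give the desired 140, 128.8, 167.44 sequence (up to floating-point noise).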
You can accomplish this task using a recursive CTE
;WITH values_cte AS (
SELECT [Year]
,[Percent]
,[Constant]
,CASE WHEN [v].[Percent] < 0 THEN
[v].[Constant] - (([v].[Percent] + 1) * [v].[Constant])
ELSE
[v].[Percent] * [v].[Constant]
END
AS [RunningTotal]
FROM [#tmp_Values] v
WHERE [v].[Year] = 2000
UNION ALL
SELECT v2.[Year]
,v2.[Percent]
,v2.[Constant]
,CASE WHEN [v2].[Percent] < 0 THEN
[v].[RunningTotal] + (([v2].[Percent] + 1) * [v].[RunningTotal])
ELSE
[v2].[Percent] * [v].[RunningTotal]
END
AS [RunningTotal]
FROM values_cte v
INNER JOIN [#tmp_Values] v2 ON v2.[Year] = v.[Year] + 1
)
SELECT *
FROM [values_cte]
Use the LEAD window function:
SELECT
    Year,
    Percent,
    Constant,
    Percent * Constant * (LEAD(Percent) OVER (ORDER BY Year)) AS RunningTotal
FROM YourTable
LEAD is available starting with SQL Server 2012.

SQL need to apply a variable rate if values occur over consecutive time periods

I have a table that looks like this:
Within the query I need to find the maximum Import value that occurs over two consecutive time periods (rows) where the value is greater than a defined threshold, and apply a rate. If it happens over more than two consecutive periods, a different rate is used.
Threshold = 1000
Rate 1 (2 consecutive) = 100
Rate 2 (> 2 consecutive) = 200
Id DateTime Import Export Total
1 2016-01-13 00:00 1000 500 1500
2 2016-01-13 00:15 2500 100 3000
3 2016-01-13 00:30 1900 200 2100
4 2016-01-13 01:00 900 100 1200
Ids 2 and 3 are > Threshold, so the query should return the MIN of those values (2500, 1900) = 1900, minus the Threshold (1000) = 900. Apply the rate: Rate1 * 900 = 90,000.
If we change the value of Id 4 to 1200, then the MIN value would be 1200. Less the threshold = 200. 200 * Rate2 = 40,000.
Any help would be greatly appreciated!
Update after feedback. My challenge appears to be that I'm not grabbing the 2nd highest value. Here is an example of the dataset:
Dataset example
I added another var to shrink the list down to test gap and island portion. Here is a smaller subset:
Subset
Here is the code:
WITH CTE AS (
    SELECT LogTable.[LocalTimestamp] AS thetime,
           LogTable.[SystemImport] AS import,
           LogTable.[Id] - ROW_NUMBER() OVER (ORDER BY LogTable.[Id]) AS grp
    FROM {System_KWLogRaw} LogTable
    WHERE LogTable.[SystemImport] BETWEEN @DemandThreshold AND @In1
      AND DATEPART(year, @inDate) = DATEPART(year, LogTable.[LocalTimestamp])
      AND DATEPART(month, @inDate) = DATEPART(month, LogTable.[LocalTimestamp])
      AND DATEPART(day, @inDate) = DATEPART(day, LogTable.[LocalTimestamp])
),
counted AS (
    SELECT *, COUNT(*) OVER (PARTITION BY grp) AS cnt
    FROM CTE
)
SELECT MAX(counted.import) AS again1
FROM counted
WHERE cnt > 3
  AND counted.import < (SELECT MAX(counted.import) FROM counted)
This returns 3555.53 instead of 3543.2 which is the 2nd highest value
This will do what you're asking for:
with x as (
select
t1.Id,
t1.DateTime,
t1.Import,
t1.Export,
t1.Total,
count(t2.Import) over (partition by 1) as [QualifyingImports],
min(t2.Import) over (partition by 1) as [MinQualifyingImport]
from
myTable t1
left join myTable t2 on t2.Import > 1000 and t2.Id = t1.Id
where
t1.DateTime >= '2016-01-13'
and t1.DateTime < dateadd(d, 1,'2016-01-13')
)
select
x.Id,
x.DateTime,
x.Import,
x.Export,
x.Total,
case when x.[QualifyingImports] > 2 then (x.MinQualifyingImport - 1000) * 200 else (x.MinQualifyingImport - 1000) * 100 end as [Rate]
from x
I've put together a Fiddle so you can play around with different values for Id # 4.
I really wanted to make values like the threshold and period into @variables, but that didn't appear to be supported inside the CTE, so I just had to hard-code them.
EDIT
Turns out the CTE is overkill; you can shrink it down to this and use @variables, yay!
declare @period smalldatetime = '2016-01-13'
declare @threshold float = 1000
declare @rate1 float = 100
declare @rate2 float = 200

select
    t1.Id,
    t1.DateTime,
    t1.Import,
    t1.Export,
    t1.Total,
    case
        when count(t2.Import) over (partition by 1) > 2
            then (min(t2.Import) over (partition by 1) - @threshold) * @rate2
        else (min(t2.Import) over (partition by 1) - @threshold) * @rate1
    end as [Rate]
from
    myTable t1
    left join myTable t2 on t2.Import > @threshold and t2.Id = t1.Id
where
    t1.DateTime >= @period
    and t1.DateTime < dateadd(d, 1, @period)
New Fiddle
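The run-detection logic behind this question can be sketched in Python for verification, assuming the readings are already in time order (the constants mirror the question; the run-scanning approach itself is illustrative, not taken from either answer):

```python
THRESHOLD = 1000
RATE_TWO = 100    # rate when exactly two consecutive readings exceed the threshold
RATE_MORE = 200   # rate when more than two consecutive readings exceed it

def charge(imports):
    """Find the longest run of consecutive readings strictly above THRESHOLD,
    then bill (min of that run - THRESHOLD) at the rate matching the run length."""
    best, run = [], []
    for value in imports + [None]:        # None sentinel flushes the final run
        if value is not None and value > THRESHOLD:
            run.append(value)
        else:
            if len(run) > len(best):
                best = run
            run = []
    if len(best) < 2:
        return 0
    rate = RATE_TWO if len(best) == 2 else RATE_MORE
    return (min(best) - THRESHOLD) * rate
```

For the sample table, the run above the threshold is Ids 2 and 3 (2500, 1900), so the charge is (1900 - 1000) * Rate1; changing Id 4's Import to 1200 extends the run to three readings and switches to Rate2.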