Monthly cumulative differences calculation in postgres - sql

I am trying to write an SQL script in Postgres that finds the cumulative difference between total price and repayment amount. I have two tables, as shown below. I have gone through the solution provided here, but it doesn't address my question.
item table
item_id cost_price date_purchase
1 200 01-06-2019
2 300 10-07-2019
3 250 15-08-2019
4 400 10-09-2019
payment table
item id payment payment date
1 50 01-06-2019
1 40 20-06-2019
2 30 15-07-2019
1 60 17-07-2019
2 100 15-08-2019
3 90 17-08-2019
4 300 20-09-2019
1 50 25-09-2019
Expected result
Month Remaining amount
06_2019 (200 - 90) = 110
07_2019 (200+ 300) - (90 + 30 + 60) = 320
08_2019 (200+ 300 + 250) - (90 + 90 + 100 + 90) = 380
09_2019 (200 + 300 + 250 + 400) - (90 + 90 + 190 + 300 + 50) = 430

You can do that with SUM used as a window function with ORDER BY month. But please give us the DDL of your tables so we can help further...

Since your example ignores the item_id in the results, you can combine purchases and payments into a simple ledger and then use a window function to get a running sum:
with ledger as (
select to_char(date_purchase, 'YYYY-MM') as xact_month, cost_price as amount from item
union all
select to_char(payment_date, 'YYYY-MM'), payment * -1 from payment
)
select distinct xact_month as month,
sum(amount) over (order by xact_month) as remaining_amount
from ledger;
Working fiddle.
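As a cross-check of the ledger idea above, here is a minimal Python sketch that replays the sample rows from the question. The function name remaining_by_month and the tuple layout are illustrative assumptions, not part of the original answer; it only mirrors the union-all-then-running-sum logic.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# Sample rows from the question: (item_id, cost_price, date_purchase)
items = [
    (1, 200, date(2019, 6, 1)),
    (2, 300, date(2019, 7, 10)),
    (3, 250, date(2019, 8, 15)),
    (4, 400, date(2019, 9, 10)),
]
# Sample rows from the question: (item_id, payment, payment_date)
payments = [
    (1, 50, date(2019, 6, 1)),
    (1, 40, date(2019, 6, 20)),
    (2, 30, date(2019, 7, 15)),
    (1, 60, date(2019, 7, 17)),
    (2, 100, date(2019, 8, 15)),
    (3, 90, date(2019, 8, 17)),
    (4, 300, date(2019, 9, 20)),
    (1, 50, date(2019, 9, 25)),
]


def remaining_by_month(items, payments):
    """Hypothetical helper: net each month (purchases minus payments), then run a cumulative sum."""
    net = defaultdict(int)
    for _, cost, purchased in items:
        net[purchased.strftime("%Y-%m")] += cost   # purchases add to the ledger
    for _, paid, paid_on in payments:
        net[paid_on.strftime("%Y-%m")] -= paid     # payments subtract from it
    months = sorted(net)                           # 'YYYY-MM' sorts chronologically
    running = accumulate(net[m] for m in months)
    return list(zip(months, running))


result = remaining_by_month(items, payments)
for month, remaining in result:
    print(month, remaining)
```

The 'YYYY-MM' key is used for the same reason as in the query: it sorts in date order, which a 'MM_YYYY' label would not do across years.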

This is it:
select distinct date_trunc('month',m.date_1),m.num_1 from (select to_date(b."payment date",'DD-MM-YYYY') as date_1,
sum(a."cost_price"+coalesce((select sum("cost_price") from item_table c where
date_trunc('month',to_date(a."date_purchase",'DD-MM-YYYY'))>date_trunc('month',to_date(c."date_purchase",'DD-MM-YYYY'))
),0)-(coalesce((select sum("payment") from payment_table c where
date_trunc('month',to_date(a."date_purchase",'DD-MM-YYYY'))>=date_trunc('month',to_date(c."payment date",'DD-MM-YYYY'))
),0))) as num_1 from item_table a,payment_table b
where date_trunc('month',to_date(a."date_purchase",'DD-MM-YYYY'))
=date_trunc('month',to_date(b."payment date",'DD-MM-YYYY'))
group by 1 order by 1)m;
please check at http://sqlfiddle.com/#!17/428874/37
possibly give it a green tick and an upvote..

Related

Get the total Count and Sum of total Amount from down line record using SQL Server CTE

How can we Count and Sum all down-line rows to their up-line using SQL.
Current data:
ST_ID UPLINE AMOUNT
---------------------------
44930 52001 400
52016 52001 300
52001 9024 432
76985 9024 100
12123 35119 234
12642 35119 213
12332 23141 654
In the table above, upline 52001 has two ST_IDs with amounts 400 and 300, for a sum of 700. 52001 also appears as an ST_ID itself, with amount 432, so the total amount for 52001 will be 400 + 300 + 432 = 1132. Likewise, upline 9024 has ST_ID 52001, which contributes 432 + 700 = 1132, plus 100 from ST_ID 76985, for a total of 1232.
Expected Output:
UPLINE AMOUNT CNT
------------------------
52001 1132 2 (400 +300 + 432 | 1+1+1)
9024 1232 4 (700 + 432 + 100 | 2+1+1 = 4)
35119 447 2 (234 + 213 | 1+1 = 2)
23141 654 1
I thought of a recursive CTE but could not work out the logic. Does anyone have an idea how to achieve this? I am using SQL Server 2016.
As I understood, the Upline column is connected to ST_ID column, and you want to find the sum and count grouped by (Upline + all the matched values from ST_ID). i.e. Upline = 9024 is connected to ST_ID = 52001, so the sum for Upline = 9024 will be (432 + 100 from 9024 plus 300 + 400 from 52001).
You could use a recursive CTE as the following:
With CTE As
(
Select ST_ID, Upline, Amount From table_name
Union All
Select T.ST_ID, T.Upline, C.Amount
From table_name T Join CTE C
On C.Upline = T.ST_ID
)
Select Upline,
Sum(Amount) As Amount,
Count(*) As Cnt
From CTE
Group By Upline
See a demo.
Update according to the new requirement (in addition to the sum from the previous query, add the sum of values where ST_ID = Upline):
With CTE As
(
Select * From table_name
Union All
Select T.ST_ID, T.Upline, C.AMOUNT
From table_name T Join CTE C
On C.Upline = T.ST_ID
)
Select C.Upline,
Sum(C.Amount) + ISNULL(Sum(Distinct T.Amount), 0) As Amount,
Count(*) + Count(Distinct T.Amount) As Cnt
From CTE C Left Join table_name T
On C.Upline = T.ST_ID
Group By C.Upline
See demo.
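To sanity-check the roll-up against the expected output in the question, here is a small Python sketch. It is not a translation of the CTE; it assumes each ST_ID appears once and that the hierarchy has no cycles, and the name rollup is made up for illustration. The count follows the CNT column of the expected output (number of down-line rows), which differs from the Cnt expression in the updated query.

```python
from collections import defaultdict

# Sample rows from the question: (ST_ID, UPLINE, AMOUNT)
rows = [
    (44930, 52001, 400),
    (52016, 52001, 300),
    (52001, 9024, 432),
    (76985, 9024, 100),
    (12123, 35119, 234),
    (12642, 35119, 213),
    (12332, 23141, 654),
]


def rollup(rows):
    """Hypothetical helper: per upline, sum all down-line amounts plus the upline's own row."""
    children = defaultdict(list)
    own_amount = {}
    for st_id, upline, amount in rows:
        children[upline].append(st_id)
        own_amount[st_id] = amount   # assumes ST_ID is unique

    def descendants(node):
        # Walk every direct and indirect down-line member (assumes no cycles).
        for child in children.get(node, []):
            yield child
            yield from descendants(child)

    totals = {}
    for upline in children:
        below = list(descendants(upline))
        amount = sum(own_amount[d] for d in below) + own_amount.get(upline, 0)
        totals[upline] = (amount, len(below))
    return totals


totals = rollup(rows)
for upline, (amount, cnt) in totals.items():
    print(upline, amount, cnt)
```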

SQL update statement to sum column in one table, then add the total to a different column/table

Evening all, hoping for some pointers with an SQL Server query if possible.
I have two tables in a database, example as follows:
PostedTran
PostedTranID AccountID PeriodID Value TransactionDate
1 100 120 100 2019-01-01
2 100 120 200 2020-01-01
3 100 130 300 2021-01-01
4 101 120 400 2020-01-01
5 101 130 500 2021-01-01
PeriodValue
PeriodValueID AccountID PeriodID ActualValue
10 100 120 500
11 101 120 600
I have a mismatch in the two tables, and I'm failing miserably in my attempts. From the PostedTran table, I'm trying to select all transaction lines dated before 2021-01-01, then sum the Value for each AccountID from the results. I then need to add that value to the existing ActualValue in the PeriodValue table.
So, in the above example, the ActualValue on PeriodValueID 10 will update to 800, and 11 to 1000. The PeriodID in this example is constant and will always be 120.
Thanks in advance for any help.
Since the RDBMS is not mentioned, pseudo-SQL looks like:
with DataSum as
(
select AccountID, PeriodID, sum(Value) as TotalValue
from PostedTran
where TransactionDate<'1/1/2021'
group by AccountID, PeriodID
)
update PeriodValue set ActualValue = ActualValue + ds.TotalValue
from PeriodValue pv inner join DataSum ds
on pv.accountid=ds.accountid and pv.periodid=ds.periodid
The following should do what you ask. I haven't included PeriodId in the correlation because you did not specify it in your description, but you can add it if it's required.
update pv set pv.ActualValue=pv.ActualValue + t.Value
from PeriodValue pv
cross apply (
select Sum(value) value
from PostedTran pt
where pt.AccountId=pv.AccountId and pt.TransactionDate <'20210101'
)t
where t.Value is not null -- skip accounts with no matching rows, where Sum() returns NULL
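As a check of the expected figures (800 and 1000), here is a short Python sketch of the same aggregate-then-add step on the sample rows. Like the query just above, it correlates on AccountID only; the function name apply_posted_totals and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Sample rows: (PostedTranID, AccountID, PeriodID, Value, TransactionDate)
posted_tran = [
    (1, 100, 120, 100, date(2019, 1, 1)),
    (2, 100, 120, 200, date(2020, 1, 1)),
    (3, 100, 130, 300, date(2021, 1, 1)),
    (4, 101, 120, 400, date(2020, 1, 1)),
    (5, 101, 130, 500, date(2021, 1, 1)),
]
# Sample rows keyed by PeriodValueID
period_value = {
    10: {"AccountID": 100, "PeriodID": 120, "ActualValue": 500},
    11: {"AccountID": 101, "PeriodID": 120, "ActualValue": 600},
}


def apply_posted_totals(posted_tran, period_value, cutoff):
    """Hypothetical helper: add each account's pre-cutoff transaction total to ActualValue."""
    totals = defaultdict(int)
    for _, account_id, _, value, tran_date in posted_tran:
        if tran_date < cutoff:
            totals[account_id] += value
    # Accounts without qualifying transactions keep their value (total defaults to 0).
    return {
        pv_id: row["ActualValue"] + totals.get(row["AccountID"], 0)
        for pv_id, row in period_value.items()
    }


updated = apply_posted_totals(posted_tran, period_value, date(2021, 1, 1))
print(updated)
```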

SQL How to calculate Average time between Order Purchases? (do sql calculations based on next and previous row)

I have a simple table that contains the customer email, their order count (so if this is their 1st order, 3rd, 5th, etc), the date that order was created, the value of that order, and the total order count for that customer.
Here is what my table looks like
Email Order Date Value Total
r2n1w#gmail.com 1 12/1/2016 85 5
r2n1w#gmail.com 2 2/6/2017 125 5
r2n1w#gmail.com 3 2/17/2017 75 5
r2n1w#gmail.com 4 3/2/2017 65 5
r2n1w#gmail.com 5 3/20/2017 130 5
ation#gmail.com 1 2/12/2018 150 1
ylove#gmail.com 1 6/15/2018 36 3
ylove#gmail.com 2 7/16/2018 41 3
ylove#gmail.com 3 1/21/2019 140 3
keria#gmail.com 1 8/10/2018 54 2
keria#gmail.com 2 11/16/2018 65 2
What I want to do is calculate the average time between purchases for each customer. So let's take customer ylove. The first purchase is on 6/15/18. The next one is 7/16/18, so that's 31 days, and the next purchase is on 1/21/2019, so that is 189 days. The average time between orders would be 110 days.
But I have no idea how to make SQL look at the next row and calculate based on that, and then restart when it reaches a new customer.
Here is my query to get that table:
SELECT
F.CustomerEmail
,F.OrderCountBase
,F.Date_Created
,F.Total
,F.TotalOrdersBase
FROM #FullBase F
ORDER BY f.CustomerEmail
If anyone can give me some suggestions, that would be greatly appreciated.
And then maybe I can calculate value differences (in percentage). So for example, ylove spent $36 on their first order and $41 on their second, which is a 13% increase. Then their third order was $140, which is a 341% increase. So on average, this customer increased their purchase order value by 177%. Unrelated to SQL, but is this the correct way of calculating a metric like this?
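A quick check of the percentages in the question, as a minimal Python sketch (the function name pct_increases is made up for illustration). It suggests that 140 / 41 is a ratio of about 3.41, which corresponds to an increase of roughly 241% rather than 341%, and that the first step is closer to 14% than 13%.

```python
def pct_increases(values):
    """Hypothetical helper: percentage change from each order value to the next."""
    return [(curr - prev) / prev * 100 for prev, curr in zip(values, values[1:])]


ylove_orders = [36, 41, 140]  # order values from the sample table
increases = pct_increases(ylove_orders)
average_increase = sum(increases) / len(increases)

print([round(p, 1) for p in increases])
print(round(average_increase, 1))
```

With these figures the simple arithmetic mean of the increases comes out near 128% rather than 177%. Whether an arithmetic mean is the right summary for this metric is a separate question that the sketch does not settle.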
Looking at your sample, you could try the difference between the min and max dates divided by total minus one:
select email, datediff(day, min(Order_Date), max(Order_Date))/(total-1) as avg_days
from your_table
group by email, total
To also handle customers with only one order:
select email,
case when total-1 > 0 then
datediff(day, min(Order_Date), max(Order_Date))/(total-1)
else datediff(day, min(Order_Date), max(Order_Date)) end as avg_days
from your_table
group by email, total
The simplest formulation is:
select email,
datediff(day, min(Order_Date), max(Order_Date)) / nullif(total-1, 0) as avg_days
from t
group by email, total;
You can see this is the case. Consider three orders with od1, od2, and od3 as the order dates. The average is:
( (od2 - od1) + (od3 - od2) ) / 2
Check the arithmetic:
--> ( od2 - od1 + od3 - od2 ) / 2
--> ( od3 - od1 ) / 2
This pretty obviously generalizes to more orders.
Hence the max() minus min().
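The telescoping argument can be checked on the sample data with a short Python sketch. The function name average_gap_days is an assumption for illustration; it computes the average both ways and confirms they agree.

```python
from datetime import date


def average_gap_days(order_dates):
    """Hypothetical helper: average days between consecutive orders, or None for fewer than two."""
    if len(order_dates) < 2:
        return None  # plays the role of nullif(total-1, 0) in the query
    ordered = sorted(order_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    pairwise = sum(gaps) / len(gaps)
    # The gaps telescope, so only the first and last dates matter.
    shortcut = (ordered[-1] - ordered[0]).days / (len(ordered) - 1)
    assert pairwise == shortcut
    return shortcut


ylove = [date(2018, 6, 15), date(2018, 7, 16), date(2019, 1, 21)]
ylove_avg = average_gap_days(ylove)
single_order_avg = average_gap_days([date(2018, 2, 12)])
print(ylove_avg, single_order_avg)
```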

Creating averages and detecting increases higher than x% in SQL

I want to create the following in SQL Server 2012: (I've found that the best way to explain it is with tables).
I have the date of purchase, the customer id and the price the customer paid in a table like this:
DateOnly Customer Price
2012/01/01 1 50
2012/01/01 2 60
2012/01/01 3 80
2012/01/02 4 40
2012/01/02 5 30
2012/01/02 1 55
2012/01/03 6 80
2012/01/04 2 90
What I need to do then is to keep a register of the average price paid by a customer. Which would be as follows:
DateOnly Customer Price AveragePrice
2012/01/01 1 50 50
2012/01/01 2 60 60
2012/01/01 3 80 80
2012/01/02 4 40 40
2012/01/02 5 30 30
2012/01/02 1 55 52.5
2012/01/03 6 80 80
2012/01/04 2 90 75
And finally, I need to select the rows which have caused an increase higher than 10% in the averageprice paid by a customer.
In this case, the second order of customer 2 should be the only one to be selected, as it introduced an increase higher than 10% in the average price paid by this customer.
Hence, the resulting table should be as follows:
DateOnly Customer Price AveragePrice
2012/01/04 2 90 75
Thanks in advance for your help.
The first CTE prepares your data: it assigns a row number to each customer's purchases, for use in the later join.
The second CTE is recursive and does all the work. The anchor part gets each customer's first purchase, and the recursive part joins on the next purchase and calculates TotalPrice, AveragePrice and Increase.
At the end, just select the rows with an Increase of more than 10%.
WITH CTE_Prep AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY DateOnly) RN
FROM Table1
)
,CTE_Calc AS
(
SELECT *, Price AS TotalPrice, CAST(Price AS DECIMAL(18,2)) AS AveragePrice, CAST (0 AS DECIMAL(18,2)) AS Increase
FROM CTE_Prep WHERE RN = 1
UNION ALL
SELECT p.*
, c.TotalPrice + p.Price AS TotalPrice
, CAST(CAST(c.TotalPrice + p.Price AS DECIMAL(18,2)) / p.RN AS DECIMAL(18,2)) AS AveragePrice
, CAST(CAST(CAST(c.TotalPrice + p.Price AS DECIMAL(18,2)) / p.RN AS DECIMAL(18,2)) / c.AveragePrice AS DECIMAL(18,2)) AS Increase
FROM CTE_Calc c
INNER JOIN CTE_Prep p ON c.RN + 1 = p.RN AND p.Customer = c.Customer
)
SELECT * FROM CTE_Calc
WHERE Increase > 1.10
SQLFiddle DEMO
Interesting problem.
You can get the average without the current purchase by subtracting the price on each row from the sum of all prices for that customer. This observation -- in combination with window functions -- gives the information needed to get the rows you are looking for:
select *
from (select t.*,
avg(price) over (partition by customer) as avgprice,
sum(price) over (partition by customer) as sumprice,
count(price) over (partition by customer) as cntprice
from table1 t
) t
-- keep rows where the average with this purchase exceeds the average without it by more than 10%
where avgprice > 1.1 * (case when cntprice > 1
then (sumprice - price) / (cntprice - 1)
end);
Note the use of the case in the where clause. There is a potential divide by zero problem. SQL Server guarantees that the when part of the case will be evaluated before the then part (in this situation). So this is safe from that problem. Note also that avgprice is the customer's overall average, so this matches the running average in the question only for a customer's most recent purchase.
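For comparison, the running-average rule from the question can be sketched in a few lines of Python. This is an illustration under stated assumptions, not either answer's method: the name big_average_jumps is made up, dates are kept as 'YYYY/MM/DD' strings (which sort chronologically), and a customer's first purchase is never flagged.

```python
from collections import defaultdict

# Sample rows from the question: (DateOnly, Customer, Price)
purchases = [
    ("2012/01/01", 1, 50),
    ("2012/01/01", 2, 60),
    ("2012/01/01", 3, 80),
    ("2012/01/02", 4, 40),
    ("2012/01/02", 5, 30),
    ("2012/01/02", 1, 55),
    ("2012/01/03", 6, 80),
    ("2012/01/04", 2, 90),
]


def big_average_jumps(purchases, threshold=0.10):
    """Hypothetical helper: rows that raise a customer's running average by more than threshold."""
    running = defaultdict(lambda: (0, 0))  # customer -> (sum so far, count so far)
    flagged = []
    for day, customer, price in sorted(purchases, key=lambda p: p[0]):
        total, count = running[customer]
        new_avg = (total + price) / (count + 1)
        # Compare the new running average with the previous one, if there is one.
        if count > 0 and new_avg > (total / count) * (1 + threshold):
            flagged.append((day, customer, price, new_avg))
        running[customer] = (total + price, count + 1)
    return flagged


flagged = big_average_jumps(purchases)
print(flagged)
```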

sql query to find sum of all rows and count of duplicates

If data is in the following format:
SID TID Tdatetime QID QTotal
----------------------------------------
100 1 01/12/97 9:00AM 66 110
100 1 01/12/97 9:00AM 66 110
100 1 01/12/97 10:00AM 67 110
100 2 01/19/97 9:00AM 66 .
100 2 01/19/97 9:00AM 66 110
100 2 01/19/97 10:00AM 66 110
100 3 01/26/97 9:00AM 68 120
100 3 01/26/97 9:00AM 68 120
110 1 02/03/97 10:00AM 68 110
110 3 02/12/97 9:00AM 64 115
110 3 02/12/97 9:00AM 64 115
120 1 04/05/97 9:00AM 66 105
120 1 04/05/97 10:00AM 66 105
I would like to be able to write a query to sum the QTotal column for all rows and find the count of duplicate rows for the Tdatetime column.
The output would look like:
Year Total Count
97 | 1340 | 4
The third column in the result does not include the count of distinct rows in the table. And the output is grouped by the year in the TDateTime column.
The following query may help:
SELECT
'YEAR ' + CAST(sub.theYear AS VARCHAR(4)),
COUNT(sub.C),
(SELECT SUM(QTotal) FROM MyTable WHERE YEAR(Tdatetime) = sub.theYear) AS total
FROM
(SELECT
YEAR(Tdatetime) AS theYear,
COUNT(Tdatetime) AS C
FROM MyTable
GROUP BY Tdatetime, YEAR(Tdatetime)
HAVING COUNT(Tdatetime) >= 2) AS sub
GROUP BY sub.theYear
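To confirm the expected figures (total 1340, 4 duplicated timestamps), here is a minimal Python sketch over the sample rows. The function name summarize_by_year is an assumption; the missing QTotal shown as "." is represented as None and treated as 0, and the two-digit year is taken straight from the date text.

```python
from collections import Counter

# Sample rows from the question: (SID, TID, Tdatetime, QID, QTotal)
rows = [
    (100, 1, "01/12/97 9:00AM", 66, 110),
    (100, 1, "01/12/97 9:00AM", 66, 110),
    (100, 1, "01/12/97 10:00AM", 67, 110),
    (100, 2, "01/19/97 9:00AM", 66, None),  # "." in the sample: no value
    (100, 2, "01/19/97 9:00AM", 66, 110),
    (100, 2, "01/19/97 10:00AM", 66, 110),
    (100, 3, "01/26/97 9:00AM", 68, 120),
    (100, 3, "01/26/97 9:00AM", 68, 120),
    (110, 1, "02/03/97 10:00AM", 68, 110),
    (110, 3, "02/12/97 9:00AM", 64, 115),
    (110, 3, "02/12/97 9:00AM", 64, 115),
    (120, 1, "04/05/97 9:00AM", 66, 105),
    (120, 1, "04/05/97 10:00AM", 66, 105),
]


def summarize_by_year(rows):
    """Hypothetical helper: per year, total QTotal and the number of timestamps seen more than once."""
    totals = Counter()
    seen = Counter()
    for _sid, _tid, stamp, _qid, qtotal in rows:
        year = stamp.split()[0].split("/")[-1]  # "01/12/97 9:00AM" -> "97"
        totals[year] += qtotal or 0
        seen[(year, stamp)] += 1
    dup_groups = Counter(year for (year, _), n in seen.items() if n >= 2)
    return {year: (totals[year], dup_groups[year]) for year in totals}


summary = summarize_by_year(rows)
print(summary)
```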
This will work if you really want to group by the tDateTime column:
SELECT tDateTime, SUM(QTotal), COUNT(*)
FROM Table
GROUP BY tDateTime
HAVING COUNT(*) > 1
But your results look like you want to group by the Year in the tDateTime column. Is this correct?
If so try this:
SELECT DISTINCT YEAR (tDateTime), SUM(QTotal), Count(distinct tDateTime)
FROM Table
GROUP BY YEAR (tDateTime)
HAVING Count(distinct tDateTime) > 1
You must SELECT from this table, GROUPing by QTotal, using COUNT(sub-SELECT from this table WHERE QTotal is the same). If only I had time I would write you the SQL statement, but it'll take some minutes.
Something like:
select Year(Tdatetime) ,sum(QTotal), count(1) from table group by year(Tdatetime )
or full date
select Tdatetime ,sum(QTotal), count(1) from table group by Tdatetime
Or your ugly syntax ( : ) )
select 'Year ' + cast(Year(tdatetime) as varchar(4))
+ '|' + cast(sum(QTotal) as varchar(31))
+ '|' + cast(count(1) as varchar(31))
from table group by year(Tdatetime )
Or do you want just the year? Sum all columns? Or just by year?
SELECT
'Year ' + CAST(YEAR(Tdatetime) AS VARCHAR(4)),
SUM ( QTotal ),
(SELECT COUNT(*) FROM (
SELECT Tdatetime FROM MyTable GROUP BY Tdatetime
HAVING COUNT(QID) > 1) C)
FROM
MyTable t
GROUP BY
YEAR(Tdatetime)
This is the first time I have asked a question on stackoverflow. It looks like I have lost my original ID info. I had to register to login and add comments to the question I posted.
To answer OMG Ponies question, this is a SQL Server 2008 database.
@Abe Miessler, the row with SID 120 does not contain duplicates. The first row for SID 120 shows 9:00AM in the datetime column, and the second row shows 10:00AM.
@Zafer, your query is the accepted answer. I made a few minor tweaks to get it to work. Thanks.
Thanks due to Abe Miessler and the others for your help.