Running total: add or subtract depending on a condition - sql

I looked at some SQL Server running total examples, but I can't get this case to work.
I have a table with the columns id, name, type of operation, date, value.
I want to calculate a balance for each record. The balance should be calculated like this:
The starting balance must be 0. If the operation type is IN, the value is added; if it is OUT, the value is subtracted. Each record should take the previous record's balance and then add or subtract its own value depending on the operation type.
The rows should be processed in order of date (not ID).
For example, if the table looks like this:
ID Name Op_Type Date Value
1 box Out 2017-05-13 15
2 table In 2017-04-31 65
3 box2 In 2017-05-31 65
then the result should look like this:
ID Name Op_Type Date Value Balance
2 table In 2017-04-31 65 65
1 box Out 2017-05-13 15 50
3 box2 In 2017-05-31 65 115
The result of this code:
select *,
sum(case when Op_Type = 'Out' then -Value else Value end)Over(Order by [Date]) as Balance
From Yourtable
is:
ID Date Type Value Balance
143 2016-12-31 In 980 664.75
89 2016-12-31 Out 300 664.75
90 2016-12-31 Out 80 664.75
But I expect the following result:
ID Date Type Value Balance
143 2016-12-31 In 980 980
89 2016-12-31 Out 300 680
90 2016-12-31 Out 80 600

The problem with the answer by Prdp is that SUM(...) OVER (ORDER BY ...) uses the RANGE option by default instead of ROWS.
With RANGE, all rows that tie on the ORDER BY value are treated as peers and get the same running total, which is why you see unexpected results when dates are not unique.
To get the results you expect, spell the frame out explicitly:
SELECT
*
,SUM(CASE WHEN Op_Type = 'Out'
THEN -Value ELSE Value END)
OVER(ORDER BY [Date], Op_Type, ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Balance
FROM YourTable
ORDER BY [Date], Op_Type, ID;
I also added Op_Type into the ORDER BY to add positive values first in cases when there are several rows with the same date.
I added ID into the ORDER BY to make results stable in cases when there are several rows with the same date.
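To see the difference between the two frames, here is a minimal sketch that runs both queries against the three tied rows from the question. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server; SQLite has the same default RANGE frame, but the 664.75 in the question came from earlier rows that are not reproduced here, so the tied total is 600 in this sketch.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (ID INTEGER, [Date] TEXT, Op_Type TEXT, Value INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (143, "2016-12-31", "In", 980),
        (89, "2016-12-31", "Out", 300),
        (90, "2016-12-31", "Out", 80),
    ],
)

signed = "CASE WHEN Op_Type = 'Out' THEN -Value ELSE Value END"

# Default frame (RANGE): rows tied on Date are peers and share one total.
range_rows = conn.execute(
    f"SELECT ID, SUM({signed}) OVER (ORDER BY [Date]) FROM YourTable ORDER BY ID"
).fetchall()

# Explicit ROWS frame plus tie-breakers: one running total per row.
rows_rows = conn.execute(
    f"""SELECT ID,
               SUM({signed}) OVER (ORDER BY [Date], Op_Type, ID
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
        FROM YourTable
        ORDER BY [Date], Op_Type, ID"""
).fetchall()

print(range_rows)
print(rows_rows)
```

With the default frame every tied row reports the total of the whole group of peers, while the ROWS frame accumulates row by row in the order given by the tie-breakers.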

Related

Select the rows where the Tier value is different than the previous row. Sorted by Account # and Date

I want to select a value from a table in SQL Server only where there is actually a change from the previous value in the table. Example below:
Assume the below is named Table A:
Account # Tier Date
10000 1 1/1/2020
10000 1 1/2/2020
10000 1 1/3/2020
10000 2 2/1/2020
10000 2 2/2/2020
10000 3 3/1/2020
10000 2 4/1/2020
I want to return the below:
Account # Tier Date
10000 1 1/1/2020
10000 2 2/1/2020
10000 3 3/1/2020
10000 2 4/1/2020
You can use the LAG window function to compare the value in a column with the previous row's value. You need to supply PARTITION BY and ORDER BY arguments so the window function knows which row counts as the previous row for the comparison. In this example, the partition is the Account # column and the order by is the date column. If the value of the tier column differs between the current row and the previous row, or there is no previous row, then the row is included in the output.
;WITH cte AS (
SELECT *, LAG(Tier) OVER (PARTITION BY [Account #] ORDER BY date) LastTier
FROM #t
)
SELECT cte.[Account #],
cte.Tier,
cte.Date
FROM cte
WHERE cte.Tier <> cte.LastTier OR cte.LastTier IS NULL
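A minimal runnable sketch of the same LAG comparison, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server. The column names account, tier and dt are illustrative, and the dates from Table A are rewritten in ISO format so that they sort correctly as text.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (account INTEGER, tier INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (10000, 1, "2020-01-01"),
        (10000, 1, "2020-01-02"),
        (10000, 1, "2020-01-03"),
        (10000, 2, "2020-02-01"),
        (10000, 2, "2020-02-02"),
        (10000, 3, "2020-03-01"),
        (10000, 2, "2020-04-01"),
    ],
)

# Keep a row when its tier differs from the previous row's tier,
# or when there is no previous row (first row of the account).
changes = conn.execute(
    """
    WITH cte AS (
        SELECT account, tier, dt,
               LAG(tier) OVER (PARTITION BY account ORDER BY dt) AS last_tier
        FROM t
    )
    SELECT account, tier, dt
    FROM cte
    WHERE tier <> last_tier OR last_tier IS NULL
    ORDER BY account, dt
    """
).fetchall()

for row in changes:
    print(row)
```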

SQL How to calculate Average time between Order Purchases? (do sql calculations based on next and previous row)

I have a simple table that contains the customer email, their order count (so if this is their 1st order, 3rd, 5th, etc), the date that order was created, the value of that order, and the total order count for that customer.
Here is what my table looks like
Email Order Date Value Total
r2n1w#gmail.com 1 12/1/2016 85 5
r2n1w#gmail.com 2 2/6/2017 125 5
r2n1w#gmail.com 3 2/17/2017 75 5
r2n1w#gmail.com 4 3/2/2017 65 5
r2n1w#gmail.com 5 3/20/2017 130 5
ation#gmail.com 1 2/12/2018 150 1
ylove#gmail.com 1 6/15/2018 36 3
ylove#gmail.com 2 7/16/2018 41 3
ylove#gmail.com 3 1/21/2019 140 3
keria#gmail.com 1 8/10/2018 54 2
keria#gmail.com 2 11/16/2018 65 2
What I want to do is calculate the average time between purchases for each customer. Let's take customer ylove. The first purchase is on 6/15/18. The next one is on 7/16/18, so that's 31 days, and the next purchase is on 1/21/2019, so that is 189 days. The average time between orders would be 110 days.
But I have no idea how to make SQL look at the next row and calculate based on that, and then restart when it reaches a new customer.
Here is my query to get that table:
SELECT
F.CustomerEmail
,F.OrderCountBase
,F.Date_Created
,F.Total
,F.TotalOrdersBase
FROM #FullBase F
ORDER BY f.CustomerEmail
If anyone can give me some suggestions, that would be greatly appreciated.
And then maybe I can calculate value differences (in percentage). So for example, ylove spent $36 on their first order and $41 on their second, which is a 14% increase. Their third order was $140, which is a 241% increase over the second (341% of its value). So on average, this customer increased their order value by about 128%. Unrelated to SQL, but is this the correct way of calculating a metric like this?
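Looking at your sample, you could try taking the difference between the min and max dates and dividing it by the number of gaps (total - 1). Note that total has to appear in the GROUP BY, because it is used outside an aggregate:
select email, datediff(day, min(Order_Date), max(Order_Date))/(total-1) as avg_days
from your_table
group by email, total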
To also handle customers with only one order (where total-1 would be 0):
select email,
case when total-1 > 0 then
datediff(day, min(Order_Date), max(Order_Date))/(total-1)
else datediff(day, min(Order_Date), max(Order_Date)) end as avg_days
from your_table
group by email, total
The simplest formulation is the following; nullif() returns NULL instead of dividing by zero for customers with a single order:
select email,
datediff(day, min(Order_Date), max(Order_Date)) / nullif(total-1, 0) as avg_days
from t
group by email, total;
To see why this works, consider three orders with od1, od2, and od3 as the order dates. The average gap is:
( (od2 - od1) + (od3 - od2) ) / 2
Check the arithmetic:
--> ( od2 - od1 + od3 - od2 ) / 2
--> ( od3 - od1 ) / 2
This pretty obviously generalizes to more orders.
Hence the max() minus min().
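The telescoping argument can be checked on the sample data. This is a minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server: julianday() differences replace datediff(day, ...), the table name orders is illustrative, and COUNT(*) - 1 is used in place of the total column. It computes the average both ways, once from row-to-row gaps with LAG and once from max minus min.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (email TEXT, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("ylove#gmail.com", "2018-06-15"),
        ("ylove#gmail.com", "2018-07-16"),
        ("ylove#gmail.com", "2019-01-21"),
        ("keria#gmail.com", "2018-08-10"),
        ("keria#gmail.com", "2018-11-16"),
        ("ation#gmail.com", "2018-02-12"),
    ],
)

# Long way: gap to the previous order of the same customer, then average.
gap_avg = dict(
    conn.execute(
        """
        SELECT email, AVG(gap)
        FROM (
            SELECT email,
                   julianday(order_date)
                   - julianday(LAG(order_date) OVER (
                         PARTITION BY email ORDER BY order_date)) AS gap
            FROM orders
        )
        GROUP BY email
        """
    ).fetchall()
)

# Shortcut: (max - min) / (number of orders - 1); NULL for a single order.
span_avg = dict(
    conn.execute(
        """
        SELECT email,
               (julianday(MAX(order_date)) - julianday(MIN(order_date)))
               / NULLIF(COUNT(*) - 1, 0)
        FROM orders
        GROUP BY email
        """
    ).fetchall()
)

print(gap_avg)
print(span_avg)
```

Both dictionaries agree, and the single-order customer comes out as None (NULL) rather than raising a division error.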

Get the next and prev row data manipulations in SQL Server

I have data in the format below in a table:
Id EmployeeCode JobNumber TransferNo FromDate Todate
--------------------------------------------------------------------------
1 127 1.0 0 01-Mar-19 10-Mar-19
2 127 1.0 NULL 11-Mar-19 15-Mar-19
3 127 J-1 1 16-Mar-19 NULL
4 136 1.0 0 01-Mar-19 15-Mar-19
5 136 J-1 1 16-Mar-19 20-Mar-19
6 136 1.0 2 21-Mar-19 NULL
And I want result like this:
Id EmployeeCode JobNumber TransferNo FromDate Todate
--------------------------------------------------------------------------
2 127 1.0 NULL 01-Mar-19 15-Mar-19
3 127 J-1 1 16-Mar-19 NULL
4 136 1.0 0 01-Mar-19 15-Mar-19
5 136 J-1 1 16-Mar-19 20-Mar-19
6 136 1.0 2 21-Mar-19 NULL
The idea is:
If the same job appears in consecutive rows, return a single row with the max ID, the min FromDate and the max ToDate. For example, for employee 127 the first and second rows have the same job number and the third row has a different one, so the first and second rows are merged into one row with the minimum FromDate and the maximum ToDate, and the third row is returned as is.
If a job number is different from the next job number, then all rows are returned.
For example, for employee 136 the first job number differs from the second, and the second differs from the third, so all rows are returned.
You can group by JobNumber and EmployeeCode and use the MIN/MAX aggregate functions to get the dates you want.
I doubt you will get a result from simple set-based queries.
So my advice: Declare a cursor on SELECT DISTINCT EmployeeCode .... Within that cursor select all rows with that EmployeeCode. Work in this set to figure out your values and construct a resultset from that.
This is an example of a gaps-and-islands problem. The solution here is to define the "islands" by their starts, so the process is:
determine when a new grouping begins (i.e. the row does not directly follow the previous row)
do a cumulative sum of the starts to get the grouping value
aggregate
This looks like:
select max(id) as id, EmployeeCode, JobNumber,
min(fromdate) as fromdate, max(todate) as todate
from (select t.*,
sum(case when fromdate = dateadd(day, 1, prev_todate) then 0 else 1 end) over
(partition by EmployeeCode, JobNumber order by id
) as grp
from (select t.*,
lag(todate) over (partition by EmployeeCode, JobNumber order by id) as prev_todate
from t
) t
) t
group by grp, EmployeeCode, JobNumber;
It is unclear what the logic is for TransferNo. The simplest solution is just min() or max(), but that will not return NULL.
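The three steps (flag the starts, cumulative sum, aggregate) can be tried on the sample rows. This is a minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server: date(x, '+1 day') replaces dateadd(day, 1, x), the dates are rewritten in ISO format, TransferNo is left out, and the aliases last_id, first_from and last_to are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, EmployeeCode INTEGER, JobNumber TEXT, "
    "fromdate TEXT, todate TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, 127, "1.0", "2019-03-01", "2019-03-10"),
        (2, 127, "1.0", "2019-03-11", "2019-03-15"),
        (3, 127, "J-1", "2019-03-16", None),
        (4, 136, "1.0", "2019-03-01", "2019-03-15"),
        (5, 136, "J-1", "2019-03-16", "2019-03-20"),
        (6, 136, "1.0", "2019-03-21", None),
    ],
)

islands = conn.execute(
    """
    SELECT MAX(id) AS last_id, EmployeeCode, JobNumber,
           MIN(fromdate) AS first_from, MAX(todate) AS last_to
    FROM (
        -- Step 2: the running sum of start flags numbers the islands.
        SELECT s.*,
               SUM(CASE WHEN fromdate = date(prev_todate, '+1 day')
                        THEN 0 ELSE 1 END)
                   OVER (PARTITION BY EmployeeCode, JobNumber ORDER BY id) AS grp
        FROM (
            -- Step 1: fetch the previous row's end date for the same job.
            SELECT t.*,
                   LAG(todate) OVER (PARTITION BY EmployeeCode, JobNumber
                                     ORDER BY id) AS prev_todate
            FROM t
        ) AS s
    ) AS g
    -- Step 3: collapse each island into one row.
    GROUP BY EmployeeCode, JobNumber, grp
    ORDER BY last_id
    """
).fetchall()

for row in islands:
    print(row)
```

Rows 1 and 2 collapse into one island because row 2 starts the day after row 1 ends; rows 4 and 6 share a job number but stay separate because they are not adjacent in time.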

Snapshot Table Status Change

I am trying to write a SQL query (in Amazon Redshift) that counts the number of times a customer goes from not meeting the criteria to meeting the criteria, i.e. when a 1 occurs on the snapshot date right after a 0.
I'm struggling to figure out the logic to do this.
ID Snapshot_date Meets Criteria
55 1/1/2018 0
55 1/5/2018 1
55 1/10/2018 1
55 1/15/2018 1
55 1/20/2018 0
55 1/25/2018 1
Use lag to get the previous value, check for the conditions and count. The inner query must also select meets_criteria so that the outer WHERE clause can reference it.
select id,count(*)
from (select id,snapshot_date,meets_criteria
,lag(meets_criteria,1) over(partition by id order by snapshot_date) as prev_m_c
from tbl
) t
where prev_m_c = 0 and meets_criteria = 1
group by id
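A minimal runnable sketch of this count on the sample rows, assuming SQLite via Python's sqlite3 as a stand-in for Redshift, with the snapshot dates rewritten in ISO format so they sort correctly as text.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER, snapshot_date TEXT, meets_criteria INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (55, "2018-01-01", 0),
        (55, "2018-01-05", 1),
        (55, "2018-01-10", 1),
        (55, "2018-01-15", 1),
        (55, "2018-01-20", 0),
        (55, "2018-01-25", 1),
    ],
)

# Count rows where the previous snapshot was 0 and the current one is 1.
transitions = conn.execute(
    """
    SELECT id, COUNT(*)
    FROM (
        SELECT id, meets_criteria,
               LAG(meets_criteria) OVER (PARTITION BY id
                                         ORDER BY snapshot_date) AS prev_m_c
        FROM tbl
    ) AS t
    WHERE prev_m_c = 0 AND meets_criteria = 1
    GROUP BY id
    """
).fetchall()

print(transitions)
```

Customer 55 switches from 0 to 1 twice (on 1/5 and on 1/25), so the count is 2. A customer who never makes the switch does not appear in the output at all.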

How to select the first row that met condition

I have the following View in PostgreSQL:
idshipment idorder quantity_order date quantity_in_shipment percent_sent
50 1 1020 1.1.16 432 42
51 1 1020 17.1.16 299 71
51 1 1020 20.1.16 144 85
51 1 1020 45.1.16 145 100
52 2 1 3.1.17 5 100
This View shows shipments per order.
For example:
idorder=1 was sent in 4 shipments:
quantity in the first shipment is 432, which means 42% of the order was sent
quantity in the second shipment is 299, which means 71% of the order was sent
quantity in the third shipment is 144, which means 85% of the order was sent
quantity in the fourth shipment is 145, which means 100% of the order was sent
I need a query that shows, for each order, the first date on which more than 75% of the order had been sent, so that each order appears in exactly one row.
For the above data I should see:
idorder date
1 20.1.16 (because 85% is the first time it is above 75%)
2 3.1.17 (because 100% is the first time it is above 75%)
How can I do that?
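You can use distinct on, keyed on idorder so that each order yields one row. Because percent_sent is cumulative, the row with the lowest qualifying percent_sent is also the earliest one:
select distinct on (t.idorder) t.*
from t
where t.percent_sent >= 75
order by t.idorder, t.percent_sent asc;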
Try something like this:
SELECT idorder, MIN("date") AS "date"
FROM your_view
WHERE percent_sent >= 75
GROUP BY idorder
Use GROUP BY to get only one record per idorder, and MIN() to pick the earliest date among the qualifying rows.
I created a table called shipment that holds the data you provided and executed this query:
SELECT s.idorder, MIN(s.date) as date
FROM shipment s
WHERE percent_sent >= 75
GROUP BY s.idorder
result:
idorder date
----------- ----------
1 2016-01-20
2 2017-03-01
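If the whole first qualifying row is needed rather than just the date, a ROW_NUMBER() filter is a portable alternative to DISTINCT ON. This is a minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for PostgreSQL. The column name ship_date is illustrative, the dates are rewritten in ISO format reading the sample as day.month.year, and the fourth shipment's date (not a valid calendar date in the sample) is replaced by a made-up 2016-01-25.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipment (idorder INTEGER, ship_date TEXT, percent_sent INTEGER)"
)
conn.executemany(
    "INSERT INTO shipment VALUES (?, ?, ?)",
    [
        (1, "2016-01-01", 42),
        (1, "2016-01-17", 71),
        (1, "2016-01-20", 85),
        (1, "2016-01-25", 100),  # hypothetical date, see the note above
        (2, "2017-01-03", 100),
    ],
)

# Number the qualifying rows per order by date and keep the first one.
first_hits = conn.execute(
    """
    SELECT idorder, ship_date, percent_sent
    FROM (
        SELECT idorder, ship_date, percent_sent,
               ROW_NUMBER() OVER (PARTITION BY idorder
                                  ORDER BY ship_date) AS rn
        FROM shipment
        WHERE percent_sent >= 75
    ) AS ranked
    WHERE rn = 1
    ORDER BY idorder
    """
).fetchall()

for row in first_hits:
    print(row)
```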