Running Total on date column - sql

I have the following data in my table:
id invoice_id date ammount
1 1 2012-01-01 100.00
20 1 2012-01-31 50.00
470 1 2012-01-15 300.00
Now I need to calculate a running total for an invoice over some period. So the output for this data sample should look like this:
id invoice_id date ammount running_total
1 1 2012-01-01 100.00 100.00
470 1 2012-01-15 300.00 400.00
20 1 2012-01-31 50.00 450.00
I tried the samples at http://www.sqlusa.com/bestpractices/runningtotal/ and several others. The problem is that ids do not follow date order: I could have entries like id 20 with date 2012-01-31 and id 120 with date 2012-01-01, so I can't compute NO = ROW_NUMBER() OVER (ORDER BY date) in the first select and then compare ID < NO in the second select to calculate the running total.

DECLARE @DateStart DATE='2012-01-01';
WITH cte
AS (SELECT id = Row_number() OVER(ORDER BY [date]),
DATE,
myid = id,
invoice_id,
orderdate = CONVERT(DATE, DATE),
ammount
FROM [Table_2]
WHERE DATE >= @DateStart)
SELECT myid,
invoice_id,
DATE,
ammount,
runningtotal = (SELECT SUM(ammount)
FROM cte
WHERE id <= a.id)
FROM cte AS a
ORDER BY id
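As an alternative to the correlated subquery above, a windowed SUM can produce the same running total in a single pass. The following is only a sketch: it uses Python's built-in sqlite3 module so it can be run anywhere, the table name `payments` is made up for the example, and the `?` placeholder is SQLite's parameter style. The `SUM(...) OVER (... ORDER BY ...)` syntax itself is standard SQL and is also available in SQL Server 2012 and later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "payments" is a hypothetical table name; the columns follow the question.
conn.execute(
    "CREATE TABLE payments (id INTEGER, invoice_id INTEGER, date TEXT, ammount REAL)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-01-01", 100.0),
        (20, 1, "2012-01-31", 50.0),
        (470, 1, "2012-01-15", 300.0),
    ],
)

running_total_sql = """
SELECT id, invoice_id, date, ammount,
       SUM(ammount) OVER (
           PARTITION BY invoice_id          -- restart the total for each invoice
           ORDER BY date, id                -- id breaks ties between equal dates
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM payments
WHERE date >= ?
ORDER BY invoice_id, date, id
"""

rows = conn.execute(running_total_sql, ("2012-01-01",)).fetchall()
for row in rows:
    print(row)
```

Because the window is ordered by date rather than by id, it does not matter that id 20 has a later date than id 470.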

Related

SQL query to generate summary file based on change in price per item

I need help writing a query to generate a summary file of quantity purchased per item and per cost from a purchase history file. To run the query, the ORDER BY would be ITEM_NO, PO_DATE, and COST.
SAMPLE DATA - PURCHASE HISTORY
OUTPUT FILE - SUMMARY
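We can flag each change in cost with lag(), turn the flags into a running group number, and then group by item_no, cost and that group number to get all the info we need.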
select item_no
,cost
,min(po_date) as start_date
,max(po_date) as end_date
,sum(qty) as qty
from (
select *
,count(chng) over(partition by item_no order by po_date) as grp
from (
select *
,case when lag(cost) over(partition by item_no order by po_date) <> cost then 1 end as chng
from t
) t
) t
group by item_no, cost, grp
order by item_no, start_date
item_no | cost | start_date | end_date | qty
12345 | 1.25 | 2021-01-02 00:00:00 | 2021-01-04 00:00:00 | 150
12345 | 2.00 | 2021-02-01 00:00:00 | 2021-02-03 00:00:00 | 60
78945 | 5.25 | 2021-06-10 00:00:00 | 2021-06-12 00:00:00 | 90
78945 | 4.50 | 2021-10-18 00:00:00 | 2021-10-19 00:00:00 | 150

Group by date and find median of processing time

I select an input date and an output date from a database and use a formula to compute the processing time. Now I would like to group the values by input date (date of receipt) and output the median processing time for each input date. Something like this:
The data I select:
input date | output date | processing time
2022-01-03 | 2022-01-03 | 0
2022-01-03 | 2022-01-06 | 3
2022-01-03 | 2022-01-11 | 8
2022-01-05 | 2022-01-10 | 5
2022-01-05 | 2022-01-15 | 10
The output I want:
input date | processing time
2022-01-03 | 3
2022-01-05 | 7.5
My SQL Code:
SELECT [received_date]
,CONVERT(date, [exported_on])
,DATEDIFF(day, [received_date], [exported_on]) AS processing_time
FROM [request] WHERE YEAR (received_date) = 2022
GROUP BY received_date, [exported_on]
ORDER BY received_date
How can I do this? Do I need a temp table to do this, or can I modify my query?
You could try using PERCENTILE_CONT
with cte as (
select input_date,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY processing_time) OVER(PARTITION BY input_date) as Median_Process_Time
FROM tableA
)
SELECT *
FROM cte
GROUP BY input_date, Median_Process_Time
You can also check out the discussion here: How to find the SQL medians for a grouping
Here is my solution. Thank you for your help.
SET NOCOUNT ON;
DECLARE @working TABLE(entry_date date, exit_date date, work_time int)
INSERT INTO @working
SELECT [received] AS date_of_entry
,CONVERT(date, [exported]) AS date_of_exit
,DATEDIFF(day, [received], [exported]) AS processing_time
FROM [zsdt].[dbo].[antrag] WHERE YEAR([received]) = 2022 AND scanner_name IS NOT NULL AND exportiert_am IS NOT NULL AND NOT scanner_name = 'AP99'
GROUP BY [received], [exported]
ORDER BY [received] ASC
;WITH CTE AS
( SELECT entry_date,
work_time,
[half1] = NTILE(2) OVER(PARTITION BY entry_date ORDER BY work_time),
[half2] = NTILE(2) OVER(PARTITION BY entry_date ORDER BY work_time DESC)
FROM @working
WHERE work_time IS NOT NULL
)
WHERE work_time IS NOT NULL
)
SELECT entry_date,
(MAX(CASE WHEN Half1 = 1 THEN work_time END) +
MIN(CASE WHEN Half2 = 1 THEN work_time END)) / 2.0
FROM CTE
GROUP BY entry_date;
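For comparison, a median per group can also be computed with just ROW_NUMBER and COUNT: number the rows within each date, then average the one or two middle rows. The sketch below runs the idea against the sample data from the question using Python's sqlite3 module; the table name `processing` and the column names are assumptions made for the example, and SQLite is used only because it has no PERCENTILE_CONT and is easy to run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the sample rows from the question.
conn.execute(
    "CREATE TABLE processing (input_date TEXT, output_date TEXT, processing_time INTEGER)"
)
conn.executemany(
    "INSERT INTO processing VALUES (?, ?, ?)",
    [
        ("2022-01-03", "2022-01-03", 0),
        ("2022-01-03", "2022-01-06", 3),
        ("2022-01-03", "2022-01-11", 8),
        ("2022-01-05", "2022-01-10", 5),
        ("2022-01-05", "2022-01-15", 10),
    ],
)

median_sql = """
WITH ranked AS (
    SELECT input_date, processing_time,
           ROW_NUMBER() OVER (PARTITION BY input_date ORDER BY processing_time) AS rn,
           COUNT(*)     OVER (PARTITION BY input_date) AS cnt
    FROM processing
)
SELECT input_date, AVG(processing_time * 1.0) AS median_processing_time
FROM ranked
-- Integer division: for odd cnt both expressions pick the single middle row,
-- for even cnt they pick the two middle rows.
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
GROUP BY input_date
ORDER BY input_date
"""

medians = conn.execute(median_sql).fetchall()
print(medians)  # [('2022-01-03', 3.0), ('2022-01-05', 7.5)]
```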

Aggregate a subtotal column based on two dates of that same row

Situation:
I have 5 columns
id
subtotal (price of item)
order_date (purchase date)
updated_at (if refunded or any other status change)
status
Objective:
I need the order date as column 1
I need to get the subtotal for each day regardless of the status as column 2
I need the subtotal amount for refunds for the third column.
Example:
If a purchase is made on May 1st and refunded on May 3rd, the output should look like this:
+-------+----------+--------+
| date | subtotal | refund |
+-------+----------+--------+
| 05-01 | 10.00 | 0.00 |
| 05-02 | 00.00 | 0.00 |
| 05-03 | 00.00 | 10.00 |
+-------+----------+--------+
while the source row looks like this:
+-----+----------+------------+------------+----------+
| id | subtotal | order_date | updated_at | status |
+-----+----------+------------+------------+----------+
| 123 | 10 | 2019-05-01 | 2019-05-03 | refunded |
+-----+----------+------------+------------+----------+
Query:
Currently what I have looks like this:
Note: Because of a time zone discrepancy, the dates are shifted back by 8 hours.
;with cte as (
select id as orderid
, CAST(dateadd(hour,-8,order_date) as date) as order_date
, CAST(dateadd(hour,-8,updated_at) as date) as updated_at
, subtotal
, status
from orders
)
select
b.dates
, sum(a.subtotal_price) as subtotal
, -- not sure how to aggregate it to get the refunds
from Orders as o
inner join cte as a on orders.id=cte.orderid
inner join (select * from cte where status = ('refund')) as b on o.id=cte.orderid
where dates between '2019-05-01' and '2019-05-31'
group by dates
And do I need to join it twice? Hopefully not since my table is huge.
This looks like a job for a Calendar Table. Bit of a stab in the dark, but:
--Overly simplistic Calendar table
CREATE TABLE dbo.Calendar (CalendarDate date);
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1, N N2, N N3, N N4, N N5) --Many years of data
INSERT INTO dbo.Calendar
SELECT DATEADD(DAY, T.I, 0)
FROM Tally T;
GO
SELECT C.CalendarDate AS [date],
CASE C.CalendarDate WHEN V.order_date THEN subtotal ELSE 0 END AS subtotal,
CASE WHEN C.CalendarDate = V.updated_at AND V.[status] = 'refunded' THEN subtotal ELSE 0.00 END AS refund
FROM (VALUES(123,10.00,CONVERT(date,'20190501'),CONVERT(date,'20190503'),'refunded'))V(id,subtotal,order_date,updated_at,status)
JOIN dbo.Calendar C ON V.order_date <= C.CalendarDate AND V.updated_at >= C.CalendarDate;
GO
DROP TABLE dbo.Calendar;
Consider joining on a recursive CTE of sequential dates:
WITH dates AS (
SELECT CONVERT(datetime, '2019-01-01') AS rec_date
UNION ALL
SELECT DATEADD(d, 1, CONVERT(datetime, rec_date))
FROM dates
WHERE rec_date < '2019-12-31'
),
cte AS (
SELECT id AS orderid
, CAST(dateadd(hour,-8,order_date) AS date) as order_date
, CAST(dateadd(hour,-8,updated_at) AS date) as updated_at
, subtotal
, status
FROM orders
)
SELECT rec_date AS date,
CASE
WHEN c.order_date = d.rec_date THEN subtotal
ELSE 0
END AS subtotal,
CASE
WHEN c.updated_at = d.rec_date THEN subtotal
ELSE 0
END AS refund
FROM cte c
JOIN dates d ON d.rec_date BETWEEN c.order_date AND c.updated_at
WHERE c.status = 'refunded'
option (maxrecursion 0)
GO
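Both answers above return one row per order and day. To get one row per day, as in the desired output, the date list can be LEFT JOINed to the orders and aggregated, which also avoids joining the orders table twice. This is only a sketch run through Python's sqlite3 module: `WITH RECURSIVE` and `date(x, '+1 day')` are SQLite syntax (SQL Server would use a plain recursive `WITH` and `DATEADD`), the date range is hard-coded to the three sample days, and the 8-hour time zone shift from the question is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, subtotal REAL, order_date TEXT, "
    "updated_at TEXT, status TEXT)"
)
# The single sample row from the question.
conn.execute(
    "INSERT INTO orders VALUES (123, 10.0, '2019-05-01', '2019-05-03', 'refunded')"
)

daily_sql = """
WITH RECURSIVE dates(rec_date) AS (
    SELECT '2019-05-01'
    UNION ALL
    SELECT date(rec_date, '+1 day') FROM dates WHERE rec_date < '2019-05-03'
)
SELECT d.rec_date,
       -- purchases are counted on their order date, whatever the status
       COALESCE(SUM(CASE WHEN o.order_date = d.rec_date THEN o.subtotal END), 0) AS subtotal,
       -- refunds are counted on the day the status changed
       COALESCE(SUM(CASE WHEN o.updated_at = d.rec_date AND o.status = 'refunded'
                         THEN o.subtotal END), 0) AS refund
FROM dates d
LEFT JOIN orders o
       ON d.rec_date = o.order_date
       OR (d.rec_date = o.updated_at AND o.status = 'refunded')
GROUP BY d.rec_date
ORDER BY d.rec_date
"""

daily = conn.execute(daily_sql).fetchall()
for row in daily:
    print(row)
```

Days without any purchase or refund still appear, with zeros in both columns, because the query starts from the date list rather than from the orders.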

PostgreSQL group by with interval

Well, I have a seemingly simple set of data but it gives me a lot of trouble.
This is an example of what my data look like:
quantity price1 price2 date
100 1 0 2018-01-01 10:00:00
200 1 0 2018-01-02 10:00:00
50 5 0 2018-01-02 11:00:00
100 1 1 2018-01-03 10:00:00
100 1 1 2018-01-03 11:00:00
300 1 0 2018-01-03 12:00:00
I need to sum up "quantity" column grouped by "price1" and "price2" and it would be very easy but I need to take into account time changes of "price1" and "price2". Data is sorted by "date".
What I need is the last row to be not grouped with the first two although it has the same values for "price1" and "price2". Also I need to get minimal and maximal date of each interval.
The end result should look like this:
quantity price1 price2 dateStart dateEnd
300 1 0 2018-01-01 10:00:00 2018-01-02 10:00:00
50 5 0 2018-01-02 11:00:00 2018-01-02 11:00:00
200 1 1 2018-01-03 10:00:00 2018-01-03 11:00:00
300 1 0 2018-01-03 12:00:00 2018-01-03 12:00:00
Any suggestions for a SQL query?
This is a gaps-and-islands problem. Use the following code:
select sum(quantity), price1, price2, min(date) dateStart, max(date) dateend
from
(
select *,
row_number() over (order by date) -
row_number() over (partition by price1, price2 order by date) grp
from data
) t
group by price1, price2, grp
order by dateStart
The solution identifies consecutive runs of identical price1 and price2 values by computing the grp column: the difference between the overall row number and the row number within each (price1, price2) partition stays constant as long as the run is uninterrupted. Once the runs are isolated, a simple group by that includes grp does the rest.
I changed the accepted answer slightly to handle cases where two adjacent rows have exactly the same "date" value. I added a second sort key so that such rows are ordered deterministically (my table has an "oid" column):
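To see why the row-number difference works, it helps to print the two row numbers and grp for the sample data. The sketch below does that with Python's sqlite3 module (used here only as a convenient way to run the SQL; the question itself is about PostgreSQL) and then runs the grouping query from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (quantity INTEGER, price1 INTEGER, price2 INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (100, 1, 0, "2018-01-01 10:00:00"),
        (200, 1, 0, "2018-01-02 10:00:00"),
        (50, 5, 0, "2018-01-02 11:00:00"),
        (100, 1, 1, "2018-01-03 10:00:00"),
        (100, 1, 1, "2018-01-03 11:00:00"),
        (300, 1, 0, "2018-01-03 12:00:00"),
    ],
)

# Step 1: show the intermediate row numbers and their difference.
grp_sql = """
SELECT date, price1, price2,
       ROW_NUMBER() OVER (ORDER BY date) AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY price1, price2 ORDER BY date) AS rn_in_price,
       ROW_NUMBER() OVER (ORDER BY date)
         - ROW_NUMBER() OVER (PARTITION BY price1, price2 ORDER BY date) AS grp
FROM data
ORDER BY date
"""
grp_rows = conn.execute(grp_sql).fetchall()
for row in grp_rows:
    print(row)

# Step 2: group by the prices AND grp to get one row per island.
islands_sql = """
SELECT SUM(quantity), price1, price2, MIN(date) AS dateStart, MAX(date) AS dateEnd
FROM (
    SELECT *,
           ROW_NUMBER() OVER (ORDER BY date)
             - ROW_NUMBER() OVER (PARTITION BY price1, price2 ORDER BY date) AS grp
    FROM data
) t
GROUP BY price1, price2, grp
ORDER BY dateStart
"""
islands = conn.execute(islands_sql).fetchall()
for row in islands:
    print(row)
```

For this data grp comes out as 0, 0, 2, 3, 3, 3. The last three rows share grp = 3 even though their prices differ, which is why grp alone is not enough and the group by must include price1 and price2 as well.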
select sum(quantity), price1, price2, min(date) dateStart, max(date) dateend
from
(
select *,
row_number() over (order by date, oid) -
row_number() over (partition by price1, price2 order by date, oid) grp
from data
) t
group by price1, price2, grp
order by dateStart

Get last two entries of each account in table

I've got a script that returns all transactions per day for all accounts and subaccounts. You can see its output in the image. What I want is the last two transactions for each AccountId and SubAccountId. The ideal result would be:
AccountId| SubAccountId| AmountInDay | Date
---------------------------------------------
210 | 1 | 0.00 |2017-06-20 00:00:00.000
210 | 1 | 0.00 |2017-06-05 00:00:00.000
1234 | 1 | 0.00 |2017-06-20 00:00:00.000
1234 | 1 | 0.00 |2017-06-05 00:00:00.000
This is the code of my script:
with CTE1 as
(
select top 2 AccountId, SubAccountId, [Date], sum(Amount_Amount) as Amount
from dbo.PayoutInstallment
group by accountId, SubAccountId, [Date]
)
, CTE2 as
(
select AccountId,SubAccountId, Amount_Amount, [Date],
dense_rank() over (partition by AccountId order by [Date] desc) as rn
from dbo.PayoutInstallment
)
select a1.AccountId,a1.SubAccountId, Sum(a1.Amount_Amount) as AmountInDay, a1.[Date]
from CTE2 a1
left join CTE2 a2
on a1.AccountId = a2.AccountId and a1.[Date] > a2.[Date]
and a2.rn = a1.rn+1
group by a1.[Date], a1.AccountId, a1.SubAccountId
order by a1.[Date] desc
EDIT
Sample Data
AccountId| SubAccountId| AmountInDay | Date
---------------------------------------------
210 | 1 | 0.00 |2017-03-15 00:00:00.000
210 | 1 | 0.00 |2017-04-20 00:00:00.000
210 | 1 | 100.00 |2017-05-17 00:00:00.000
210 | 1 | 1.00 |2017-06-05 00:00:00.000
210 | 1 | 1.00 |2017-06-05 00:00:00.000
1234 | 1 | 0.00 |2017-06-05 00:00:00.000
1234 | 1 | 0.00 |2017-06-05 00:00:00.000
1234 | 1 | 1.00 |2017-06-10 00:00:00.000
1234 | 1 | 1.00 |2017-04-10 00:00:00.000
I think you can use row_number and get 2 records as below:
Select * from (
Select AccountId, SubAccountId, [Date], AmountInDay = sum(Amount_Amount) over (partition by accountid, SubAccountId, [Date])
,RowN = Row_number() over (partition by accountid, SubAccountId, [Date] order by [date] desc)
from dbo.PayoutInstallment
) a where a.RowN <= 2
Assuming one transaction per day:
;WITH cte AS(SELECT *
, ROW_NUMBER() OVER (PARTITION BY AccountId, SubAccountId ORDER BY [Date] DESC) AS Rownum
FROM PayoutInstallment
)
SELECT *
, SUM(AmountInDay) OVER (PARTITION BY AccountId, SubAccountId) AS SumLast2days
FROM cte
WHERE Rownum<=2
If you want the SUM for the last two days, you need to assign a number to each day. Then JOIN the two datasets to bring in all the rows for those days and perform a GROUP BY:
WITH cte as (
SELECT DISTINCT AccountId, SubAccountId, [Date],
DENSE_RANK() OVER (PARTITION BY AccountId, SubAccountId
ORDER BY [Date] DESC) AS rn -- every row of the same day gets the same number
FROM dbo.PayoutInstallment
)
SELECT P.AccountId,
P.SubAccountId,
P.[Date],
SUM(P.Amount_Amount) AS AmountInDay
FROM dbo.PayoutInstallment P
JOIN cte C
ON P.[Date] = C.[Date]
AND P.AccountId = C.AccountId
AND P.SubAccountId = C.SubAccountId
WHERE C.rn <= 2 -- Just the last two days of each account, subaccount
GROUP BY P.AccountId,
P.SubAccountId,
P.[Date]
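Another way to get the last two days per account is to aggregate to one row per day first and only then number the days, so no self-join is needed. The sketch below tries this on the sample data from the question using Python's sqlite3 module as a stand-in for SQL Server; it assumes the sample rows are individual transactions stored in an `Amount_Amount` column, as in the question's script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PayoutInstallment "
    "(AccountId INTEGER, SubAccountId INTEGER, Amount_Amount REAL, [Date] TEXT)"
)
conn.executemany(
    "INSERT INTO PayoutInstallment VALUES (?, ?, ?, ?)",
    [
        (210, 1, 0.0, "2017-03-15"),
        (210, 1, 0.0, "2017-04-20"),
        (210, 1, 100.0, "2017-05-17"),
        (210, 1, 1.0, "2017-06-05"),
        (210, 1, 1.0, "2017-06-05"),
        (1234, 1, 0.0, "2017-06-05"),
        (1234, 1, 0.0, "2017-06-05"),
        (1234, 1, 1.0, "2017-06-10"),
        (1234, 1, 1.0, "2017-04-10"),
    ],
)

last_two_days_sql = """
WITH daily AS (
    -- one row per account, subaccount and day
    SELECT AccountId, SubAccountId, [Date], SUM(Amount_Amount) AS AmountInDay
    FROM PayoutInstallment
    GROUP BY AccountId, SubAccountId, [Date]
),
ranked AS (
    -- number the days per account, newest first
    SELECT daily.*,
           ROW_NUMBER() OVER (PARTITION BY AccountId, SubAccountId
                              ORDER BY [Date] DESC) AS rn
    FROM daily
)
SELECT AccountId, SubAccountId, AmountInDay, [Date]
FROM ranked
WHERE rn <= 2
ORDER BY AccountId, SubAccountId, [Date] DESC
"""

last_two = conn.execute(last_two_days_sql).fetchall()
for row in last_two:
    print(row)
```

Because the rows are already grouped per day before ROW_NUMBER is applied, two transactions on the same day count as one day rather than using up both slots.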
I see you are using GROUP BY: if you want to filter the results after the grouping, use HAVING; otherwise use WHERE. Here is an example of a WHERE clause you can use in your query to keep only rows from the last two days.
WHERE (a1.[Date] BETWEEN DATEADD(day, -2, GETDATE()) AND GETDATE())