I want to query the table so that the running total keeps carrying over to the latest period as long as the value does not fall to 0.
Assume I have a table with the values below:
| Name | Period  | Value |
| A    | 02/2022 | 2     |
| A    | 03/2022 | 5     |
| A    | 04/2022 | 3     |
| A    | 05/2022 | 7     |
| B    | 02/2022 | 9     |
| B    | 04/2022 | 6     |
I want my result to be:
| Name | Period  | Value |
| A    | 02/2022 | 2     |
| A    | 03/2022 | 7     |
| A    | 04/2022 | 10    |
| A    | 05/2022 | 17    |
| B    | 02/2022 | 9     |
| B    | 03/2022 | 9     |
| B    | 04/2022 | 15    |
| B    | 05/2022 | 15    |
My current query is:
SELECT
     PERIOD
    ,NAME
    ,SUM(SUM(Value)) OVER (PARTITION BY NAME ORDER BY PERIOD) AS balance
FROM table
WHERE Period < CURRENT_DATE()
GROUP BY
     1
    ,2
This results in the running total stopping at the latest period in which activity occurred:
| Name | Period  | Value |
| A    | 02/2022 | 2     |
| A    | 03/2022 | 7     |
| A    | 04/2022 | 10    |
| A    | 05/2022 | 17    |
| B    | 02/2022 | 9     |
| B    | 04/2022 | 15    |
OK, you haven't had an answer in a full day, so even though I work in T-SQL I'll try a solution that's ANSI SQL compatible. Bear with me if my syntax is off a bit.
Before we start, check your "Desired Output", you're currently showing a running total for A on 3/22 of 5, but you had a 2 for A on 2/22 so it should be a running total of 7, right?
Anyway, assuming that's just a typo, I'd approach this by making a few CTEs that build a list of all {PERIOD, NAME} pairs you want reported, then JOIN your actual data to that. There are a number of ways to generate the dates; the easiest is to use DISTINCT if your actual data is fairly robust, but I can describe other methods if that assumption does not hold for your data.
So with all that in mind, here is my solution. I put your sample data in a CTE for portability; just replace my "cteTabA" with whatever your data table is really named.
--Code sample data as a CTE for portability
--(Value must be numeric, not a quoted string, or SUM will fail)
;with cteTabA as (
SELECT *
FROM ( VALUES
('A', '02/2022', 2)
, ('A', '03/2022', 5)
, ('A', '04/2022', 3)
, ('A', '05/2022', 7)
, ('B', '02/2022', 9)
, ('B', '04/2022', 6)
) as TabA(Name, Period, Value)
) --END of sample data, actual query below
--First, build a list of periods to use. If your data set is full, just select DISTINCT
, cteDates as ( --but there are other ways if this doesn't work for you - let me know!
SELECT DISTINCT Period FROM cteTabA
) --Next, build a list of names to report on
, cteNames as (
SELECT DISTINCT Name FROM cteTabA
) --Now build your table that has all periods for all names
, cteRepOn as (
SELECT * FROM cteNames CROSS JOIN cteDates
)--Now assemble a table that has entries for each period for each name,
--but fill in zeroes for those you don't actually have data for
, cteFullList as (
SELECT L.*, COALESCE(D.Value, 0) as Value
FROM cteRepOn as L
LEFT OUTER JOIN cteTabA as D on L.Name = D.Name AND L.Period = D.Period
)--Now your query works as expected with the gaps filled in
SELECT PERIOD, NAME, Value
,SUM(Value) OVER (PARTITION BY NAME ORDER BY PERIOD) AS balance
FROM cteFullList
WHERE Period < '06/2022'--CURRENT_DATE()
ORDER BY NAME, PERIOD
This produces the following output:
| PERIOD  | NAME | Value | balance |
| 02/2022 | A    | 2     | 2       |
| 03/2022 | A    | 5     | 7       |
| 04/2022 | A    | 3     | 10      |
| 05/2022 | A    | 7     | 17      |
| 02/2022 | B    | 9     | 9       |
| 03/2022 | B    | 0     | 9       |
| 04/2022 | B    | 6     | 15      |
| 05/2022 | B    | 0     | 15      |
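For anyone who wants to sanity-check the result, here is a minimal sketch of the same gap-fill-then-window-sum idea, run against SQLite through Python's sqlite3 (SQLite and the table name are my stand-ins, not part of the original answer; the CTE chain mirrors the one above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TabA (Name TEXT, Period TEXT, Value INTEGER);
INSERT INTO TabA VALUES
  ('A','02/2022',2),('A','03/2022',5),('A','04/2022',3),('A','05/2022',7),
  ('B','02/2022',9),('B','04/2022',6);
""")

rows = conn.execute("""
WITH cteDates AS (SELECT DISTINCT Period FROM TabA),
     cteNames AS (SELECT DISTINCT Name  FROM TabA),
     -- all {Name, Period} pairs we want reported
     cteRepOn AS (SELECT Name, Period FROM cteNames CROSS JOIN cteDates),
     -- fill zeroes for pairs that have no data
     cteFullList AS (
       SELECT L.Name, L.Period, COALESCE(D.Value, 0) AS Value
       FROM cteRepOn L
       LEFT JOIN TabA D ON L.Name = D.Name AND L.Period = D.Period
     )
SELECT Name, Period,
       SUM(Value) OVER (PARTITION BY Name ORDER BY Period) AS balance
FROM cteFullList
ORDER BY Name, Period
""").fetchall()

for r in rows:
    print(r)
```

Note that the MM/YYYY strings only sort correctly here because every period falls in the same year; real data should use a proper date type.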
Let's say I have a table which stores ItemID, Date and Total_shipped over a period of time:
ItemID | Date      | Total_shipped
----------------------------------
1      | 1/20/2000 | 2
2      | 1/20/2000 | 3
1      | 1/21/2000 | 5
2      | 1/21/2000 | 4
1      | 1/22/2000 | 1
2      | 1/22/2000 | 7
1      | 1/23/2000 | 5
2      | 1/23/2000 | 6
Now I want to aggregate based on several periods of time. For example, I want to know how many of each item were shipped every two days and in total. So the desired output should look something like:
ItemID | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
_____________________________________________
1 | 7 | 6 | 13
2 | 7 | 13 | 20
How do I do that in the most efficient way?
I know I can make three different subqueries, but I think there should be a better way. My real data is large and there are several different time periods to be considered; i.e., in my real problem I want the shipped items for current_week, last_week, two_weeks_ago, three_weeks_ago, last_month, two_months_ago, three_months_ago, so I do not think writing seven different subqueries would be a good idea.
Here is the general idea of what I can already run, but it is very expensive for the database:
WITH sq1 as (
    SELECT ItemID, sum(Total_shipped) sum1
    FROM Table
    WHERE Date BETWEEN '1/20/2000' and '1/21/2000'
    GROUP BY ItemID
),
sq2 as (
    SELECT ItemID, sum(Total_Shipped) sum2
    FROM Table
    WHERE Date BETWEEN '1/22/2000' and '1/23/2000'
    GROUP BY ItemID
),
sq3 as (
    SELECT ItemID, sum(Total_Shipped) sum3
    FROM Table
    GROUP BY ItemID
)
SELECT ItemID, sq1.sum1, sq2.sum2, sq3.sum3
FROM Table
JOIN sq1 on Table.ItemID = sq1.ItemID
JOIN sq2 on Table.ItemID = sq2.ItemID
JOIN sq3 on Table.ItemID = sq3.ItemID
I don't know why you have tagged this question with multiple databases.
Anyway, you can use conditional aggregation, as follows, in Oracle:
select
item_id,
sum(case when "date" between date'2000-01-20' and date'2000-01-21' then total_shipped end) as "Jan20-Jan21",
sum(case when "date" between date'2000-01-22' and date'2000-01-23' then total_shipped end) as "Jan22-Jan23",
sum(case when "date" between date'2000-01-20' and date'2000-01-23' then total_shipped end) as "Jan20-Jan23"
from my_table
group by item_id
Cheers!!
Use FILTER:
select
item_id,
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-21') as "Jan20-Jan21",
sum(total_shipped) filter (where date between '2000-01-22' and '2000-01-23') as "Jan22-Jan23",
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-23') as "Jan20-Jan23"
from my_table
group by 1
item_id | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
---------+-------------+-------------+-------------
1 | 7 | 6 | 13
2 | 7 | 13 | 20
(2 rows)
Db<>fiddle.
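Both answers are easy to verify with a quick sketch. Here is the conditional-aggregation (CASE) variant run in SQLite via Python's sqlite3 (SQLite and the names shipments/ship_date are illustrative stand-ins; the date column is renamed to avoid quoting a reserved word):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shipments (item_id INTEGER, ship_date TEXT, total_shipped INTEGER);
INSERT INTO shipments VALUES
  (1,'2000-01-20',2),(2,'2000-01-20',3),(1,'2000-01-21',5),(2,'2000-01-21',4),
  (1,'2000-01-22',1),(2,'2000-01-22',7),(1,'2000-01-23',5),(2,'2000-01-23',6);
""")

# one pass over the table; each CASE picks out the rows for one window
rows = conn.execute("""
SELECT item_id,
       SUM(CASE WHEN ship_date BETWEEN '2000-01-20' AND '2000-01-21'
                THEN total_shipped END) AS "Jan20-Jan21",
       SUM(CASE WHEN ship_date BETWEEN '2000-01-22' AND '2000-01-23'
                THEN total_shipped END) AS "Jan22-Jan23",
       SUM(total_shipped) AS "Jan20-Jan23"
FROM shipments
GROUP BY item_id
ORDER BY item_id
""").fetchall()
```

Because every window is expressed as a CASE (or FILTER) inside one GROUP BY, adding the seven real-world windows means adding seven expressions, not seven scans of the table.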
I have a table with columns t_b, t_e, x, where [t_b, t_e) denotes a period during which x resources were used. I want to compute a table where, for each hour h, I have the amount of resources that were used during the [h, h+1) period.
So far my only idea was to generate multiple rows from each input row, one per hour (I use an extension of SQL with UDFs), and then simply group by hour, but I'm afraid this may be too slow considering the large amount of data at hand.
Say for example I have a table with two rows:
+-----+-----+---+
| t_b | t_e | x |
+-----+-----+---+
| 1 | 3.5 | a |
| 0.5 | 4 | b |
+-----+-----+---+
Then resulting table should be:
+---+-------------+
| h | x |
+---+-------------+
| 0 | 0*a + 0.5*b |
| 1 | 1*a + 1*b |
| 2 | 1*a + 1*b |
| 3 | 0.5*a + 1*b |
+---+-------------+
You can have a trigger on insert into the stats table that also adds to the aggregate table (the per-hour sums).
If you also need to convert the existing data, you need to run over every row of your current table, split it into amounts/hours and add to the aggregate table.
This is a SQL Server example for all-numeric columns:
with h as (
-- your hours tally here
select top(24) row_number() over(order by (select null)) eoh from sys.all_objects
), myTable as (
select 1 t_b, 3.5 t_e, 20 v union all
select 0.5, 4, 40
)
select eoh-1 h_starth
, sum(v * (case when t_e < eoh then t_e else eoh end - case when t_b > eoh-1 then t_b else eoh-1 end)) usage
from h
left join myTable t on t_e > eoh - 1 and eoh > t_b -- [..) intersection with [..)
group by eoh;
Fiddle
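The interval arithmetic in that query (clamp each [t_b, t_e) interval to the hour's bounds and take the difference) can be sketched outside SQL as well. A minimal Python version, keeping the resource labels symbolic as in the question's expected output:

```python
import math
from collections import defaultdict

# (t_b, t_e, label) rows from the question; labels stand in for the x weights
rows = [(1.0, 3.5, "a"), (0.5, 4.0, "b")]

usage = defaultdict(dict)
for t_b, t_e, x in rows:
    # only hours that can intersect [t_b, t_e)
    for h in range(math.floor(t_b), math.ceil(t_e)):
        # length of [t_b, t_e) ∩ [h, h+1)
        overlap = min(t_e, h + 1) - max(t_b, h)
        if overlap > 0:
            usage[h][x] = overlap
```

Each hour h ends up mapping each label to its fractional coverage, e.g. hour 3 gets half of a and all of b, matching the 0.5*a + 1*b row in the expected table.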
Using PostgreSQL 9.4, I have a table like this:
CREATE TABLE products
AS
SELECT id::uuid, title, kind, created_at
FROM ( VALUES
( '61c5292d-41f3-4e86-861a-dfb5d8225c8e', 'foo', 'standard' , '2017/04/01' ),
( 'def1d3f9-3e55-4d1b-9b42-610d5a46631a', 'bar', 'standard' , '2017/04/02' ),
( 'cc1982ab-c3ee-4196-be01-c53e81b53854', 'qwe', 'standard' , '2017/04/03' ),
( '919c03b5-5508-4a01-a97b-da9de0501f46', 'wqe', 'standard' , '2017/04/04' ),
( 'b3d081a3-dd7c-457f-987e-5128fb93ce13', 'tyu', 'other' , '2017/04/05' ),
( 'c6e9e647-e1b4-4f04-b48a-a4229a09eb64', 'ert', 'irregular', '2017/04/06' )
) AS t(id,title,kind,created_at);
I need to split the data into n same-size parts. If this table had a regular serial id it would be easier, but since it has a uuid I can't use modulo operations (as far as I know).
So far I did this:
SELECT * FROM products
WHERE kind = 'standard'
ORDER BY created_at
LIMIT(
SELECT count(*)
FROM products
WHERE kind = 'standard'
)/2
OFFSET(
(
SELECT count(*)
FROM products
WHERE kind = 'standard'
)/2
)*1;
It works fine, but running the same count three times doesn't seem like a good idea; the count is not "expensive", but every time someone wants to modify/update the query they will need to do it in all three sections.
Note that currently n is set to 2 and the offset is set to 1, but both can take other values. Also, LIMIT rounds down, so there may be a missing value; I can fix that by other means, but having it in the query would be nice.
You can see the example here
Just to dispel a myth: you can never use a serial and modulus to get parts, because a serial isn't guaranteed to be gapless. You can use row_number() though.
SELECT row_number() OVER () % 3 AS parts, * FROM products;
parts | id | title | kind | created_at
-------+--------------------------------------+-------+-----------+------------
1 | 61c5292d-41f3-4e86-861a-dfb5d8225c8e | foo | standard | 2017/04/01
2 | def1d3f9-3e55-4d1b-9b42-610d5a46631a | bar | standard | 2017/04/02
0 | cc1982ab-c3ee-4196-be01-c53e81b53854 | qwe | standard | 2017/04/03
1 | 919c03b5-5508-4a01-a97b-da9de0501f46 | wqe | standard | 2017/04/04
2 | b3d081a3-dd7c-457f-987e-5128fb93ce13 | tyu | other | 2017/04/05
0 | c6e9e647-e1b4-4f04-b48a-a4229a09eb64 | ert | irregular | 2017/04/06
(6 rows)
This won't produce equal parts unless the number of parts divides the row count evenly.
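If roughly-equal contiguous parts are acceptable, NTILE is an alternative worth knowing: it numbers n groups directly and puts any remainder into the earlier groups. A small sketch against SQLite via Python's sqlite3 (SQLite and the uuid-N ids are illustrative stand-ins):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id TEXT, created_at TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(f"uuid-{i}", f"2017/04/0{i}") for i in range(1, 7)])

# ntile(2) splits the ordered rows into 2 buckets as evenly as possible
parts = conn.execute("""
SELECT ntile(2) OVER (ORDER BY created_at) AS part, id
FROM products
ORDER BY created_at
""").fetchall()
```

Unlike row_number() % n, which deals rows round-robin, NTILE keeps each part contiguous in the ORDER BY, which matches the LIMIT/OFFSET slicing the question is trying to replace.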
I'm stuck finding a solution to the problem of computing daily profits from a db (MS Access) table. The difference from other tips I found online is that I don't have separate "Price" and "Cost" fields, but a "Type" field which distinguishes whether a row is a revenue ("S") or a cost ("C").
This is the table "Record":
| Date  | Price | Quantity | Type |
-----------------------------------
| 01/02 | 20    | 2        | C    |
| 01/02 | 10    | 1        | S    |
| 01/02 | 3     | 10       | S    |
| 01/02 | 5     | 2        | C    |
| 03/04 | 12    | 3        | C    |
| 03/03 | 200   | 1        | S    |
| 03/03 | 120   | 2        | C    |
So far I have tried different solutions, like:
SELECT
    (SELECT SUM(RS.Price * RS.Quantity)
     FROM Record RS WHERE RS.Type = 'S' GROUP BY RS.Date
    ) as totalSales,
    (SELECT SUM(RC.Price * RC.Quantity)
     FROM Record RC WHERE RC.Type = 'C' GROUP BY RC.Date
    ) as totalLosses,
    ROUND(totalSales - totalLosses, 2) as NetTotal,
    R.Date
FROM Record R
In my mind it could work, but obviously it doesn't. I also tried:
SELECT RC.Date, ROUND(SUM(RC.Price * RC.Quantity), 2) as DailyLoss
INTO #DailyLosses
FROM Record RC
WHERE RC.Type = 'C' GROUP BY RC.Date

SELECT RS.Date, ROUND(SUM(RS.Price * RS.Quantity), 2) as DailyRevenue
INTO #DailyRevenues
FROM Record RS
WHERE RS.Type = 'S' GROUP BY RS.Date

SELECT Date, DailyRevenue - DailyLoss as DailyProfit
FROM #DailyLosses dlos, #DailyRevenues drev
WHERE dlos.Date = drev.Date
My problem, beyond the correct syntax, is the right approach to this kind of problem.
You can use grouping and conditional summing. Try this:
SELECT data.Date, data.Income - data.Cost as Profit
FROM (
    SELECT Record.Date as Date,
           SUM(IIF(Record.Type = 'S', Record.Price * Record.Quantity, 0)) as Income,
           SUM(IIF(Record.Type = 'C', Record.Price * Record.Quantity, 0)) as Cost
    FROM Record
    GROUP BY Record.Date
) data
In this case you first create a sub-query to get separate fields for Income and Cost, and then your outer query uses subtraction to get actual profit.
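A quick way to convince yourself the conditional-sum approach is right is to run it over the sample rows. Here is a sketch using SQLite via Python's sqlite3, with CASE standing in for Access's IIF (SQLite is a stand-in, and the date column is renamed d to sidestep the reserved word):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Record (d TEXT, Price REAL, Quantity INTEGER, Type TEXT);
INSERT INTO Record VALUES
  ('01/02',20,2,'C'),('01/02',10,1,'S'),('01/02',3,10,'S'),('01/02',5,2,'C'),
  ('03/04',12,3,'C'),('03/03',200,1,'S'),('03/03',120,2,'C');
""")

# one GROUP BY per day; the CASEs split each row into revenue or cost
rows = conn.execute("""
SELECT d,
       SUM(CASE WHEN Type = 'S' THEN Price * Quantity ELSE 0 END)
     - SUM(CASE WHEN Type = 'C' THEN Price * Quantity ELSE 0 END) AS DailyProfit
FROM Record
GROUP BY d
ORDER BY d
""").fetchall()
```

All three daily profits come out negative for the sample data, which is consistent with the costs outweighing the sales in every group.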
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Calculate a Running Total in SqlServer
Consider this data:
Day | OrderCount
  1 | 3
  2 | 2
  3 | 11
  4 | 3
  5 | 6
How can I get this accumulation of OrderCount (running value) result set using a T-SQL query?
Day | OrderCount | OrderCountRunningValue
  1 | 3          | 3
  2 | 2          | 5
  3 | 11         | 16
  4 | 3          | 19
  5 | 6          | 25
I can easily do this with a loop in the actual query (using a #table) or in my C# code-behind, but it's so slow (considering that I also get the orders per day) when I'm processing thousands of records, so I'm looking for a better / more efficient approach, hopefully without loops: something like a recursive CTE or something else.
Any idea would be greatly appreciated. TIA
As you seem to need these results in the client rather than for use within another SQL query, you are probably better off not doing this in SQL.
(The linked question in my comment shows 'the best' option within SQL, if that is in fact necessary.)
What may be recommended is to pull the Day and OrderCount values as one result set (SELECT day, orderCount FROM yourTable ORDER BY day) and then calculate the running total in your C#.
Your C# code will be able to iterate through the dataset efficiently, and will almost certainly outperform the SQL approaches. What this does do, is to transfer some load from the SQL Server to the web-server, but at an overall (and significant) resource saving.
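The client-side accumulation described above is a single pass over the ordered rows. A sketch in Python (the C# equivalent is just a running-sum variable in a foreach; the row values are the question's sample data):

```python
from itertools import accumulate

# rows as fetched from: SELECT day, orderCount FROM yourTable ORDER BY day
rows = [(1, 3), (2, 2), (3, 11), (4, 3), (5, 6)]

# running total in one pass, no loop over the table per row
running = list(accumulate(count for _, count in rows))
```

Whatever the language, the point is the same: the client does O(n) work with a single scalar of state, instead of the O(n²) self-join the pure-SQL answers below need.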
SELECT t.Day,
t.OrderCount,
(SELECT SUM(t1.OrderCount) FROM table t1 WHERE t1.Day <= t.Day)
AS OrderCountRunningValue
FROM table t
SELECT
t.day,
t.orderCount,
SUM(t1.orderCount) orderCountRunningValue
FROM
table t INNER JOIN table t1 ON t1.day <= t.day
group by t.day,t.orderCount
CTEs to the rescue (again):
DROP TABLE tmp.sums;
CREATE TABLE tmp.sums
( id INTEGER NOT NULL
, zdate timestamp not null
, amount integer NOT NULL
);
INSERT INTO tmp.sums (id,zdate,amount) VALUES
(1, '2011-10-24', 1 ),(1, '2011-10-25', 2 ),(1, '2011-10-26', 3 )
,(2, '2011-10-24', 11 ),(2, '2011-10-25', 12 ),(2, '2011-10-26', 13 )
;
WITH RECURSIVE list AS (
-- Terminal part
SELECT t0.id, t0.zdate
, t0.amount AS amount
, t0.amount AS runsum
FROM tmp.sums t0
WHERE NOT EXISTS (
SELECT * FROM tmp.sums px
WHERE px.id = t0.id
AND px.zdate < t0.zdate
)
UNION
-- Recursive part
SELECT p1.id AS id
, p1.zdate AS zdate
, p1.amount AS amount
, p0.runsum + p1.amount AS runsum
FROM tmp.sums AS p1
, list AS p0
WHERE p1.id = p0.id
AND p0.zdate < p1.zdate
AND NOT EXISTS (
SELECT * FROM tmp.sums px
WHERE px.id = p1.id
AND px.zdate < p1.zdate
AND px.zdate > p0.zdate
)
)
SELECT * FROM list
ORDER BY id, zdate;
The output:
DROP TABLE
CREATE TABLE
INSERT 0 6
id | zdate | amount | runsum
----+---------------------+--------+--------
1 | 2011-10-24 00:00:00 | 1 | 1
1 | 2011-10-25 00:00:00 | 2 | 3
1 | 2011-10-26 00:00:00 | 3 | 6
2 | 2011-10-24 00:00:00 | 11 | 11
2 | 2011-10-25 00:00:00 | 12 | 23
2 | 2011-10-26 00:00:00 | 13 | 36
(6 rows)
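For what it's worth, on any engine with window functions (SQL Server 2012+, PostgreSQL 8.4+, SQLite 3.25+) the running total this question asks for no longer needs a self-join or recursion. A sketch against SQLite via Python's sqlite3 (SQLite and the table name orders are stand-ins):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (day INTEGER, order_count INTEGER);
INSERT INTO orders VALUES (1,3),(2,2),(3,11),(4,3),(5,6);
""")

# SUM() OVER (ORDER BY day) accumulates from the first row to the current one
rows = conn.execute("""
SELECT day, order_count,
       SUM(order_count) OVER (ORDER BY day) AS OrderCountRunningValue
FROM orders
ORDER BY day
""").fetchall()
```

This is a single ordered scan, so it scales to the "thousands of records" case far better than either the correlated subquery or the recursive CTE.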