SQL : Sum by criteria - sql

I'm working with Oracle and cannot achieve the query I need for the moment.
Suppose I have the following table :
- ID Date Type Value
- 1 01/12/2016 prod 1
- 2 01/01/2017 test 10
- 3 01/06/2017 test 20
- 4 01/12/2017 prod 30
- 5 15/12/2017 test 40
- 6 01/01/2018 test 50
- 7 01/06/2018 test 60
- 8 01/12/2018 prod 70
I need to sum the VALUES between the "prod" TYPES + the last "prod" VALUE.
The results should be :
- 1 01/12/2016 - 1
- 2 01/01/2017 - 60
- 3 01/06/2017 - 60
- 4 01/12/2017 - 60
- 5 15/12/2017 - 220
- 6 01/01/2018 - 220
- 7 01/06/2018 - 220
- 8 01/12/2018 - 220
I first had to sum VALUES by YEAR without taking TYPES into account.
The need changed and I don't see how to start to identify, for each line, which is the previous "prod" DATE and sum each VALUE including the last "prod" TYPE.
Thanks

You can define the groups using a cumulative sum on type = 'prod', taken in reverse order, then use a window function for the final summation. Note that string comparisons in Oracle are case-sensitive, so the literal has to match the stored value ('prod', not 'PROD'):
select t.*,
       sum(value) over (partition by grp) as total
from (select t.*,
             sum(case when type = 'prod' then 1 else 0 end) over (order by id desc) as grp
      from t
     ) t
order by id;
To see the grouping logic, look at:
ID Date Type Value Grp
1 01/12/2016 prod 1 3
2 01/01/2017 test 10 2
3 01/06/2017 test 20 2
4 01/12/2017 prod 30 2
5 15/12/2017 test 40 1
6 01/01/2018 test 50 1
7 01/06/2018 test 60 1
8 01/12/2018 prod 70 1
This identifies the groups that need to be summed. The DESC is because "prod" ends a group. If "prod" started a group (i.e. was included with the sum on the next row), then ASC would be used.
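The grouping logic above can be tried out with a minimal sketch like the following. It is not the Oracle setup from the question: it assumes Python's bundled sqlite3 module with SQLite 3.25 or later (needed for window functions), and the date column is renamed to the hypothetical name dt with ISO-formatted strings.

```python
import sqlite3

# Sample rows from the question: (id, dt, type, value).
# "dt" is an assumed column name; dates are rewritten in ISO format.
rows = [
    (1, "2016-12-01", "prod", 1),
    (2, "2017-01-01", "test", 10),
    (3, "2017-06-01", "test", 20),
    (4, "2017-12-01", "prod", 30),
    (5, "2017-12-15", "test", 40),
    (6, "2018-01-01", "test", 50),
    (7, "2018-06-01", "test", 60),
    (8, "2018-12-01", "prod", 70),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT, type TEXT, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Inner query: reverse cumulative count of 'prod' rows gives the group number.
# Outer query: sum the values within each group.
query = """
SELECT id, grp, SUM(value) OVER (PARTITION BY grp) AS total
FROM (
    SELECT t.*,
           SUM(CASE WHEN type = 'prod' THEN 1 ELSE 0 END)
               OVER (ORDER BY id DESC) AS grp
    FROM t
) AS g
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each printed tuple is (id, grp, total), so the grp column can be compared directly with the table above.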

Gordon Linoff's answer is great.
The following is just a different flavor, using MATCH_RECOGNIZE (Oracle 12c+).
Setup:
ALTER SESSION SET NLS_DATE_FORMAT = 'DD/MM/YYYY';
CREATE TABLE TEST_TABLE(
THE_ID INTEGER,
THE_DATE DATE,
THE_TYPE CHAR(4),
THE_VALUE INTEGER);
INSERT INTO TEST_TABLE VALUES (1,TO_DATE('01/12/2016'),'prod',1);
INSERT INTO TEST_TABLE VALUES (2,TO_DATE('01/01/2017'),'test',10);
INSERT INTO TEST_TABLE VALUES (3,TO_DATE('01/06/2017'),'test',20);
INSERT INTO TEST_TABLE VALUES (4,TO_DATE('01/12/2017'),'prod',30);
INSERT INTO TEST_TABLE VALUES (5,TO_DATE('15/12/2017'),'test',40);
INSERT INTO TEST_TABLE VALUES (6,TO_DATE('01/01/2018'),'test',50);
INSERT INTO TEST_TABLE VALUES (7,TO_DATE('01/06/2018'),'test',60);
INSERT INTO TEST_TABLE VALUES (8,TO_DATE('01/12/2018'),'prod',70);
COMMIT;
Query:
SELECT
THE_ID, THE_DATE, MAX(RUNNING_GROUP_SUM) OVER (PARTITION BY THE_MATCH_NUMBER) AS GROUP_SUM
FROM TEST_TABLE
MATCH_RECOGNIZE (
ORDER BY THE_ID
MEASURES
MATCH_NUMBER() AS THE_MATCH_NUMBER,
RUNNING SUM(THE_VALUE) AS RUNNING_GROUP_SUM
ALL ROWS PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (TEST_TARGET{0,} PROD_TARGET)
DEFINE TEST_TARGET AS THE_TYPE = 'test',
PROD_TARGET AS THE_TYPE = 'prod')
ORDER BY THE_ID ASC;
Result:
THE_ID THE_DATE GROUP_SUM
---------- ---------- ----------
1 01/12/2016 1
2 01/01/2017 60
3 01/06/2017 60
4 01/12/2017 60
5 15/12/2017 220
6 01/01/2018 220
7 01/06/2018 220
8 01/12/2018 220
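For readers without a 12c+ instance at hand, here is a rough plain-Python emulation of what the pattern TEST_TARGET{0,} PROD_TARGET does. It is only a sketch: the function name group_sums is made up for illustration, and it assumes the rows arrive ordered by id and contain only the types 'test' and 'prod'.

```python
from typing import List, Tuple


def group_sums(rows: List[Tuple[int, str, int]]) -> List[Tuple[int, int]]:
    """Return (id, group_sum) pairs, where a group is a run of zero or more
    'test' rows closed by one 'prod' row."""
    out: List[Tuple[int, int]] = []
    pending: List[int] = []  # ids of the rows in the currently open match
    running = 0              # running sum of values in the open match
    for row_id, row_type, value in rows:
        pending.append(row_id)
        running += value
        if row_type == "prod":
            # The 'prod' row closes the match: every row in it gets the total.
            out.extend((i, running) for i in pending)
            pending, running = [], 0
    # Trailing 'test' rows with no closing 'prod' never complete a match,
    # so they are left out, as unmatched rows are by default.
    return out


# Values taken from the table in the question: (id, type, value).
sample = [
    (1, "prod", 1), (2, "test", 10), (3, "test", 20), (4, "prod", 30),
    (5, "test", 40), (6, "test", 50), (7, "test", 60), (8, "prod", 70),
]
sums = group_sums(sample)
for pair in sums:
    print(pair)
```

The last value of the running sum in each match is what MAX(RUNNING_GROUP_SUM) picks up in the query above.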

Related

How to extract timestamp differences in hours and do cumulative sum

CREATE TABLE test (
id NUMBER(10),
due_dt TIMESTAMP(6),
status VARCHAR2(10),
created_on TIMESTAMP(6),
act_taken_on TIMESTAMP(6)
);
insert into test values(1,'21-SEP-22 02.53.10.016537 AM','created','19-SEP-22 02.53.10.016537 AM','20-SEP-22 02.53.10.016537 AM');
insert into test values(1,'21-SEP-22 02.53.10.016537 AM','created','20-SEP-22 02.53.10.016537 AM','21-SEP-22 02.53.10.016537 AM');
insert into test values(2,'21-SEP-22 02.53.10.016537 AM','Approved','21-SEP-22 02.53.10.016537 AM','22-SEP-22 02.53.10.016537 AM');
DB Version: Oracle SQL Developer 18c
I have a table from which I need to calculate timestamp differences, shown in hours only, as explained below.
The following columns need to be populated:
aging : When status is created, take the difference between created_on and act_taken_on, shown in hours only. If the id repeats, the value should be added to the previous one.
In my sample data set there are two rows for id 1, so the aging column for the first row will be approx. 24 hrs, and for the second row it should be added to the previous value (24 hrs), giving approx. 48 hrs.
time_left: the difference between due_dt and sysdate, shown in hours only.
My attempt:
SELECT id,
CASE
when status = 'created' then (created_on - act_taken_on)
when status = 'Approved' then (created_on - act_taken_on)
end aging,
(due_dt - sysdate) time_left
from test;
Expected output:
+----+---------+-----------+
| id | Aging | time_left |
+----+---------+-----------+
| 1 | 24Hours | 48hours |
| 1 | 48Hours | 48Hours |
| 2 | 24Hours | 48Hours |
+----+---------+-----------+
Here's one option, which casts timestamps as date as you only need rounded hours and uses sum function in its analytic form.
Sample data:
SQL> select * from test order by id, created_on;
ID DUE_DT STATUS CREATED_ON ACT_TAKEN_ON
--- ------------------------- ---------- ------------------------- -------------------------
1 21.09.22 02:53:10,016537 created 19.09.22 02:53:10,016537 20.09.22 02:53:10,016537
1 21.09.22 02:53:10,016537 created 20.09.22 02:53:10,016537 21.09.22 02:53:10,016537
2 21.09.22 02:53:10,016537 Approved 21.09.22 02:53:10,016537 22.09.22 02:53:10,016537
Right now is
SQL> select sysdate from dual;
SYSDATE
-------------------
19.09.2022 13:17:07
Query:
SQL> select id,
2 round (
3 sum (
4 (cast (act_taken_on as date) - cast (created_on as date)) * 24)
5 over (partition by id order by created_on),
6 0) aging,
7 round ((cast (due_dt as date) - sysdate) * 24, 0) time_left
8 from test;
ID AGING TIME_LEFT
--- ---------- ----------
1 24 38
1 48 38
2 24 38
SQL>
As you commented that you'd want to include CASE expression that regards the status column (but you're getting some errors), I'm not sure what exactly you meant to do with that, but - here's how (time_left has changed as sysdate now returns 20.09.2022 07:02):
SQL> select id,
2 sum (
3 case
4 when status = 'created'
5 then
6 (cast (act_taken_on as date) - cast (created_on as date))
7 * 24
8 when status = 'Approved'
9 then
10 (cast (act_taken_on as date) - cast (created_on as date))
11 * 24
12 end)
13 over (partition by id order by created_on) aging,
14 round ((cast (due_dt as date) - sysdate) * 24, 0) time_left
15 from test;
ID AGING TIME_LEFT
---------- ---------- ----------
1 24 20
1 48 20
2 24 20
SQL>
Convert your TIMESTAMP values to DATEs, then subtract them, multiply by 24 and take the FLOOR (that is, round down).
When you subtract one DATE from another you get a NUMBER: the difference between them in days, including any fractional part.
Like this:
FLOOR(24*(CAST(due_dt AS DATE) - CAST(created_on AS DATE)))
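To check the arithmetic outside the database, the same calculation can be sketched in Python with the sample rows from the question. This is an illustration only: the helper name hours_between is made up, and unlike the analytic-sum answer above it floors each row's difference before adding it to the running total.

```python
from datetime import datetime
from itertools import groupby
from math import floor

# (id, created_on, act_taken_on) from the sample inserts.
rows = [
    (1, datetime(2022, 9, 19, 2, 53, 10, 16537), datetime(2022, 9, 20, 2, 53, 10, 16537)),
    (1, datetime(2022, 9, 20, 2, 53, 10, 16537), datetime(2022, 9, 21, 2, 53, 10, 16537)),
    (2, datetime(2022, 9, 21, 2, 53, 10, 16537), datetime(2022, 9, 22, 2, 53, 10, 16537)),
]


def hours_between(start: datetime, end: datetime) -> int:
    """Whole hours between two timestamps, rounded down."""
    days = (end - start).total_seconds() / 86400  # fractional days, like DATE - DATE
    return floor(24 * days)


# Running total per id, ordered by created_on (mirrors
# SUM(...) OVER (PARTITION BY id ORDER BY created_on)).
rows.sort(key=lambda r: (r[0], r[1]))
aging = []
for row_id, group in groupby(rows, key=lambda r: r[0]):
    total = 0
    for _, created_on, act_taken_on in group:
        total += hours_between(created_on, act_taken_on)
        aging.append((row_id, total))

for pair in aging:
    print(pair)
```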

How to get latest records based on two columns of max

I have a table called Inventory with the below columns
item warehouse date sequence number value
111 100 2019-09-25 12:29:41.000 1 10
111 100 2019-09-26 12:29:41.000 1 20
222 200 2019-09-21 16:07:10.000 1 5
222 200 2019-09-21 16:07:10.000 2 10
333 300 2020-01-19 12:05:23.000 1 4
333 300 2020-01-20 12:05:23.000 1 5
Expected Output:
item warehouse date sequence number value
111 100 2019-09-26 12:29:41.000 1 20
222 200 2019-09-21 16:07:10.000 2 10
333 300 2020-01-20 12:05:23.000 1 5
Based on item and warehouse, I need to pick the value with the latest date and the latest sequence number.
I tried with below code
select item,warehouse,sequencenumber,sum(value),max(date) as date1
from Inventory t1
where
t1.date IN (select max(date) from Inventory t2
where t1.warehouse=t2.warehouse
and t1.item = t2.item
group by t2.item,t2.warehouse)
group by t1.item,t1.warehouse,t1.sequencenumber
It's working for the latest date but not for the latest sequence number.
Can you please suggest how to write a query to get my expected output?
You can use row_number() for this (table and column names as in your query):
select *
from (
    select
        t.*,
        row_number() over(
            partition by item, warehouse
            order by date desc, sequencenumber desc, value desc
        ) rn
    from Inventory t
) t
where rn = 1
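A quick way to check the row_number() approach against the sample data is the sketch below. It assumes Python's sqlite3 module with SQLite 3.25 or later rather than the asker's database, and the date column is stored under the hypothetical name dt to stay clear of reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Inventory (item INTEGER, warehouse INTEGER, dt TEXT, "
    "sequencenumber INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO Inventory VALUES (?, ?, ?, ?, ?)",
    [
        (111, 100, "2019-09-25 12:29:41.000", 1, 10),
        (111, 100, "2019-09-26 12:29:41.000", 1, 20),
        (222, 200, "2019-09-21 16:07:10.000", 1, 5),
        (222, 200, "2019-09-21 16:07:10.000", 2, 10),
        (333, 300, "2020-01-19 12:05:23.000", 1, 4),
        (333, 300, "2020-01-20 12:05:23.000", 1, 5),
    ],
)

# Rank rows per (item, warehouse): latest date first, then highest sequence.
query = """
SELECT item, warehouse, dt, sequencenumber, value
FROM (
    SELECT i.*,
           ROW_NUMBER() OVER (
               PARTITION BY item, warehouse
               ORDER BY dt DESC, sequencenumber DESC
           ) AS rn
    FROM Inventory AS i
) AS ranked
WHERE rn = 1
ORDER BY item
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

The ISO-style timestamp strings sort correctly as text, which is why a plain TEXT column is enough for this demonstration.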

Computing rolling average and standard deviation by dates

I have the below table where I will need to compute the rolling average and standard deviation based on the dates. I have listed below the tables and expected results. I am trying to compute the rolling average for an id based on date. rollAvgA is computed based on metricA. For example, for the first occurrence of id for a particular date the result should return zero as it does not have any preceding values. Please let me know how this can be accomplished?
Current Table :
Date id metricA
8/1/2019 100 2
8/2/2019 100 3
8/3/2019 100 2
8/1/2019 101 2
8/2/2019 101 3
8/3/2019 101 2
8/4/2019 101 2
Expected Table :
Date id metricA rollAvgA
8/1/2019 100 2 0
8/2/2019 100 3 2.5
8/3/2019 100 2 2.3
8/1/2019 101 2 0
8/2/2019 101 3 2.5
8/3/2019 101 2 2.3
8/4/2019 101 2 2.25
You seem to want a cumulative average. This is basically:
select t.*,
avg(metricA * 1.0) over (partition by id order by date) as rollingavg
from t;
The only caveat is that the first value is an average of one value. To handle this, use a case expression:
select t.*,
(case when row_number() over (partition by id order by date) > 1
then avg(metricA * 1.0) over (partition by id order by date)
else 0
end) as rollingavg
from t;
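As a rough check of the case expression, the query can be run against the sample data with Python's sqlite3 module. This sketch assumes SQLite 3.25 or later; the date column is stored as ISO text under the made-up name dt, and the standard deviation from the title is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dt TEXT, id INTEGER, metricA INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2019-08-01", 100, 2), ("2019-08-02", 100, 3), ("2019-08-03", 100, 2),
        ("2019-08-01", 101, 2), ("2019-08-02", 101, 3), ("2019-08-03", 101, 2),
        ("2019-08-04", 101, 2),
    ],
)

# Cumulative average per id; the first row of each id is forced to 0.
query = """
SELECT dt, id, metricA,
       CASE
           WHEN ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt) > 1
           THEN AVG(metricA * 1.0) OVER (PARTITION BY id ORDER BY dt)
           ELSE 0
       END AS rollAvgA
FROM t
ORDER BY id, dt
"""
averages = conn.execute(query).fetchall()
for dt, row_id, metric, avg in averages:
    print(dt, row_id, metric, round(avg, 2))
```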

SQL sum and previous row [duplicate]

I have the following table:
________________________
date | amount
________________________
01-01-2019 | 10
01-01-2019 | 10
01-01-2019 | 10
01-01-2019 | 10
02-01-2019 | 5
02-01-2019 | 5
02-01-2019 | 5
02-01-2019 | 5
03-01-2019 | 20
03-01-2019 | 20
These are mutation values by date. I would like my query to return the cumulative amount by date. So for 02-01-2019 I need 40 (4 times 10) + 20 (4 times 5) = 60. For 03-01-2019 I would need 40 (4 times 10) + 20 (4 times 5) + 40 (2 times 20) = 100, and so on. Is this possible in one query? How do I achieve this?
My current query to get the individual mutations:
Select s.date,
Sum(s.amount) As Sum_amount
From dbo.Financieel As s
Group By s.date
You can try below -
select dateval,
SUM(amt) OVER(ORDER BY dateval ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as amt
from
(
SELECT
dateval,
SUM(amount) amt
FROM t2 group by dateval
)A
OUTPUT:
dateval amt
01/01/2019 00:00:00 40
01/02/2019 00:00:00 60
01/03/2019 00:00:00 100
Try this below script to get your desired output-
SELECT A.date,
(SELECT SUM(amount) FROM <your_table> WHERE Date <= A.Date) C_Total
FROM <your_table> A
GROUP BY date
ORDER BY date
Output is-
date C_Total
01-01-2019 40
02-01-2019 60
03-01-2019 100
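I suggest using a window function. Without DISTINCT (or a GROUP BY) the query returns one row per input row, so the running total would be repeated for every row of a date:
select distinct date, sum(amount) over (order by date) as running_total
from dbo.Financieel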
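The running total can be verified on the sample data with a small sketch like this one. It runs on Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server, and the table name mutations, the column name dt and the ISO date strings are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mutations (dt TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO mutations VALUES (?, ?)",
    [("2019-01-01", 10)] * 4 + [("2019-01-02", 5)] * 4 + [("2019-01-03", 20)] * 2,
)

# Step 1 (subquery): one total per date.
# Step 2 (window): accumulate those totals in date order.
query = """
SELECT dt, SUM(day_total) OVER (ORDER BY dt) AS running_total
FROM (
    SELECT dt, SUM(amount) AS day_total
    FROM mutations
    GROUP BY dt
) AS per_day
ORDER BY dt
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

Aggregating per date first keeps the output at one row per date, which is what the question asks for.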

SQL difference in counter between two dates

I have a table like the below:
ID, MachineID Customer TimeStamp Counter type
1 A ABC 2017-10-25 3:08PM 1952 1
2 A ABC 2017-10-25 3:00PM 1940 1
3 A ABC 2017-10-25 12:05PM 1920 1
4 A ABC 2017-10-25 9:00AM 1900 1
5 B BCD 2017-10-25 3:11PM 1452 1
6 B BCD 2017-10-25 3:10PM 1440 1
7 B BCD 2017-10-25 12:15PM 1420 1
8 B BCD 2017-10-25 9:30AM 1400 1
9 A ABC 2017-10-23 3:08PM 1900 1
10 A ABC 2017-10-23 3:00PM 1840 1
11 A ABC 2017-10-23 12:05PM 1820 1
12 A ABC 2017-10-23 9:00AM 1800 1
13 B BCD 2017-10-23 3:11PM 1399 1
14 B BCD 2017-10-23 3:10PM 1340 1
15 B BCD 2017-10-23 12:15PM 1320 1
16 B BCD 2017-10-23 9:30AM 1300 1
The counter value increases whenever there is a click. I am trying to calculate the number of clicks for each day by taking the maximum counter value at the end of the day and subtracting the previous day's maximum counter value, and so on.
How do I do this in SQL Server? I have to repeat this for each customer and machine.
Try this. I am using the LAG function to achieve this. You can add a WHERE clause to filter for the specific dates you want:
Create table #counter(ID int, timeStamp datetime, Counter int, type int)
insert into #counter values
(1, '20171024 3:08PM' ,1952, 1),
(1, '20171025 3:00PM' ,1964, 1)
Select iq.*, (iq."counter" - iq.yesterday_counter) as today_count
from
(select id,
cast("timestamp" as date) as today_date,
"counter",
LAG("counter") over (order by cast("timestamp" as date)) yesterday_counter
from #counter
) iq
output:
id today_date counter yesterday_counter today_count
----------- ---------- ----------- ----------------- -----------
1 2017-10-24 1952 NULL NULL
1 2017-10-25 1964 1952 12
A SQL query to get the max counter for each day is:
SELECT CAST(timeStamp as date) AS [dateval]
,MAX(Counter) AS [maxCounter]
FROM YOURDATASET
GROUP BY CAST(timeStamp as date)
This is converting the datetime to date- cutting out the time, then taking the max(Counter).
One method to get the difference is to save the result in a temp datastructure, then query it to get the difference.
The question is whether your previous date is exactly the previous day, or if you're skipping days between counts, or taking the weekend off, etc. In that case you have to select the greatest previous date to the date being examined.
ex.
-- Table variable holding the max counter per day
DECLARE @temp TABLE (dateval date, maxCounter int)
INSERT INTO @temp(dateval, maxCounter)
SELECT CAST(timeStamp as date) AS [dateval]
,MAX(Counter)
FROM YOURDATASET
GROUP BY CAST(timeStamp as date)
-- Subtract the max counter of the greatest earlier date (NULL for the first date)
SELECT T.dateval
,T.maxCounter
-
(SELECT T2.maxCounter
FROM @temp T2
WHERE T2.dateVal = (SELECT MAX(T3.dateVal)
FROM @temp T3
WHERE T3.dateVal < T.dateVal
)
) AS [Difference]
FROM @temp T
ORDER BY T.dateval
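Since the question asks for the result per customer and machine, here is a hedged sketch that combines the daily MAX with LAG partitioned by machine. It runs on Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server; the table name readings and the column names machine_id, ts and counter are made up for the example, and the times are rewritten in 24-hour ISO form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER, machine_id TEXT, customer TEXT, "
    "ts TEXT, counter INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "ABC", "2017-10-25 15:08", 1952),
        (2, "A", "ABC", "2017-10-25 15:00", 1940),
        (3, "A", "ABC", "2017-10-25 12:05", 1920),
        (4, "A", "ABC", "2017-10-25 09:00", 1900),
        (5, "B", "BCD", "2017-10-25 15:11", 1452),
        (6, "B", "BCD", "2017-10-25 15:10", 1440),
        (7, "B", "BCD", "2017-10-25 12:15", 1420),
        (8, "B", "BCD", "2017-10-25 09:30", 1400),
        (9, "A", "ABC", "2017-10-23 15:08", 1900),
        (10, "A", "ABC", "2017-10-23 15:00", 1840),
        (11, "A", "ABC", "2017-10-23 12:05", 1820),
        (12, "A", "ABC", "2017-10-23 09:00", 1800),
        (13, "B", "BCD", "2017-10-23 15:11", 1399),
        (14, "B", "BCD", "2017-10-23 15:10", 1340),
        (15, "B", "BCD", "2017-10-23 12:15", 1320),
        (16, "B", "BCD", "2017-10-23 09:30", 1300),
    ],
)

# daily: max counter per machine, customer and day.
# LAG then fetches the previous recorded day's max for the same machine,
# whether or not that day is exactly one calendar day earlier.
query = """
WITH daily AS (
    SELECT machine_id, customer, DATE(ts) AS day, MAX(counter) AS max_counter
    FROM readings
    GROUP BY machine_id, customer, DATE(ts)
)
SELECT machine_id, customer, day, max_counter,
       max_counter - LAG(max_counter) OVER (
           PARTITION BY machine_id, customer ORDER BY day
       ) AS clicks
FROM daily
ORDER BY machine_id, day
"""
clicks = conn.execute(query).fetchall()
for row in clicks:
    print(row)
```

The first day of each machine has no previous value, so its clicks column comes back as NULL (None in Python).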