I have the following table showing when customers bought a certain product. The data I have is CustomerID, Amount, Dat. I am trying to create the column ProductsIn30Days, which represents how many products a customer bought in the range Dat-30 days inclusive the current day.
For example, ProductsIn30Days for CustomerID 1 on Dat 25.3.2020 is 7, since the customer bought 2 products on 25.3.2020 and 5 more products on 24.3.2020, which falls within 30 days before 25.3.2020.
CustomerID
Amount
Dat
ProductsIn30Days
1
1
23.3.2018
1
1
2
24.3.2020
2
1
3
24.3.2020
5
1
2
25.3.2020
7
1
2
24.5.2020
2
1
1
15.6.2020
3
2
7
24.3.2017
7
2
2
24.3.2020
2
I tried something like this with no success, since the partition only works on a single date rather than on a range like I would need:
select CustomerID, Amount, Dat,
sum(Amount) over (partition by CustomerID, Dat-30)
from table
Thank you for help.
You can use an analytic SUM function with a range window:
SELECT t.*,
SUM(Amount) OVER (
PARTITION BY CustomerID
ORDER BY Dat
RANGE BETWEEN INTERVAL '30' DAY PRECEDING AND CURRENT ROW
) AS ProductsIn30Days
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (CustomerID, Amount, Dat) AS
SELECT 1, 1, DATE '2018-03-23' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-03-24' FROM DUAL UNION ALL
SELECT 1, 3, DATE '2020-03-24' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-03-25' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-05-24' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-15' FROM DUAL UNION ALL
SELECT 2, 7, DATE '2017-03-24' FROM DUAL UNION ALL
SELECT 2, 2, DATE '2020-03-24' FROM DUAL;
Outputs:
CUSTOMERID
AMOUNT
DAT
PRODUCTSIN30DAYS
1
1
2018-03-23 00:00:00
1
1
2
2020-03-24 00:00:00
5
1
3
2020-03-24 00:00:00
5
1
2
2020-03-25 00:00:00
7
1
2
2020-05-24 00:00:00
2
1
1
2020-06-15 00:00:00
3
2
7
2017-03-24 00:00:00
7
2
2
2020-03-24 00:00:00
2
Note: If you have values on the same date then they will be tied in the order and always aggregated together (i.e. rows 2 & 3). If you want them to be aggregated separately then you need to order by something else to break the ties but that would not work with a RANGE window.
db<>fiddle here
Related
This question already has answers here:
Second highest grade for each student
(3 answers)
Closed 3 months ago.
My data looks like below:
table
REF_NUM ID DATE
SIM1 1 12-Oct-22
SIM1 2 10-Oct-22
SIM2 3 15-Oct-22
SIM2 4 14-Oct-22
SIM3 5 08-Oct-22
SIM3 6 02-Oct-22
SIM4 7 08-Oct-22
SIM4 8 10-Oct-22
Output should be as below:
Output:
REF_NUM ID DATE
SIM1 2 10-Oct-22
SIM2 4 14-Oct-22
SIM3 6 02-Oct-22
SIM4 7 08-Oct-22
basically I need data with distinct ref_num , respective ID and with SECOND HIGHEST DATE. Here I have just given two dates in main table, But each ref_num can have more than two dates.
I can sure that whatever I have tried is wrong
Rank rows per each ref_num by the date datatype value in descending order; then fetch these that rank as the second highest.
Sample data:
SQL> with test (ref_num, id, datum) as
2 (select 'sim1', 1, date '2022-10-12' from dual union all
3 select 'sim1', 2, date '2022-10-10' from dual union all
4 select 'sim2', 3, date '2022-10-15' from dual union all
5 select 'sim2', 4, date '2022-10-14' from dual union all
6 select 'sim3', 5, date '2022-10-08' from dual union all
7 select 'sim3', 6, date '2022-10-02' from dual union all
8 select 'sim4', 7, date '2022-10-08' from dual union all
9 select 'sim4', 8, date '2022-10-10' from dual
10 ),
Query begins here:
11 temp as
12 (select ref_num, id, datum,
13 rank() over (partition by ref_num order by datum desc) rnk
14 from test
15 )
16 select ref_num, id, datum
17 from temp
18 where rnk = 2
19 order by ref_num;
REF_ ID DATUM
---- ---------- ----------
sim1 2 10.10.2022
sim2 4 14.10.2022
sim3 6 02.10.2022
sim4 7 08.10.2022
SQL>
I have this table:
Site_ID
Volume
RPT_Date
RPT_Hour
1
10
01/01/2021
1
1
7
01/01/2021
2
1
13
01/01/2021
3
1
11
01/16/2021
1
1
3
01/16/2021
2
1
5
01/16/2021
3
2
9
01/01/2021
1
2
24
01/01/2021
2
2
16
01/01/2021
3
2
18
01/16/2021
1
2
7
01/16/2021
2
2
1
01/16/2021
3
I need to select the RPT_Hour with the highest Volume for each set of dates
Needed Output:
Site_ID
Volume
RPT_Date
RPT_Hour
1
13
01/01/2021
1
1
11
01/16/2021
1
2
24
01/01/2021
2
2
18
01/16/2021
1
SELECT site_id, volume, rpt_date, rpt_hour
FROM (SELECT t.*,
ROW_NUMBER()
OVER (PARTITION BY site_id, rpt_date ORDER BY volume DESC) AS rn
FROM MyTable) t
WHERE rn = 1;
I cannot figure out how to group the table into like date groups. If I could do that, I think the rn = 1 will return the highest volume row for each date.
The way I see it, your query is OK (but rpt_hour in desired output is not).
SQL> with test (site_id, volume, rpt_date, rpt_hour) as
2 (select 1, 10, date '2021-01-01', 1 from dual union all
3 select 1, 7, date '2021-01-01', 2 from dual union all
4 select 1, 13, date '2021-01-01', 3 from dual union all
5 select 1, 11, date '2021-01-16', 1 from dual union all
6 select 1, 3, date '2021-01-16', 2 from dual union all
7 select 1, 5, date '2021-01-16', 3 from dual union all
8 --
9 select 2, 9, date '2021-01-01', 1 from dual union all
10 select 2, 24, date '2021-01-01', 3 from dual union all
11 select 2, 16, date '2021-01-01', 3 from dual union all
12 select 2, 18, date '2021-01-16', 1 from dual union all
13 select 2, 7, date '2021-01-16', 2 from dual union all
14 select 2, 1, date '2021-01-16', 3 from dual
15 ),
16 temp as
17 (select t.*,
18 row_number() over (partition by site_id, rpt_date order by volume desc) rn
19 from test t
20 )
21 select site_id, volume, rpt_date, rpt_hour
22 from temp
23 where rn = 1
24 /
SITE_ID VOLUME RPT_DATE RPT_HOUR
---------- ---------- ---------- ----------
1 13 01/01/2021 3
1 11 01/16/2021 1
2 24 01/01/2021 3
2 18 01/16/2021 1
SQL>
One option would be using MAX(..) KEEP (DENSE_RANK ..) OVER (PARTITION BY ..) analytic function without need of any subquery such as :
SELECT DISTINCT
site_id,
MAX(volume) KEEP (DENSE_RANK FIRST ORDER BY volume DESC) OVER
(PARTITION BY site_id, rpt_date) AS volume,
rpt_date,
MAX(rpt_hour) KEEP (DENSE_RANK FIRST ORDER BY volume DESC) OVER
(PARTITION BY site_id, rpt_date) AS rpt_hour
FROM t
GROUP BY site_id, rpt_date, volume, rpt_hour
ORDER BY site_id, rpt_date
Demo
There is a source table which loads the data full and monthly. The table looks like below example.
Source table:
pk
code_paym
code_terms
etl_id
1
2
3
2020-08-01
1
2
3
2020-09-01
1
2
4
2020-10-01
1
2
4
2020-11-01
1
2
4
2020-12-01
1
2
4
2021-01-01
1
2
3
2021-02-01
1
2
3
2021-03-01
1
2
3
2021-04-01
1
2
3
2021-05-01
I would like to create valid_from valid_to columns from the source table like below example.
Desired Output:
pk
code_paym
code_terms
valid_from
valid_to
1
2
3
2020-08-01
2020-09-01
1
2
4
2020-10-01
2021-01-01
1
2
3
2021-02-01
2021-05-01
As it can be seen attributes can go back to the same values by the time.
How can I make this output happen by sql code?
Thank you very much,
Regards
Using CONDITIONAL_TRUE_EVENT windowed function to determine continuous subgroups:
CREATE OR REPLACE TABLE t( pk INT, code_paym INT, code_terms INT, etl_id DATE)
AS
SELECT 1, 2, 3, '2020-08-01'
UNION ALL SELECT 1, 2, 3, '2020-09-01'
UNION ALL SELECT 1, 2, 4, '2020-10-01'
UNION ALL SELECT 1, 2, 4, '2020-11-01'
UNION ALL SELECT 1, 2, 4, '2020-12-01'
UNION ALL SELECT 1, 2, 4, '2021-01-01'
UNION ALL SELECT 1, 2, 3, '2021-02-01'
UNION ALL SELECT 1, 2, 3, '2021-03-01'
UNION ALL SELECT 1, 2, 3, '2021-04-01'
UNION ALL SELECT 1, 2, 3, '2021-05-01';
Query:
WITH cte AS (
SELECT t.*,
CONDITIONAL_TRUE_EVENT(CODE_TERMS != LAG(CODE_TERMS,1,CODE_TERMS)
OVER(PARTITION BY PK, CODE_PAYM ORDER BY ETL_ID))
OVER(PARTITION BY PK, CODE_PAYM ORDER BY ETL_ID) AS grp
FROM t
)
SELECT PK, CODE_PAYM, grp, MIN(ETL_ID) AS valid_from, MAX(ETL_ID) AS valid_to
FROM cte
GROUP BY PK, CODE_PAYM, grp;
Output:
I have the tables below and I need my query to bring me the amount of operations grouped by date.
For the dates on which there will be no operations, I need to return the date anyway with the zero count.
Kind like that:
OPERATION_DATE | COUNT_OPERATION | COUNT_OPERATION2 |
04/06/2019 | 453 | 81 |
05/06/2019 | 0 | 0 |
-- QUERY I TRIED
SELECT
T1.DATE_OPERATION AS DATE_OPERATION,
NVL(T1.COUNT_OPERATION, '0') COUNT_OPERATION,
NVL(T1.COUNT_OPERATION2, '0') COUNT_OPERATIONX,
FROM
(
SELECT
trunc(t.DATE_OPERATION) as DATE_OPERATION,
count(t.ID_OPERATION) AS COUNT_OPERATION,
COUNT(CASE WHEN O.OPERATION_TYPE = 'X' THEN 1 END) COUNT_OPERATIONX,
from OPERATION o
left join OPERATION_TYPE ot on ot.id_operation = o.id_operation
where ot.OPERATION_TYPE in ('X', 'W', 'Z', 'I', 'J', 'V')
and TRUNC(t.DATE_OPERATION) >= to_date('01/06/2019', 'DD-MM-YYYY')
group by trunc(t.DATE_OPERATION)
) T1
-- TABLES
CREATE TABLE OPERATION
( ID_OPERATION NUMBER NOT NULL,
DATE_OPERATION DATE NOT NULL,
VALUE NUMBER NOT NULL )
CREATE TABLE OPERATION_TYPE
( ID_OPERATION NUMBER NOT NULL,
OPERATION_TYPE VARCHAR2(1) NOT NULL,
VALUE NUMBER NOT NULL)
I guess that it is a calendar you need, i.e. a table which contains all dates involved. Otherwise, how can you display something that doesn't exist?
This is what you currently have (I'm using only the operation table; add another one yourself):
SQL> with
2 operation (id_operation, date_operation, value) as
3 (select 1, date '2019-06-01', 100 from dual union all
4 select 2, date '2019-06-01', 200 from dual union all
5 -- 02/06/2019 is missing
6 select 3, date '2019-06-03', 300 from dual union all
7 select 4, date '2019-06-04', 400 from dual
8 )
9 select o.date_operation,
10 count(o.id_operation)
11 from operation o
12 group by o.date_operation
13 order by o.date_operation;
DATE_OPERA COUNT(O.ID_OPERATION)
---------- ---------------------
01/06/2019 2
03/06/2019 1
04/06/2019 1
SQL>
As there are no rows that belong to 02/06/2019, query can't return anything (you already know that).
Therefore, add a calendar. If you already have that table, fine - use it. If not, create one. It is a hierarchical query which adds level to a certain date. I'm using 01/06/2019 as the starting point, creating 5 days (note the connect by clause).
SQL> with
2 operation (id_operation, date_operation, value) as
3 (select 1, date '2019-06-01', 100 from dual union all
4 select 2, date '2019-06-01', 200 from dual union all
5 -- 02/06/2019 is missing
6 select 3, date '2019-06-03', 300 from dual union all
7 select 4, date '2019-06-04', 400 from dual
8 ),
9 dates (datum) as --> this is a calendar
10 (select date '2019-06-01' + level - 1
11 from dual
12 connect by level <= 5
13 )
14 select d.datum,
15 count(o.id_operation)
16 from operation o full outer join dates d on d.datum = o.date_operation
17 group by d.datum
18 order by d.datum;
DATUM COUNT(O.ID_OPERATION)
---------- ---------------------
01/06/2019 2
02/06/2019 0 --> missing in source table
03/06/2019 1
04/06/2019 1
05/06/2019 0 --> missing in source table
SQL>
Probably a better option is to dynamically create a calendar so that it doesn't depend on any hardcoded values, but uses the min(date_operation) to max(date_operation) time span. Here we go:
SQL> with
2 operation (id_operation, date_operation, value) as
3 (select 1, date '2019-06-01', 100 from dual union all
4 select 2, date '2019-06-01', 200 from dual union all
5 -- 02/06/2019 is missing
6 select 3, date '2019-06-03', 300 from dual union all
7 select 4, date '2019-06-04', 400 from dual
8 ),
9 dates (datum) as --> this is a calendar
10 (select x.min_datum + level - 1
11 from (select min(o.date_operation) min_datum,
12 max(o.date_operation) max_datum
13 from operation o
14 ) x
15 connect by level <= x.max_datum - x.min_datum + 1
16 )
17 select d.datum,
18 count(o.id_operation)
19 from operation o full outer join dates d on d.datum = o.date_operation
20 group by d.datum
21 order by d.datum;
DATUM COUNT(O.ID_OPERATION)
---------- ---------------------
01/06/2019 2
02/06/2019 0 --> missing in source table
03/06/2019 1
04/06/2019 1
SQL>
I am attempting to write Oracle SQL.
I am looking for solution something similar. Please find below data I have
start_date end_date customer
01-01-2012 31-06-2012 a
01-01-2012 31-01-2012 b
01-02-2012 31-03-2012 c
I want the count of customer in that date period. My result should look like below
Month : Customer Count
JAN-12 : 2
FEB-12 : 2
MAR-12 : 2
APR-12 : 1
MAY-12 : 1
JUN-12 : 1
One option would be to generate the months separately in another query and join that to your data table (note that I'm assuming that you intended customer A to have an end-date of June 30, 2012 since there is no June 31).
SQL> ed
Wrote file afiedt.buf
1 with mnths as(
2 select add_months( date '2012-01-01', level - 1 ) mnth
3 from dual
4 connect by level <= 6 ),
5 data as (
6 select date '2012-01-01' start_date, date '2012-06-30' end_date, 'a' customer from dual union all
7 select date '2012-01-01', date '2012-01-31', 'b' from dual union all
8 select date '2012-02-01', date '2012-03-31', 'c' from dual
9 )
10 select mnths.mnth, count(*)
11 from data,
12 mnths
13 where mnths.mnth between data.start_date and data.end_date
14 group by mnths.mnth
15* order by mnths.mnth
SQL> /
MNTH COUNT(*)
--------- ----------
01-JAN-12 2
01-FEB-12 2
01-MAR-12 2
01-APR-12 1
01-MAY-12 1
01-JUN-12 1
6 rows selected.
WITH TMP(monthyear,start_date,end_date,customer) AS (
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from data
union all
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from TMP
where LAST_DAY(end_date) >= LAST_DAY(start_date)
)
SELECT TO_CHAR(MonthYear, 'MON-YY') TheMonth,
Count(Customer) Customers
FROM TMP
GROUP BY MonthYear
ORDER BY MonthYear;
SQLFiddle