I have a table defined as follows:
create table test (
m_year varchar2(10),
val number
);
The data looks like this:
m_year val
Jan 10 876
Sep 10 709
Jan 11 46
Apr 11 99
Jan 12 878
I want to get output like this:
01-Jan-10 876
01-Feb-10 876
'
'
'
01-Sep-10 709
'
'
'
sysdate 878
My query looks like this:
select to_char(add_months(myear,level-1)) months,val
from (
select val, to_date(myear,'mm-yyyy') myear,
lead(to_date(myear,'mm-yyyy'),1,sysdate) over (order by to_date(myear,'mm-yyyy')) as nxt
from test
)
connect by level <= months_between(nxt,myear)+1;
A few months are missing from the output, and the query seems to run in an infinite loop.
Here's one option:
SQL> create table test
  (m_year varchar2(10),
   val    number);
Table created.
SQL> insert into test
  select 'Jan 10', 876 from dual union
  select 'Sep 10', 709 from dual union
  select 'Jan 11', 46 from dual union
  select 'Apr 11', 99 from dual union
  select 'Jan 12', 878 from dual;
5 rows created.
Query:
SQL> with
  dates as
    (select to_date(m_year, 'mon rr', 'nls_date_language = english') m_year,
            val
     from test
    ),
  whole_period as
    (select add_months(min_year, level - 1) m_year
     from (select min(m_year) min_year,
                  trunc(sysdate, 'mm') max_year
           from dates
          )
     connect by level <= months_between(max_year, min_year) + 1
    )
  select
    w.m_year,
    last_value(d.val) ignore nulls over
      (order by w.m_year rows between unbounded preceding and current row) val
  from whole_period w left join dates d on d.m_year = w.m_year
  order by w.m_year;
M_YEAR VAL
---------- ----------
01.01.2010 876
01.02.2010 876
01.03.2010 876
01.04.2010 876
01.05.2010 876
01.06.2010 876
01.07.2010 876
01.08.2010 876
01.09.2010 709
01.10.2010 709
01.11.2010 709
01.12.2010 709
01.01.2011 46
01.02.2011 46
01.03.2011 46
01.04.2011 99
01.05.2011 99
01.06.2011 99
01.07.2011 99
01.08.2011 99
01.09.2011 99
01.10.2011 99
01.11.2011 99
01.12.2011 99
01.01.2012 878
01.02.2012 878
01.03.2012 878
<snip>
01.03.2018 878
01.04.2018 878
01.05.2018 878
101 rows selected.
SQL>
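If you would rather repair the original query than build a calendar, a sketch along these lines might work. It assumes the TEST table above, that no two rows share the same m_year, and the same 'mon rr' format mask as the query above; the alias month_start is mine. The original loops because CONNECT BY without a PRIOR condition links every row to every other row at each level; the two extra conditions keep each hierarchy inside its own source row. I have not run it against your real data.

```sql
-- Sketch only: assumes TEST(m_year VARCHAR2 like 'Jan 10', val NUMBER)
-- and that m_year values are unique.
select add_months(m_year, level - 1) as month_start,
       val
from (
  select val,
         to_date(m_year, 'mon rr', 'nls_date_language = english') as m_year,
         -- next row's month; for the last row, the first day of next month,
         -- so that the current month is still generated
         lead(to_date(m_year, 'mon rr', 'nls_date_language = english'), 1,
              add_months(trunc(sysdate, 'mm'), 1))
           over (order by to_date(m_year, 'mon rr', 'nls_date_language = english')) as nxt
  from test
)
connect by level <= months_between(nxt, m_year)  -- stop one month before the next row
       and prior m_year = m_year                 -- stay within the same source row
       and prior sys_guid() is not null          -- avoids the CONNECT BY loop error
order by month_start;
```

For the 'Jan 10' row, months_between returns 8, so levels 1 to 8 produce January through August 2010, and the 'Sep 10' row takes over from there.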
The lazy way of doing the same: easy to read, easy to maintain. The DISTINCT may slow the query down.
And next time, please add your test data and table structures to your post. You will get answers faster.
SELECT DISTINCT add_months(start_date, LEVEL-1) final_date
, start_date
, end_month
FROM
(
SELECT to_date(myear) start_date
, LAG(to_date(myear)) OVER (ORDER BY myear DESC) end_date
, MIN(EXTRACT(MONTH FROM myear)) OVER (ORDER BY myear) start_month
, MAX(EXTRACT(MONTH FROM myear)) OVER (ORDER BY myear DESC) end_month
, ROW_NUMBER() OVER (PARTITION BY EXTRACT(YEAR FROM myear)
ORDER BY EXTRACT(YEAR FROM myear)) rno
FROM
(
SELECT to_date('01-2010', 'mm-yyyy') myear FROM dual
UNION ALL
SELECT to_date('09-2010', 'mm-yyyy') FROM dual
UNION ALL
SELECT to_date('01-2011', 'mm-yyyy') FROM dual
UNION ALL
SELECT to_date('04-2011', 'mm-yyyy') FROM dual
)
)
WHERE rno = 1
CONNECT BY LEVEL <= end_month
ORDER BY start_date
/
FINAL DATE START DATE END MONTH
----------------------------------
01-JAN-10 01-JAN-10 9
01-FEB-10 01-JAN-10 9
01-MAR-10 01-JAN-10 9
01-APR-10 01-JAN-10 9
01-MAY-10 01-JAN-10 9
01-JUN-10 01-JAN-10 9
01-JUL-10 01-JAN-10 9
01-AUG-10 01-JAN-10 9
01-SEP-10 01-JAN-10 9
01-JAN-11 01-JAN-11 4
01-FEB-11 01-JAN-11 4
01-MAR-11 01-JAN-11 4
01-APR-11 01-JAN-11 4
I have a dataset within a date range which has three columns, Product_type, date and metric. For a given product_type, data is not available for all days. For the missing rows, we would like to do a forward date fill for next n days using the last value of the metric.
Product_type  date        metric
A             2019-10-01  10
A             2019-10-02  12
A             2019-10-03  15
A             2019-10-04  5
A             2019-10-05  5
A             2019-10-06  5
A             2019-10-16  12
A             2019-10-17  23
A             2019-10-18  34
Here, the rows from 2019-10-04 to 2019-10-06 show the forward fill: the value from 2019-10-04 is carried forward. There might be bigger gaps in the dates, but we only want to fill the first n days.
Here, n=2, so rows 5 and 6 have been forward filled.
I am not sure how to implement this logic in SQL.
Here's one option. Read comments within code.
Sample data:
SQL> WITH
  test (product_type, datum, metric)
  AS
    (SELECT 'A', DATE '2019-10-01', 10 FROM DUAL
     UNION ALL
     SELECT 'A', DATE '2019-10-02', 12 FROM DUAL
     UNION ALL
     SELECT 'A', DATE '2019-10-03', 15 FROM DUAL
     UNION ALL
     SELECT 'A', DATE '2019-10-04', 5 FROM DUAL
     UNION ALL
     SELECT 'A', DATE '2019-10-16', 12 FROM DUAL
     UNION ALL
     SELECT 'A', DATE '2019-10-18', 23 FROM DUAL),
Query begins here:
  temp
  AS
    -- CB_FWD_FILL = 1 if difference between two consecutive dates is larger than 1 day
    -- (i.e. that's the gap to be forward filled)
    (SELECT product_type,
            datum,
            metric,
            LEAD (datum) OVER (PARTITION BY product_type ORDER BY datum)
               next_datum,
            CASE
               WHEN LEAD (datum)
                       OVER (PARTITION BY product_type ORDER BY datum)
                    - datum > 1
               THEN 1
               ELSE 0
            END
               cb_fwd_fill
     FROM test)
  -- original data from the table
  SELECT product_type, datum, metric FROM test
  UNION ALL
  -- DATUM is the last date which is OK; add LEVEL pseudocolumn to it to fill the gap
  -- with PAR_N number of rows
  SELECT product_type, datum + LEVEL, metric
  FROM (SELECT product_type, datum, metric
        FROM (-- RN = 1 means that that's the first gap in data set - that's the one
              -- that has to be forward filled
              SELECT product_type,
                     datum,
                     metric,
                     ROW_NUMBER ()
                        OVER (PARTITION BY product_type ORDER BY datum) rn
              FROM temp
              WHERE cb_fwd_fill = 1)
        WHERE rn = 1)
  CONNECT BY LEVEL <= &par_n
  ORDER BY datum;
Result:
Enter value for par_n: 2
PRODUCT_TYPE DATUM METRIC
--------------- ---------- ----------
A 2019-10-01 10
A 2019-10-02 12
A 2019-10-03 15
A 2019-10-04 5
A 2019-10-05 5 --> newly added
A 2019-10-06 5 --> rows
A 2019-10-16 12
A 2019-10-18 23
8 rows selected.
SQL>
Another solution:
WITH test (product_type, datum, metric) AS
(
SELECT 'A', DATE '2019-10-01', 10 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-02', 12 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-03', 15 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-04', 5 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-16', 12 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-18', 23 FROM DUAL
),
minmax(mindatum, maxdatum) AS (
SELECT MIN(datum), max(datum) from test
),
alldates (datum, product_type) AS
(
-- one row per calendar day from the first to the last date, inclusive
-- (as written, this row generator assumes a single product_type)
SELECT mindatum + level - 1, t.product_type FROM minmax,
(select distinct product_type from test) t
connect by mindatum + level - 1 <= maxdatum
),
grouped as (
select a.datum, a.product_type, t.metric,
count(t.product_type) over(partition by a.product_type order by a.datum) as grp
from alldates a
left join test t on t.datum = a.datum and t.product_type = a.product_type
),
final_table as (
select g.datum, g.product_type, g.grp, g.rn,
last_value(g.metric ignore nulls) over(partition by g.product_type order by g.datum) as metric
from (
select g.*, row_number() over(partition by product_type, grp order by datum) - 1 as rn
from grouped g
) g
)
select datum, product_type, metric
from final_table
where rn <= &par_n
order by datum
;
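If the requirement is to fill the first n days of every gap rather than only the first one, a variation along these lines might be worth trying. This is a sketch, not a tested solution: the names gaps, n_rows, missing_days and k are mine, n is hard-coded as 2, and it reuses the sample rows from the answers above.

```sql
with test (product_type, datum, metric) as (
  select 'A', date '2019-10-01', 10 from dual union all
  select 'A', date '2019-10-02', 12 from dual union all
  select 'A', date '2019-10-03', 15 from dual union all
  select 'A', date '2019-10-04',  5 from dual union all
  select 'A', date '2019-10-16', 12 from dual union all
  select 'A', date '2019-10-18', 23 from dual
),
gaps as (
  -- number of missing days between this row and the next one (NULL for the last row)
  select product_type, datum, metric,
         lead(datum) over (partition by product_type order by datum) - datum - 1
           as missing_days
  from test
),
n_rows as (
  -- 1 .. n; change the 2 to fill more or fewer days
  select level as k from dual connect by level <= 2
)
-- original rows
select product_type, datum, metric from test
union all
-- filled rows: at most n per gap, and never more than the gap itself
select g.product_type, g.datum + r.k, g.metric
from gaps g
join n_rows r on r.k <= g.missing_days
order by product_type, datum;
```

With this sample data it also adds 2019-10-17 (carrying the value 12), because the one-day gap before 2019-10-18 counts as a gap too. Since the filled rows are capped by missing_days, a short gap never produces a duplicate date.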
I'm working with Oracle and I have a table with a column of type TIMESTAMP. I was wondering how I can extract the records from the last 4 weeks of activity on the database, partitioned by week.
Following rows are inserted on week 1
kc 2 04-10-2021
vc 3 06-10-2021
vk 4 07-10-2021
Following rows are inserted on week2
cv 1 12-10-2021
ck 5 14-10-2021
Following rows are inserted on week3
vv 7 19-10-2021
Following rows are inserted on week4
vx 7 29-10-2021
Table now has
SQL>select * from tab;
NAME VALUE TIMESTAMP
-------------------- ----------
kc 2 04-10-2021
vc 3 06-10-2021
vk 4 07-10-2021
cv 1 12-10-2021
ck 5 14-10-2021
vv 7 19-10-2021
vx 7 29-10-2021
I would like a query which would give me the number of rows added each week, in the last 4 weeks.
This is what I would like to see
numofrows week
--------- -----
3 1
2 2
1 3
1 4
One option is to use the TO_CHAR function with the IW (ISO week number) format element:
SQL> with test (name, datum) as
2 (select 'kc', date '2021-10-04' from dual union all
3 select 'vc', date '2021-10-06' from dual union all
4 select 'vk', date '2021-10-07' from dual union all
5 select 'cv', date '2021-10-12' from dual union all
6 select 'ck', date '2021-10-14' from dual union all
7 select 'vv', date '2021-10-19' from dual union all
8 select 'vx', DATE '2021-10-29' from dual
9 )
10 select to_char(datum, 'iw') week,
11 count(*)
12 from test
13 where datum >= add_months(sysdate, -1) --> the last month
14 group by to_char(datum, 'iw');
WE COUNT(*)
-- ----------
42 1
43 1
40 3
41 2
SQL>
Line #13: I intentionally used "one month" instead of "4 weeks" because I thought (maybe wrongly) that this is what you actually want (you know, "a month has 4 weeks": not exactly, but close, and sometimes not close enough).
If you want 4 weeks, what does that mean? Sysdate minus 28 days (as every week has 7 days)? Then you'd change line #13 to
where datum >= trunc(sysdate - 4*7)
Or, maybe it is really the last 4 weeks:
SQL> with test (name, datum) as
2 (select 'kc', date '2021-10-04' from dual union all
3 select 'vc', date '2021-10-06' from dual union all
4 select 'vk', date '2021-10-07' from dual union all
5 select 'cv', date '2021-10-12' from dual union all
6 select 'ck', date '2021-10-14' from dual union all
7 select 'vv', date '2021-10-19' from dual union all
8 select 'vx', DATE '2021-10-29' from dual
9 ),
10 temp as
11 (select to_char(datum, 'iw') week,
12 count(*) cnt,
13 row_number() over (order by to_char(datum, 'iw') desc) rn
14 from test
15 group by to_char(datum, 'iw')
16 )
17 select week, cnt
18 from temp
19 where rn <= 4
20 order by week;
WE CNT
-- ----------
40 3
41 2
42 1
43 1
SQL>
Now you have several options, see which one fits the best (if any).
I "simulated" missing data (see TEST CTE), created a calendar (calend) and ... did the job. Read comments within code:
SQL> with test (name, datum) as
2 -- sample data
3 (select 'vv', date '2021-10-19' from dual union all
4 select 'vx', DATE '2021-10-29' from dual
5 ),
6 calend as
7 -- the last 31 days; 4 weeks are included, obviously
8 (select max_datum - level + 1 datum
9 from (select max(a.datum) max_datum from test a)
10 connect by level <= 31
11 ),
12 joined as
13 -- joined TEST and CALEND data
14 (select to_char(c.datum, 'iw') week,
15 t.name
16 from calend c left join test t on t.datum = c.datum
17 ),
18 last4 as
19 -- last 4 weeks
20 (select week, count(name) cnt,
21 row_number() over (order by week desc) rn
22 from joined
23 group by week
24 )
25 select week, cnt
26 from last4
27 where rn <= 4
28 order by week;
WE CNT
-- ----------
40 0
41 0
42 1
43 1
SQL>
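One thing to watch with the week-number approach: TO_CHAR(..., 'iw') restarts at 01 every year, so ordering by it in descending order can pick the wrong weeks around New Year. A possible alternative, sketched below, is to group by the Monday of each ISO week with TRUNC(datum, 'iw'). The alias week_start and the choice to anchor on the latest date in the data (rather than SYSDATE) are my assumptions, made so that the 2021 sample rows return something.

```sql
with test (name, datum) as
  (select 'kc', date '2021-10-04' from dual union all
   select 'vc', date '2021-10-06' from dual union all
   select 'vk', date '2021-10-07' from dual union all
   select 'cv', date '2021-10-12' from dual union all
   select 'ck', date '2021-10-14' from dual union all
   select 'vv', date '2021-10-19' from dual union all
   select 'vx', date '2021-10-29' from dual
  )
select trunc(datum, 'iw') as week_start,   -- Monday of the ISO week
       count(*)           as cnt
from test
-- latest week present in the data plus the three weeks before it
where datum >= (select trunc(max(datum), 'iw') - 21 from test)
group by trunc(datum, 'iw')
order by week_start;
```

For the sample rows this should give four weeks starting 2021-10-04, 2021-10-11, 2021-10-18 and 2021-10-25, with counts 3, 2, 1 and 1. Replace the subquery with trunc(sysdate, 'iw') - 21 to anchor on the current week instead.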
I'm asking for your help.
Here is my data set:
ID date_answered
---------- --------------
1 16/09/19
2 16/09/19
3 16/09/19
4 16/09/19
5 16/09/19
6 16/09/19
7 16/09/19
8 16/09/19
9 16/09/19
10 17/09/19
11 17/09/19
12 17/09/19
13 18/09/19
14 18/09/19
15 18/09/19
16 18/09/19
17 19/09/19
18 19/09/19
19 19/09/19
20 19/09/19
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
As you can see:
16/09/2019: 9 people answered
17/09/2019: 3 people answered
18/09/2019: 4 people answered
19/09/2019: 4 people answered
20 people still haven't answered.
To calculate how many people answered per day, I used:
nb_answered = count(id) over (partition by date_answered order by date_answered)
My problem is that I'm trying to get this:
date_answered nb_answered nb_left
--------------- -------------- --------
16/09/2019 9 40
17/09/2019 3 31 (40-9)
18/09/2019 4 28 (31-3)
19/09/2019 4 24 (28-4)
I have tried:
count(id) over (order by date_complete rows between UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), which gives me 40 (the total number of people).
That works for the first date, but I don't know how to get 31 for the second date.
How can I do that, i.e. subtract from the total, day by day, the number of people who have already answered?
Do you have any suggestions?
Another option might be a correlated subquery in the SELECT list.
The example is a little simplified (I didn't feel like typing that much).
SQL> with test (id, da) as
  (select 1, 16092019 from dual union all
   select 2, 16092019 from dual union all
   select 3, 16092019 from dual union all
   select 4, 16092019 from dual union all
   select 5, 16092019 from dual union all
   --
   select 6, 17092019 from dual union all
   select 7, 17092019 from dual union all
   select 8, 17092019 from dual union all
   --
   select 9, 19092019 from dual union all
   --
   select 10, null from dual union all
   select 11, null from dual union all
   select 12, null from dual union all
   select 13, null from dual
  )
  select a.da date_answered,
         count(a.id) nb_answered,
         (select count(*) from test b
          where b.da >= a.da
             or b.da is null
         ) nb_left
  from test a
  group by a.da
  order by a.da;
DATE_ANSWERED NB_ANSWERED NB_LEFT
------------- ----------- ----------
16092019 5 13
17092019 3 8
19092019 1 5
4 4
SQL>
You want to subtract the cumulative count from the overall count:
select date_answered, count(*) as answered_on_date,
( sum(count(*)) over () -
sum(count(*)) over (order by date_answered nulls last)
) as remaining
from t
group by date_answered
order by date_answered;
Note that the total has to be sum(count(*)) over (); in a grouped query, count(*) over () would count the groups, not the underlying rows.
If the remaining figure should not yet include the current date's answers (as in your expected output), add that date's count back:
select date_answered, count(*) as answered_on_date,
( sum(count(*)) over () -
sum(count(*)) over (order by date_answered nulls last) +
count(*)
) as remaining
from t
group by date_answered
order by date_answered;
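To sanity-check the running-total idea, here is a small self-contained sketch. It rebuilds the 13-row sample from the correlated-subquery answer with a row generator (that shortcut and the alias nb_left are mine) and computes the number still outstanding at the start of each date.

```sql
with test (id, da) as (
  -- ids 1-5 answered on the 16th, 6-8 on the 17th, 9 on the 19th, 10-13 not yet
  select level,
         case
           when level <= 5 then 16092019
           when level <= 8 then 17092019
           when level  = 9 then 19092019
         end
  from dual
  connect by level <= 13
)
select da       as date_answered,
       count(*) as nb_answered,
       sum(count(*)) over ()                            -- total rows across all groups
         - sum(count(*)) over (order by da nulls last)  -- answered up to and including this date
         + count(*)                                     -- put this date's answers back
         as nb_left
from test
group by da
order by da;
```

This should reproduce the NB_LEFT column of the correlated-subquery answer (13, 8, 5 and 4), while reading the table only once.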
I need some help in finding the number of reliefs each teacher has, every single day, 2 months before the teacher resigns.
Join_dt - teacher's join date,
Resign_dt - teacher's resign date,
Relief_ID - Relief teacher's ID,
Start_dt - Relief's start date,
End_dt - Relief's end date,
note that there may be overlapping dates between 2 or more different reliefs and so I need to find the number of distinct reliefs each teacher has for each date.
This is what I am given:
Teacher_ID Join_dt Resign_dt Relief_ID Start_dt End_dt
12 2006-08-30 2019-08-01 20 2017-02-07 2019-07-04
12 2006-08-30 2019-08-01 20 2016-11-10 2019-01-30
12 2006-08-30 2019-08-01 103 2016-08-20 2019-07-29
12 2006-08-30 2019-08-01 17 2016-01-30 2017-12-30
23 2017-10-01 2018-11-12 44 2018-10-19 2018-11-11
23 2017-10-01 2018-11-12 29 2018-04-01 2018-12-02
23 2017-10-01 2018-11-12 06 2017-11-25 2018-05-02
05 2015-02-11 2019-10-02 38 2019-01-17 2019-07-21
05 2015-02-11 2019-10-02 11 2018-11-02 2019-02-05
05 2015-02-11 2019-10-02 15 2018-09-30 2018-10-03
Expected result:
Teacher_ID Dates No_of_reliefs
12 2019-07-31 0
12 2019-07-30 0
12 2019-07-29 1
12 2019-07-28 1
12 2019-07-27 1
... ...
12 2019-07-04 2
... ...
12 2016-05-30 2
12 2016-05-29 2
12 2016-05-28 2
12 2016-05-27 2
12 2016-05-26 1
23 2018-10-31 2
... ...
For date 2019-07-29, No_of_reliefs = 1 because of Relief_ID 103.
For date 2019-07-04, No_of_reliefs = 2 because of Relief_ID 20 & 103.
Dates are supposed to start from 1 month before the teacher resigned. For Teacher_ID 23, since she resigned on 2018-11-12, dates shall start from 2018-10-31.
I have tried using connect by but the execution time is really long since it involves a large amount of data.
Any other methods will be greatly appreciated!!
Thank you kind souls!!!
You can use a CONNECT BY clause such as
connect by level <= last_day(add_months(Resign_dt,-1)) - add_months(Resign_dt,-2)
I assume you mean that the dates start 2 months before the resignation and end on the last day of the month before it.
with t1(Teacher_ID,Resign_dt,Relief_ID,start_dt,end_dt) as
(
select 12,date'2019-08-01',20 ,date'2017-02-07',date'2019-07-04' from dual union all
select 12,date'2019-08-01',20 ,date'2016-11-10',date'2019-01-30' from dual union all
select 12,date'2019-08-01',103,date'2016-08-20',date'2019-07-29' from dual
......
), t2 as
(
select distinct last_day(add_months(Resign_dt,-1)) - level + 1 as Resign_dt, Teacher_ID
from t1
connect by level <= last_day(add_months(Resign_dt,-1)) - add_months(Resign_dt,-2)
and prior Teacher_ID = Teacher_ID and prior sys_guid() is not null
)
select Teacher_ID, to_char(Resign_dt,'yyyy-mm-dd') as Dates,
(select count(distinct Relief_ID)
from t1
where t2.Resign_dt between start_dt and end_dt
and t2.Teacher_ID = Teacher_ID
)
from t2
order by Teacher_ID, Resign_dt desc;
select d.dt
, tr.Teacher_ID
--, tr.Join_dt
--, tr.Resign_dt
, count(tr.Relief_ID)
--, tr.Start_dt
--, tr.End_dt
from tr
right outer join (
SELECT dt
FROM (
SELECT DATE '2006-01-01' + ROWNUM - 1 dt
FROM DUAL CONNECT BY ROWNUM < 5000
) q
WHERE EXTRACT(YEAR FROM dt) < EXTRACT(YEAR FROM sysdate) + 2
--order by 1
) d on d.dt between tr.Join_dt and tr.End_dt
and d.dt between tr.Start_dt and tr.Resign_dt
group by d.dt
, tr.Teacher_ID
order by d.dt desc
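Since the complaint was that CONNECT BY is slow on a large table, one thing that might help is to run the row generator only once against DUAL and join it to the distinct teachers, instead of running a hierarchy over every source row. The sketch below makes several assumptions: the table is called tr as in the query above, the date columns have no time component, and the window runs from 2 months before the resignation up to the day before it. The names teachers, offsets and calendar are mine, and I have not measured whether it is actually faster on your data.

```sql
with teachers as (
  -- one row per teacher, so the calendar is not multiplied by the number of reliefs
  select distinct teacher_id, resign_dt
  from tr
),
offsets as (
  -- 1 .. 62: two consecutive months never have more than 62 days
  select level as n from dual connect by level <= 62
),
calendar as (
  select t.teacher_id,
         t.resign_dt - o.n as dt
  from teachers t
  join offsets o
    on t.resign_dt - o.n >= add_months(t.resign_dt, -2)
)
select c.teacher_id,
       to_char(c.dt, 'yyyy-mm-dd')  as dates,
       count(distinct r.relief_id)  as no_of_reliefs  -- 0 when no relief covers the day
from calendar c
left join tr r
  on r.teacher_id = c.teacher_id
 and c.dt between r.start_dt and r.end_dt
group by c.teacher_id, c.dt
order by c.teacher_id, c.dt desc;
```

For Teacher_ID 12 this counts Relief_ID 20 and 103 on 2019-07-04 and only 103 on 2019-07-29, in line with the expected result in the question.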
I am trying to use SQL to select distinct data entries based on the time difference between one entry and the next. It's easier to explain with an example:
My data table has
Part DateTime
123 12:00:00
123 12:00:05
123 12:00:06
456 12:10:23
789 12:12:13
123 12:14:32
I would like to return all rows, with one restriction: if there are multiple entries with the same "Part" number, I would like to retrieve only those that are at least 5 minutes apart.
The query should return:
Part DateTime
123 12:00:00
456 12:10:23
789 12:12:13
123 12:14:32
The code I'm using is the following:
SELECT data1.*, to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss')
FROM data data1
where exists
(
select *
from data data2
where data1.part_serial_number = data2.part_serial_number AND
data2.scan_time + 5/1440 >= data1.scan_time
and data2.info is null
)
order by to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss'), data1.part_serial_number
Unfortunately, this is not working. Does anyone know what I'm doing wrong, or can anyone suggest an alternative approach?
Thanks
Analytic functions to the rescue.
You can use the analytic function LEAD to get the data for the next row for the part.
SQL> with x as (
  select 123 part, timestamp '2011-12-08 00:00:00' ts
  from dual
  union all
  select 123, timestamp '2011-12-08 00:00:05'
  from dual
  union all
  select 123, timestamp '2011-12-08 00:00:06'
  from dual
  union all
  select 456, timestamp '2011-12-08 00:10:23'
  from dual
  union all
  select 789, timestamp '2011-12-08 00:12:13'
  from dual
  union all
  select 123, timestamp '2011-12-08 00:14:32'
  from dual
)
select part,
       ts,
       lead(ts) over (partition by part order by ts) next_ts
from x;
PART TS NEXT_TS
---------- ------------------------------- -------------------------------
123 08-DEC-11 12.00.00.000000000 AM 08-DEC-11 12.00.05.000000000 AM
123 08-DEC-11 12.00.05.000000000 AM 08-DEC-11 12.00.06.000000000 AM
123 08-DEC-11 12.00.06.000000000 AM 08-DEC-11 12.14.32.000000000 AM
123 08-DEC-11 12.14.32.000000000 AM
456 08-DEC-11 12.10.23.000000000 AM
789 08-DEC-11 12.12.13.000000000 AM
6 rows selected.
Once you've done that, then you can create an inline view and simply select those rows where the next date is more than 5 minutes after the current date.
SQL> with x as (
  select 123 part, timestamp '2011-12-08 00:00:00' ts
  from dual
  union all
  select 123, timestamp '2011-12-08 00:00:05'
  from dual
  union all
  select 123, timestamp '2011-12-08 00:00:06'
  from dual
  union all
  select 456, timestamp '2011-12-08 00:10:23'
  from dual
  union all
  select 789, timestamp '2011-12-08 00:12:13'
  from dual
  union all
  select 123, timestamp '2011-12-08 00:14:32'
  from dual
)
select part,
       ts
from (
  select part,
         ts,
         lead(ts) over (partition by part order by ts) next_ts
  from x )
where next_ts is null
   or next_ts > ts + interval '5' minute;
PART TS
---------- -------------------------------
123 08-DEC-11 12.00.06.000000000 AM
123 08-DEC-11 12.14.32.000000000 AM
456 08-DEC-11 12.10.23.000000000 AM
789 08-DEC-11 12.12.13.000000000 AM
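Note that the LEAD version keeps the last row of each burst (00:00:06 for part 123), whereas the expected output in the question keeps the first one (12:00:00). If the first row is what you need, a sketch with LAG, which looks at the previous row instead of the next one, could look like this. It uses the same sample data; the alias prev_ts is mine.

```sql
with x (part, ts) as (
  select 123, timestamp '2011-12-08 00:00:00' from dual union all
  select 123, timestamp '2011-12-08 00:00:05' from dual union all
  select 123, timestamp '2011-12-08 00:00:06' from dual union all
  select 456, timestamp '2011-12-08 00:10:23' from dual union all
  select 789, timestamp '2011-12-08 00:12:13' from dual union all
  select 123, timestamp '2011-12-08 00:14:32' from dual
)
select part, ts
from (
  select part,
         ts,
         -- previous scan of the same part, NULL for the first one
         lag(ts) over (partition by part order by ts) as prev_ts
  from x
)
where prev_ts is null                         -- first scan of the part
   or ts >= prev_ts + interval '5' minute     -- at least 5 minutes after the previous scan
order by ts;
```

One caveat: each row is compared with the immediately preceding row, not with the last row that was kept. A part scanned every 4 minutes for an hour would therefore return only its first scan.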
AFJ,
let's suppose we have a new column that tells us whether there is an earlier entry for the same Part within the previous 5 minutes. Keeping only the rows where this column is 0 then gives the result.
select
Part,
DateTime,
coalesce(
(select distinct 1
from data ds
where ds.Part = d.Part
and ds.DateTime >= d.DateTime - 5/1440
and ds.DateTime < d.DateTime
)
, 0) as exists_previous
from data d
The subquery checks whether there is a row with the same Part in the previous 5-minute interval. The bounds are written as two comparisons so that the lower bound comes first and the row itself is excluded.
The result should be:
Part DateTime exists_previous
123 12:00:00 0
123 12:00:05 1
123 12:00:06 1
456 12:10:23 0
789 12:12:13 0
123 12:14:32 0
Now filter to keep only the rows with 0:
select Part, DateTime from
(select
Part,
DateTime,
coalesce(
(select distinct 1
from data ds
where ds.Part = d.Part
and ds.DateTime >= d.DateTime - 5/1440
and ds.DateTime < d.DateTime
)
, 0) as exists_previous
from data D
) T where T.exists_previous = 0
Disclaimer: not tested.
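This has not been verified, but essentially the trick is to group by part AND by the time divided into 5-minute buckets (floored). Note that the buckets are fixed windows, so two scans less than 5 minutes apart that fall on either side of a bucket boundary are both returned.
select part, min(scan_time) as first_scan
from data
-- assumes scan_time is a DATE, so date - date is a number of days
group by part, floor((scan_time - date '1970-01-01') / (5/1440))
order by min(scan_time);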