I am trying to use SQL to select distinct data entries based on the time difference between one entry and the next. It's easier to explain with an example:
My data table has
Part DateTime
123 12:00:00
123 12:00:05
123 12:00:06
456 12:10:23
789 12:12:13
123 12:14:32
I would like to return all rows, with one restriction: if there are multiple entries with the same "Part" number, I want to retrieve only those that are at least 5 minutes apart.
The query should return:
Part DateTime
123 12:00:00
456 12:10:23
789 12:12:13
123 12:14:32
The code I'm using is the following:
SELECT data1.*, to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss')
FROM data data1
where exists
(
select *
from data data2
where data1.part_serial_number = data2.part_serial_number AND
data2.scan_time + 5/1440 >= data1.scan_time
and data2.info is null
)
order by to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss'), data1.part_serial_number
This is not working, unfortunately. Does anyone know what I'm doing wrong, or can you suggest an alternative approach?
Thanks
Analytic functions to the rescue.
You can use the analytic function LEAD to get the data for the next row for the part.
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select 123 part, timestamp '2011-12-08 00:00:00' ts
3 from dual
4 union all
5 select 123, timestamp '2011-12-08 00:00:05'
6 from dual
7 union all
8 select 123, timestamp '2011-12-08 00:00:06'
9 from dual
10 union all
11 select 456, timestamp '2011-12-08 00:10:23'
12 from dual
13 union all
14 select 789, timestamp '2011-12-08 00:12:13'
15 from dual
16 union all
17 select 123, timestamp '2011-12-08 00:14:32'
18 from dual
19 )
20 select part,
21 ts,
22 lead(ts) over (partition by part order by ts) next_ts
23* from x
SQL> /
PART TS NEXT_TS
---------- ------------------------------- -------------------------------
123 08-DEC-11 12.00.00.000000000 AM 08-DEC-11 12.00.05.000000000 AM
123 08-DEC-11 12.00.05.000000000 AM 08-DEC-11 12.00.06.000000000 AM
123 08-DEC-11 12.00.06.000000000 AM 08-DEC-11 12.14.32.000000000 AM
123 08-DEC-11 12.14.32.000000000 AM
456 08-DEC-11 12.10.23.000000000 AM
789 08-DEC-11 12.12.13.000000000 AM
6 rows selected.
Once you've done that, you can wrap the query in an inline view and select only those rows that either have no next row for the part or whose next timestamp is more than 5 minutes after the current one.
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select 123 part, timestamp '2011-12-08 00:00:00' ts
3 from dual
4 union all
5 select 123, timestamp '2011-12-08 00:00:05'
6 from dual
7 union all
8 select 123, timestamp '2011-12-08 00:00:06'
9 from dual
10 union all
11 select 456, timestamp '2011-12-08 00:10:23'
12 from dual
13 union all
14 select 789, timestamp '2011-12-08 00:12:13'
15 from dual
16 union all
17 select 123, timestamp '2011-12-08 00:14:32'
18 from dual
19 )
20 select part,
21 ts
22 from (
23 select part,
24 ts,
25 lead(ts) over (partition by part order by ts) next_ts
26 from x )
27 where next_ts is null
28* or next_ts > ts + interval '5' minute
SQL> /
PART TS
---------- -------------------------------
123 08-DEC-11 12.00.06.000000000 AM
123 08-DEC-11 12.14.32.000000000 AM
456 08-DEC-11 12.10.23.000000000 AM
789 08-DEC-11 12.12.13.000000000 AM
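Note that the output above keeps the last row of each burst (00:00:06), whereas the expected output in the question keeps the first one. Comparing each row with the previous row of the same part (LAG rather than LEAD, in SQL terms) is one way to keep the first row instead. The Python sketch below only illustrates that logic outside the database; the function name, the helper t() and the sample rows are made up for the illustration and are not part of the original answer.

```python
from datetime import datetime, timedelta

def keep_first_of_burst(rows, min_gap=timedelta(minutes=5)):
    """Keep a row unless the previous row for the same part is less than min_gap earlier."""
    last_seen = {}  # part -> timestamp of the immediately preceding row for that part
    kept = []
    for part, ts in sorted(rows, key=lambda r: r[1]):
        prev = last_seen.get(part)
        if prev is None or ts - prev >= min_gap:
            kept.append((part, ts))
        # Always remember the latest row, kept or not (this mirrors LAG semantics).
        last_seen[part] = ts
    return kept

def t(clock):
    """Hypothetical helper: build a timestamp on an arbitrary day from HH:MM:SS."""
    return datetime.strptime("2011-12-08 " + clock, "%Y-%m-%d %H:%M:%S")

rows = [(123, t("12:00:00")), (123, t("12:00:05")), (123, t("12:00:06")),
        (456, t("12:10:23")), (789, t("12:12:13")), (123, t("12:14:32"))]

for part, ts in keep_first_of_burst(rows):
    print(part, ts.strftime("%H:%M:%S"))
```

Because every row is compared with its immediate predecessor, a steady stream of scans less than 5 minutes apart collapses to its first row only. Whether that is the desired behaviour depends on the requirement.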
AFJ,
Let's suppose that we have a new field that tells us whether a previous entry exists for this Part in the preceding 5 minutes. Taking the rows where this field is 0 then gives the result.
select
Part,
DateTime,
coalesce(
(select distinct 1
from data ds
where ds.Part = d.Part
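and ds.DateTime >= d.DateTime - 5/1440 -- lower bound first: at most 5 minutes earlier
and ds.DateTime < d.DateTime -- strictly earlier, so a row never matches itself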
)
, 0) as exists_previous
from data d
The subquery checks whether there is a row with the same Part in the previous 5-minute interval.
The result should be:
Part DateTime exists_previous
123 12:00:00 0
123 12:00:05 1
123 12:00:06 1
456 12:10:23 0
789 12:12:13 0
123 12:14:32 0
Now filter to keep only the rows with 0:
select Part, DateTime from
(select
Part,
DateTime,
coalesce(
(select distinct 1
from data ds
where ds.Part = d.Part
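and ds.DateTime >= d.DateTime - 5/1440 -- lower bound first: at most 5 minutes earlier
and ds.DateTime < d.DateTime -- strictly earlier, so a row never matches itself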
)
, 0) as exists_previous
from data D
) T where T.exists_previous = 0
Disclaimer: not tested.
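This has not been verified, but essentially, the trick is to group by part AND by the time split into 5-minute buckets (floored).
select part, min(scan_time) as first_scan_time
from data
group by part,
trunc(scan_time, 'hh24'), -- the hour the scan falls in
floor(to_number(to_char(scan_time, 'mi')) / 5) -- 5-minute bucket within that hour
order by min(scan_time);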
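The bucketing idea can be tried outside the database as well. The Python sketch below is only an illustration under the assumption that buckets are aligned to the clock (00-04, 05-09, ... minutes of each hour); the function name and the sample rows are invented for the example.

```python
from datetime import datetime

def first_per_bucket(rows, bucket_minutes=5):
    """Return the earliest (timestamp, part) per part and fixed clock-aligned bucket."""
    earliest = {}
    for part, ts in rows:
        minutes_since_midnight = ts.hour * 60 + ts.minute
        key = (part, ts.date(), minutes_since_midnight // bucket_minutes)
        if key not in earliest or ts < earliest[key]:
            earliest[key] = ts
    return sorted((ts, part) for (part, _, _), ts in earliest.items())

day = "2011-12-08 "
fmt = "%Y-%m-%d %H:%M:%S"
sample = [(123, datetime.strptime(day + c, fmt)) for c in ("12:00:00", "12:00:05", "12:00:06", "12:14:32")]
sample += [(456, datetime.strptime(day + "12:10:23", fmt)),
           (789, datetime.strptime(day + "12:12:13", fmt))]

for ts, part in first_per_bucket(sample):
    print(part, ts.strftime("%H:%M:%S"))
```

One consequence of fixed buckets is worth keeping in mind: two scans at 12:04:59 and 12:05:01 are only two seconds apart but land in different buckets, so both are returned.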
I have a dataset within a date range which has three columns: Product_type, date and metric. For a given product_type, data is not available for all days. For the missing rows, we would like to forward fill the next n days using the last value of the metric.
Product_type  date        metric
A             2019-10-01  10
A             2019-10-02  12
A             2019-10-03  15
A             2019-10-04  5
A             2019-10-05  5
A             2019-10-06  5
A             2019-10-16  12
A             2019-10-17  23
A             2019-10-18  34
Here, the rows for 2019-10-05 and 2019-10-06 have been forward filled from 2019-10-04. There might be bigger gaps in the dates, but we only want to fill the first n days.
Here, n=2, so rows 5 and 6 have been forward filled.
I am not sure how to implement this logic in SQL.
Here's one option. Read comments within code.
Sample data:
SQL> WITH
2 test (product_type, datum, metric)
3 AS
4 (SELECT 'A', DATE '2019-10-01', 10 FROM DUAL
5 UNION ALL
6 SELECT 'A', DATE '2019-10-02', 12 FROM DUAL
7 UNION ALL
8 SELECT 'A', DATE '2019-10-03', 15 FROM DUAL
9 UNION ALL
10 SELECT 'A', DATE '2019-10-04', 5 FROM DUAL
11 UNION ALL
12 SELECT 'A', DATE '2019-10-16', 12 FROM DUAL
13 UNION ALL
14 SELECT 'A', DATE '2019-10-18', 23 FROM DUAL),
Query begins here:
15 temp
16 AS
17 -- CB_FWD_FILL = 1 if difference between two consecutive dates is larger than 1 day
18 -- (i.e. that's the gap to be forward filled)
19 (SELECT product_type,
20 datum,
21 metric,
22 LEAD (datum) OVER (PARTITION BY product_type ORDER BY datum)
23 next_datum,
24 CASE
25 WHEN LEAD (datum)
26 OVER (PARTITION BY product_type ORDER BY datum)
27 - datum >
28 1
29 THEN
30 1
31 ELSE
32 0
33 END
34 cb_fwd_fill
35 FROM test)
36 -- original data from the table
37 SELECT product_type, datum, metric FROM test
38 UNION ALL
39 -- DATUM is the last date which is OK; add LEVEL pseudocolumn to it to fill the gap
40 -- with PAR_N number of rows
41 SELECT product_type, datum + LEVEL, metric
42 FROM (SELECT product_type, datum, metric
43 FROM (-- RN = 1 means that that's the first gap in data set - that's the one
44 -- that has to be forward filled
45 SELECT product_type,
46 datum,
47 metric,
48 ROW_NUMBER ()
49 OVER (PARTITION BY product_type ORDER BY datum) rn
50 FROM temp
51 WHERE cb_fwd_fill = 1)
52 WHERE rn = 1)
53 CONNECT BY LEVEL <= &par_n
54 ORDER BY datum;
Result:
Enter value for par_n: 2
PRODUCT_TYPE DATUM METRIC
--------------- ---------- ----------
A 2019-10-01 10
A 2019-10-02 12
A 2019-10-03 15
A 2019-10-04 5
A 2019-10-05 5 --> newly added
A 2019-10-06 5 --> rows
A 2019-10-16 12
A 2019-10-18 23
8 rows selected.
SQL>
Another solution:
WITH test (product_type, datum, metric) AS
(
SELECT 'A', DATE '2019-10-01', 10 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-02', 12 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-03', 15 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-04', 5 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-16', 12 FROM DUAL
UNION ALL
SELECT 'A', DATE '2019-10-18', 23 FROM DUAL
),
minmax(mindatum, maxdatum) AS (
SELECT MIN(datum), max(datum) from test
),
alldates (datum, product_type) AS
(
SELECT mindatum + level - 1, t.product_type FROM minmax,
(select distinct product_type from test) t
connect by mindatum + level - 1 <= (select maxdatum from minmax) -- "- 1" so that maxdatum itself is generated
),
grouped as (
select a.datum, a.product_type, t.metric,
count(t.product_type) over(partition by a.product_type order by a.datum) as grp
from alldates a
left join test t on t.datum = a.datum and t.product_type = a.product_type
),
final_table as (
select g.datum, g.product_type, g.grp, g.rn,
last_value(g.metric ignore nulls) over(partition by g.product_type order by g.datum) as metric
from (
select g.*, row_number() over(partition by product_type, grp order by datum) - 1 as rn
from grouped g
) g
)
select datum, product_type, metric
from final_table
where rn <= &par_n
order by datum
;
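To make the fill rule easier to reason about, here is a small Python sketch of the same idea for a single product. It is an illustration only, with an invented function name, and it fills up to n days after every gap (like the query just above), so it is not a translation of either query.

```python
from datetime import date, timedelta

def forward_fill(rows, n):
    """rows: (day, metric) pairs for one product, in any order.
    Returns the rows plus up to n filled days after each row that is followed by a gap."""
    rows = sorted(rows)
    out = []
    for i, (day, metric) in enumerate(rows):
        out.append((day, metric))
        next_day = rows[i + 1][0] if i + 1 < len(rows) else None
        for k in range(1, n + 1):
            candidate = day + timedelta(days=k)
            # Stop at the next real row; never fill past the last known date.
            if next_day is None or candidate >= next_day:
                break
            out.append((candidate, metric))
    return out

data = [(date(2019, 10, 1), 10), (date(2019, 10, 2), 12), (date(2019, 10, 3), 15),
        (date(2019, 10, 4), 5), (date(2019, 10, 16), 12), (date(2019, 10, 18), 23)]

for day, metric in forward_fill(data, 2):
    print(day.isoformat(), metric)
```

With this sample and n=2, 2019-10-05 and 2019-10-06 are filled with 5, and 2019-10-17 is filled with 12 because the one-day gap before 2019-10-18 also qualifies. If only the first gap should be filled, as in the first answer, the loop would need to stop after the first row that produced a fill.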
I'm working with Oracle and I have a table with a column of type TIMESTAMP. I was wondering how I can extract the records from the last 4 weeks of activity on the database, partitioned by week.
Following rows are inserted on week 1
kc 2 04-10-2021
vc 3 06-10-2021
vk 4 07-10-2021
Following rows are inserted on week2
cv 1 12-10-2021
ck 5 14-10-2021
Following rows are inserted on week3
vv 7 19-10-2021
Following rows are inserted on week4
vx 7 29-10-2021
Table now has
SQL>select * from tab;
NAME VALUE TIMESTAMP
-------------------- ----------
kc 2 04-10-2021
vc 3 06-10-2021
vk 4 07-10-2021
cv 1 12-10-2021
ck 5 14-10-2021
vv 7 19-10-2021
vx 7 29-10-2021
I would like a query which would give me the number of rows added each week, in the last 4 weeks.
This is what I would like to see
numofrows week
--------- -----
3 1
2 2
1 3
1 4
One option is to use the to_char function with its iw format element (the ISO week number):
SQL> with test (name, datum) as
2 (select 'kc', date '2021-10-04' from dual union all
3 select 'vc', date '2021-10-06' from dual union all
4 select 'vk', date '2021-10-07' from dual union all
5 select 'cv', date '2021-10-12' from dual union all
6 select 'ck', date '2021-10-14' from dual union all
7 select 'vv', date '2021-10-19' from dual union all
8 select 'vx', DATE '2021-10-29' from dual
9 )
10 select to_char(datum, 'iw') week,
11 count(*)
12 from test
13 where datum >= add_months(sysdate, -1) --> the last month
14 group by to_char(datum, 'iw');
WE COUNT(*)
-- ----------
42 1
43 1
40 3
41 2
SQL>
Line #13: I intentionally used "one month" instead of "4 weeks" as I thought (maybe wrongly) that this is what you actually want (you know, "a month has 4 weeks": not exactly, but close, and sometimes not close enough).
If you want 4 weeks, what is that, then? Sysdate minus 28 days (as every week has 7 days)? Then you'd modify line #13 to
where datum >= trunc(sysdate - 4*7)
Or, maybe it is really the last 4 weeks:
SQL> with test (name, datum) as
2 (select 'kc', date '2021-10-04' from dual union all
3 select 'vc', date '2021-10-06' from dual union all
4 select 'vk', date '2021-10-07' from dual union all
5 select 'cv', date '2021-10-12' from dual union all
6 select 'ck', date '2021-10-14' from dual union all
7 select 'vv', date '2021-10-19' from dual union all
8 select 'vx', DATE '2021-10-29' from dual
9 ),
10 temp as
11 (select to_char(datum, 'iw') week,
12 count(*) cnt,
13 row_number() over (order by to_char(datum, 'iw') desc) rn
14 from test
15 group by to_char(datum, 'iw')
16 )
17 select week, cnt
18 from temp
19 where rn <= 4
20 order by week;
WE CNT
-- ----------
40 3
41 2
42 1
43 1
SQL>
Now you have several options, see which one fits the best (if any).
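One detail to watch with week numbers alone is the turn of the year: week 01 sorts before week 52 or 53 even though it is later. A possible way around that is to group by ISO year and ISO week together (in Oracle, the ISO year is available through the 'iyyy' format element). The Python sketch below only illustrates the grouping; the function name is invented and the dates are the sample data from the question.

```python
from collections import Counter
from datetime import date

def rows_per_iso_week(days, last_n=4):
    """Count rows per (ISO year, ISO week) and return the most recent last_n weeks."""
    counts = Counter(d.isocalendar()[:2] for d in days)  # key: (ISO year, ISO week)
    latest = sorted(counts)[-last_n:]
    return [(year, week, counts[(year, week)]) for year, week in latest]

sample_days = [date(2021, 10, 4), date(2021, 10, 6), date(2021, 10, 7),
               date(2021, 10, 12), date(2021, 10, 14),
               date(2021, 10, 19), date(2021, 10, 29)]

for year, week, cnt in rows_per_iso_week(sample_days):
    print(year, week, cnt)
```

For the sample data this gives weeks 40 to 43 with counts 3, 2, 1 and 1, matching the query output above. Weeks without any rows are simply absent here.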
I "simulated" missing data (see TEST CTE), created a calendar (calend) and ... did the job. Read comments within code:
SQL> with test (name, datum) as
2 -- sample data
3 (select 'vv', date '2021-10-19' from dual union all
4 select 'vx', DATE '2021-10-29' from dual
5 ),
6 calend as
7 -- the last 31 days; 4 weeks are included, obviously
8 (select max_datum - level + 1 datum
9 from (select max(a.datum) max_datum from test a)
10 connect by level <= 31
11 ),
12 joined as
13 -- joined TEST and CALEND data
14 (select to_char(c.datum, 'iw') week,
15 t.name
16 from calend c left join test t on t.datum = c.datum
17 ),
18 last4 as
19 -- last 4 weeks
20 (select week, count(name) cnt,
21 row_number() over (order by week desc) rn
22 from joined
23 group by week
24 )
25 select week, cnt
26 from last4
27 where rn <= 4
28 order by week;
WE CNT
-- ----------
40 0
41 0
42 1
43 1
SQL>
I have the tables below and I need my query to return the number of operations grouped by date.
For dates on which there are no operations, I still need the date to be returned, with a count of zero.
Something like this:
OPERATION_DATE | COUNT_OPERATION | COUNT_OPERATION2 |
04/06/2019 | 453 | 81 |
05/06/2019 | 0 | 0 |
-- QUERY I TRIED
SELECT
T1.DATE_OPERATION AS DATE_OPERATION,
NVL(T1.COUNT_OPERATION, '0') COUNT_OPERATION,
NVL(T1.COUNT_OPERATION2, '0') COUNT_OPERATIONX,
FROM
(
SELECT
trunc(t.DATE_OPERATION) as DATE_OPERATION,
count(t.ID_OPERATION) AS COUNT_OPERATION,
COUNT(CASE WHEN O.OPERATION_TYPE = 'X' THEN 1 END) COUNT_OPERATIONX,
from OPERATION o
left join OPERATION_TYPE ot on ot.id_operation = o.id_operation
where ot.OPERATION_TYPE in ('X', 'W', 'Z', 'I', 'J', 'V')
and TRUNC(t.DATE_OPERATION) >= to_date('01/06/2019', 'DD-MM-YYYY')
group by trunc(t.DATE_OPERATION)
) T1
-- TABLES
CREATE TABLE OPERATION
( ID_OPERATION NUMBER NOT NULL,
DATE_OPERATION DATE NOT NULL,
VALUE NUMBER NOT NULL )
CREATE TABLE OPERATION_TYPE
( ID_OPERATION NUMBER NOT NULL,
OPERATION_TYPE VARCHAR2(1) NOT NULL,
VALUE NUMBER NOT NULL)
I guess that it is a calendar you need, i.e. a table which contains all dates involved. Otherwise, how can you display something that doesn't exist?
This is what you currently have (I'm using only the operation table; add another one yourself):
SQL> with
2 operation (id_operation, date_operation, value) as
3 (select 1, date '2019-06-01', 100 from dual union all
4 select 2, date '2019-06-01', 200 from dual union all
5 -- 02/06/2019 is missing
6 select 3, date '2019-06-03', 300 from dual union all
7 select 4, date '2019-06-04', 400 from dual
8 )
9 select o.date_operation,
10 count(o.id_operation)
11 from operation o
12 group by o.date_operation
13 order by o.date_operation;
DATE_OPERA COUNT(O.ID_OPERATION)
---------- ---------------------
01/06/2019 2
03/06/2019 1
04/06/2019 1
SQL>
As there are no rows that belong to 02/06/2019, the query can't return anything for that date (you already know that).
Therefore, add a calendar. If you already have that table, fine - use it. If not, create one. One way to do so is a hierarchical query which adds the LEVEL pseudocolumn to a starting date. I'm using 01/06/2019 as the starting point, creating 5 days (note the connect by clause).
SQL> with
2 operation (id_operation, date_operation, value) as
3 (select 1, date '2019-06-01', 100 from dual union all
4 select 2, date '2019-06-01', 200 from dual union all
5 -- 02/06/2019 is missing
6 select 3, date '2019-06-03', 300 from dual union all
7 select 4, date '2019-06-04', 400 from dual
8 ),
9 dates (datum) as --> this is a calendar
10 (select date '2019-06-01' + level - 1
11 from dual
12 connect by level <= 5
13 )
14 select d.datum,
15 count(o.id_operation)
16 from operation o full outer join dates d on d.datum = o.date_operation
17 group by d.datum
18 order by d.datum;
DATUM COUNT(O.ID_OPERATION)
---------- ---------------------
01/06/2019 2
02/06/2019 0 --> missing in source table
03/06/2019 1
04/06/2019 1
05/06/2019 0 --> missing in source table
SQL>
Probably a better option is to dynamically create a calendar so that it doesn't depend on any hardcoded values, but uses the min(date_operation) to max(date_operation) time span. Here we go:
SQL> with
2 operation (id_operation, date_operation, value) as
3 (select 1, date '2019-06-01', 100 from dual union all
4 select 2, date '2019-06-01', 200 from dual union all
5 -- 02/06/2019 is missing
6 select 3, date '2019-06-03', 300 from dual union all
7 select 4, date '2019-06-04', 400 from dual
8 ),
9 dates (datum) as --> this is a calendar
10 (select x.min_datum + level - 1
11 from (select min(o.date_operation) min_datum,
12 max(o.date_operation) max_datum
13 from operation o
14 ) x
15 connect by level <= x.max_datum - x.min_datum + 1
16 )
17 select d.datum,
18 count(o.id_operation)
19 from operation o full outer join dates d on d.datum = o.date_operation
20 group by d.datum
21 order by d.datum;
DATUM COUNT(O.ID_OPERATION)
---------- ---------------------
01/06/2019 2
02/06/2019 0 --> missing in source table
03/06/2019 1
04/06/2019 1
SQL>
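The calendar idea is independent of SQL: build every date between the first and the last operation, then look up the count for each one, defaulting to zero. A minimal Python sketch of that logic follows; the function name is invented and the dates mirror the sample data above.

```python
from collections import Counter
from datetime import date, timedelta

def counts_per_day(operation_dates):
    """Return (day, count) for every day between the first and last operation, zeros included."""
    counts = Counter(operation_dates)
    first, last = min(operation_dates), max(operation_dates)
    calendar = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    # Counter returns 0 for days that have no operations.
    return [(day, counts[day]) for day in calendar]

operations = [date(2019, 6, 1), date(2019, 6, 1), date(2019, 6, 3), date(2019, 6, 4)]

for day, cnt in counts_per_day(operations):
    print(day.strftime("%d/%m/%Y"), cnt)
```

The calendar drives the result and the operations are only looked up against it, which corresponds to an outer join from the calendar to the data.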
Basically I have Product table like this:
date price
--------- -----
02-SEP-14 50
03-SEP-14 60
04-SEP-14 60
05-SEP-14 60
07-SEP-14 71
08-SEP-14 45
09-SEP-14 45
10-SEP-14 24
11-SEP-14 60
I need to update the table in this form
date price id
--------- ----- --
02-SEP-14 50 1
03-SEP-14 60 2
04-SEP-14 60 2
05-SEP-14 60 2
07-SEP-14 71 3
08-SEP-14 45 4
09-SEP-14 45 4
10-SEP-14 24 5
11-SEP-14 60 6
What I have tried:
CREATE SEQUENCE user_id_seq
START WITH 1
INCREMENT BY 1
CACHE 20;
ALTER TABLE Product
ADD (ID number);
UPDATE Product SET ID = user_id_seq.nextval;
This updates the ID in the usual way: 1, 2, 3, 4, 5, ...
I have no idea how to do it using basic SQL commands. Please suggest how I can do it. Thank you in advance.
Here is one way to create a view from your base data. I assume you have more than one product (identified by product id), and that the price dates aren't necessarily consecutive. The sequence is separate for each product id. (Also, product should be the name of a different table - where the product id is primary key, and you have other information such as product name, category, etc. The table in your post would be more properly called something like price_history.)
alter session set nls_date_format='dd-MON-rr';
create table product ( prod_id number, dt date, price number );
insert into product ( prod_id, dt, price )
select 101, '02-SEP-14', 50 from dual union all
select 101, '03-SEP-14', 60 from dual union all
select 101, '04-SEP-14', 60 from dual union all
select 101, '05-SEP-14', 60 from dual union all
select 101, '07-SEP-14', 71 from dual union all
select 101, '08-SEP-14', 45 from dual union all
select 101, '09-SEP-14', 45 from dual union all
select 101, '10-SEP-14', 24 from dual union all
select 101, '11-SEP-14', 60 from dual union all
select 102, '02-SEP-14', 45 from dual union all
select 102, '04-SEP-14', 45 from dual union all
select 102, '05-SEP-14', 60 from dual union all
select 102, '06-SEP-14', 50 from dual union all
select 102, '09-SEP-14', 60 from dual
;
commit;
create view product_vw ( prod_id, dt, price, seq ) as
select prod_id, dt, price,
count(flag) over (partition by prod_id order by dt)
from ( select prod_id, dt, price,
case when price = lag(price) over (partition by prod_id order by dt)
then null else 1 end as flag
from product
)
;
Now check what the view looks like:
select * from product_vw;
PROD_ID DT PRICE SEQ
------- ------------------- ---------- ----------
101 02/09/0014 00:00:00 50 1
101 03/09/0014 00:00:00 60 2
101 04/09/0014 00:00:00 60 2
101 05/09/0014 00:00:00 60 2
101 07/09/0014 00:00:00 71 3
101 08/09/0014 00:00:00 45 4
101 09/09/0014 00:00:00 45 4
101 10/09/0014 00:00:00 24 5
101 11/09/0014 00:00:00 60 6
102 02/09/0014 00:00:00 45 1
102 04/09/0014 00:00:00 45 1
102 05/09/0014 00:00:00 60 2
102 06/09/0014 00:00:00 50 3
102 09/09/0014 00:00:00 60 4
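The view relies on a common pattern: flag each row whose price differs from the previous row's price, then take a running count of the flags. The Python sketch below illustrates the same numbering for one product whose prices are already ordered by date; the function name is made up for the illustration.

```python
def number_price_runs(prices):
    """Give consecutive equal prices the same sequence number, starting at 1."""
    result = []
    seq = 0
    for i, price in enumerate(prices):
        # A new run starts at the first row or whenever the price changes.
        if i == 0 or price != prices[i - 1]:
            seq += 1
        result.append(seq)
    return result

print(number_price_runs([50, 60, 60, 60, 71, 45, 45, 24, 60]))
print(number_price_runs([45, 45, 60, 50, 60]))
```

The two calls use the prices of products 101 and 102 from the sample data. A price that comes back later (60 for product 101) starts a new run instead of reusing the old number, which is what distinguishes this from a dense rank over prices.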
NOTE: This answers the question that was originally asked. The OP changed the data.
If your data is not too large, you can use a correlated subquery:
update product p
set id = (select count(distinct p2.price)
from product p2
where p2.date <= p.date
);
If your data is larger, then merge is more appropriate.
WITH cts AS
(
SELECT row_number() over (partition by price order by price ) as id
,date
,price
FROM Product
)
UPDATE p
set p.id = cts.id
from product p join cts on cts.id = p.id
This is the best way to do what you are trying to do.
There is no other simple way to do this using simple statements.
Thanks for taking the time to examine my issue.
I'm trying to figure out a way to return dates when an account reaches 0
Sample data:
DATE ACCOUNT AMOUNT
11/01 001 100
11/02 002 50
11/03 001 -100
11/07 001 20
11/15 002 -50
11/20 001 -20
Wanted results:
Account ZeroDate
001 11/03
002 11/15
001 11/20
So far I haven't been able to figure out anything that works. Might you be able to point me in the right direction?
Thanks again in advance!
You can use analytic functions to compute the running balance
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select date '2011-11-01' dt, 1 account, 100 amt from dual union all
3 select date '2011-11-02', 2, 50 from dual union all
4 select date '2011-11-03', 1, -100 from dual union all
5 select date '2011-11-07', 1, 20 from dual union all
6 select date '2011-11-15', 2, -50 from dual union all
7 select date '2011-11-20', 1, -20 from dual
8 )
9 select dt,
10 account,
11 amt,
12 sum(amt) over (partition by account order by dt) current_balance
13* from x
SQL> /
DT ACCOUNT AMT CURRENT_BALANCE
--------- ---------- ---------- ---------------
01-NOV-11 1 100 100
03-NOV-11 1 -100 0
07-NOV-11 1 20 20
20-NOV-11 1 -20 0
02-NOV-11 2 50 50
15-NOV-11 2 -50 0
6 rows selected.
and then use the running balance to find the zero dates.
SQL> ed
Wrote file afiedt.buf
1 with x as (
2 select date '2011-11-01' dt, 1 account, 100 amt from dual union all
3 select date '2011-11-02', 2, 50 from dual union all
4 select date '2011-11-03', 1, -100 from dual union all
5 select date '2011-11-07', 1, 20 from dual union all
6 select date '2011-11-15', 2, -50 from dual union all
7 select date '2011-11-20', 1, -20 from dual
8 )
9 select account,
10 dt zero_date
11 from (
12 select dt,
13 account,
14 amt,
15 sum(amt) over (partition by account order by dt) current_balance
16 from x
17 )
18* where current_balance = 0
SQL> /
ACCOUNT ZERO_DATE
---------- ---------
1 03-NOV-11
1 20-NOV-11
2 15-NOV-11
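The running-balance logic can be cross-checked with a few lines of Python. This is only a sketch of the idea: the function name is invented, dates are written as ISO strings so that they sort chronologically, and account 003 is an extra made-up case that does not appear in the question.

```python
from collections import defaultdict

def zero_dates(transactions):
    """transactions: (day, account, amount) tuples. Return (account, day) whenever a balance hits 0."""
    balance = defaultdict(int)
    hits = []
    for day, account, amount in sorted(transactions):
        balance[account] += amount
        if balance[account] == 0:
            hits.append((account, day))
    return hits

transactions = [("2011-11-01", "001", 100), ("2011-11-02", "002", 50),
                ("2011-11-03", "001", -100), ("2011-11-07", "001", 20),
                ("2011-11-15", "002", -50), ("2011-11-20", "001", -20)]

for account, day in zero_dates(transactions):
    print(account, day)
```

Because the check is on the accumulated balance and not on the sign of a single amount, a partial withdrawal such as 100 followed by -40 on account 003 would not be reported.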
create table myacct (dt varchar2(5)
, account varchar2(3)
, amount number
)
;
insert into myacct values ('11/01', '001', 100);
insert into myacct values ('11/02', '002', 50);
insert into myacct values ('11/03', '001', -100);
insert into myacct values ('11/07', '001', 20);
insert into myacct values ('11/15', '002', -50);
insert into myacct values ('11/20', '001', -20);
commit;
/* results wanted:
Account ZeroDate
001 11/03
002 11/15
001 11/20 */
select account "Account", dt "ZeroDate"
from myacct
where amount <= 0
;
/* results from above query:
Account ZeroDate
001 11/03
002 11/15
001 11/20
*/