I'm still a novice to SQL so I'm going to give this a try. Hopefully someone can help! I have the following data set:
REQ_NUM DATE                  EVENT    USER_ENTR
------- --------------------- -------- ---------
23877   2022-03-24 00:00:00.0 Posted   John
23877   2022-04-03 00:00:00.0 Expired  John
23877   2022-05-03 00:00:00.0 Posted   Jane
23877   2022-05-09 00:00:00.0 Expired  Jane
23877   2022-05-27 00:00:00.0 Posted   John
23877   2022-06-17 00:00:00.0 Unposted John
Basically, what I am trying to do is create a row for each start (posted) and end (expired, unposted) date like so:
REQ_NUM START_DT              END_DT
------- --------------------- ---------------------
23877   2022-03-24 00:00:00.0 2022-04-03 00:00:00.0
23877   2022-05-03 00:00:00.0 2022-05-09 00:00:00.0
23877   2022-05-27 00:00:00.0 2022-06-17 00:00:00.0
This will be used to calculate time posted between stints as well as trending when a requisition was actually posted for candidates to apply.
I think I need to use some kind of loop, but I don't even know where to start honestly. I've tried searching but I don't think I really know what to search for so even just a clue of what I need to look for would help.
I appreciate any help you can provide!
I thought maybe a grouping with min and max dates, but there are gaps before requisitions were even posted again so it's misleading.
I'm assuming you're using a currently supported Oracle version, so it should support match_recognize:
select *
from your_table
match_recognize (
partition by req_num
order by trans_date
measures
first(trans_date) as start_dt,
last(trans_date) as end_dt,
last(event) as end_event
pattern (start_ next_+)
define
start_ as (event='Posted'),
next_ as (event!='Posted')
)
Full DBFiddle example: https://dbfiddle.uk/4Fai2-ZQ
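For readers who want to see the pairing logic outside SQL, here is a minimal plain-Python sketch of what the pattern above describes: one 'Posted' row followed by one or more non-'Posted' rows, per REQ_NUM. The function name posting_stints and the tuple layout are made up for illustration; this is not how Oracle evaluates match_recognize internally.

```python
from datetime import date

# Sample rows from the question: (req_num, event_date, event)
events = [
    (23877, date(2022, 3, 24), "Posted"),
    (23877, date(2022, 4, 3), "Expired"),
    (23877, date(2022, 5, 3), "Posted"),
    (23877, date(2022, 5, 9), "Expired"),
    (23877, date(2022, 5, 27), "Posted"),
    (23877, date(2022, 6, 17), "Unposted"),
]


def posting_stints(rows):
    """Pair each 'Posted' row with the last non-'Posted' row that follows it.

    Hypothetical helper: a 'Posted' row with no later end event is dropped,
    mirroring the 'next_+' (one or more) part of the pattern.
    """
    stints = []
    open_stint = None  # [req_num, start_dt, end_dt] currently being built
    for req_num, event_date, event in sorted(rows):
        if event == "Posted":
            # A new start closes the previous stint, if it ever got an end date.
            if open_stint and open_stint[2] is not None:
                stints.append(tuple(open_stint))
            open_stint = [req_num, event_date, None]
        elif open_stint and open_stint[0] == req_num:
            # Any non-'Posted' event extends the end of the open stint.
            open_stint[2] = event_date
    if open_stint and open_stint[2] is not None:
        stints.append(tuple(open_stint))
    return stints


for req_num, start_dt, end_dt in posting_stints(events):
    print(req_num, start_dt, end_dt)
```

With the sample rows this yields the three start/end pairs asked for in the question.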
If your actual data looks like the data in the question, you could use conditional LAG() and LEAD() analytic functions to get the starting and ending dates in the same row.
SELECT DISTINCT
REQ_NUM,
CASE WHEN EVENT = 'Posted' THEN A_DATE ELSE LAG(A_DATE) OVER(Partition By REQ_NUM Order By A_DATE) END "START_DT",
CASE WHEN EVENT != 'Posted' THEN A_DATE ELSE LEAD(A_DATE) OVER(Partition By REQ_NUM Order By A_DATE) END "END_DT"
FROM tbl
... which with the sample data ...
WITH
tbl (REQ_NUM, A_DATE, EVENT, USER_ENTR) AS
(
Select 23877, To_Date('2022-03-24', 'yyyy-mm-dd'), 'Posted', 'John' From Dual Union All
Select 23877, To_Date('2022-04-03', 'yyyy-mm-dd'), 'Expired', 'John' From Dual Union All
Select 23877, To_Date('2022-05-03', 'yyyy-mm-dd'), 'Posted', 'Jane' From Dual Union All
Select 23877, To_Date('2022-05-09', 'yyyy-mm-dd'), 'Expired', 'Jane' From Dual Union All
Select 23877, To_Date('2022-05-27', 'yyyy-mm-dd'), 'Posted', 'John' From Dual Union All
Select 23877, To_Date('2022-06-17', 'yyyy-mm-dd'), 'Unposted', 'John' From Dual
)
... should result in:
   REQ_NUM START_DT  END_DT
---------- --------- ---------
     23877 24-MAR-22 03-APR-22
     23877 03-MAY-22 09-MAY-22
     23877 27-MAY-22 17-JUN-22
Note: The DISTINCT keyword eliminates duplicated rows and could be performance costly with large datasets.
Related
I am trying to work on something in Oracle SQL in the attached table.
The goal is to define a Visit Type, where a Primary Visit is any visit that happened more than 3 months after the previous visit(s). However, the challenge is that sometimes I need to compare to the previous row and sometimes I need to compare to the previous N rows.
For example, Transaction ID 3 is a revisit because its start date is within the 'end date + 90 days' of Transaction ID 1 (Dec 16). Transaction ID 4 is primary because it took place after the very first 'end date + 90 days', meaning I need to compare not only to the previous row but to the previous 3 rows.
Hope this is clear!
Thanks.
See Details above. Thank you!
From Oracle 12 you can use MATCH_RECOGNIZE for row-by-row pattern matching:
SELECT transaction_id, start_date, end_date, visit_type
FROM table_name
MATCH_RECOGNIZE(
ORDER BY start_date
MEASURES
CLASSIFIER() AS visit_type
ALL ROWS PER MATCH
PATTERN (PRIMARY revisit*)
DEFINE revisit AS start_date <= FIRST(end_date) + INTERVAL '90' DAY
);
Which, for the sample data:
CREATE TABLE table_name (transaction_id, start_date, end_date) AS
SELECT 1, DATE '2020-08-05', DATE '2020-09-07' FROM DUAL UNION ALL
SELECT 2, DATE '2020-09-19', DATE '2020-10-27' FROM DUAL UNION ALL
SELECT 3, DATE '2020-11-01', DATE '2020-12-19' FROM DUAL UNION ALL
SELECT 4, DATE '2021-01-23', DATE '2021-01-26' FROM DUAL UNION ALL
SELECT 5, DATE '2021-02-27', DATE '2021-03-27' FROM DUAL;
Outputs:
TRANSACTION_ID START_DATE          END_DATE            VISIT_TYPE
-------------- ------------------- ------------------- ----------
             1 2020-08-05 00:00:00 2020-09-07 00:00:00 PRIMARY
             2 2020-09-19 00:00:00 2020-10-27 00:00:00 REVISIT
             3 2020-11-01 00:00:00 2020-12-19 00:00:00 REVISIT
             4 2021-01-23 00:00:00 2021-01-26 00:00:00 PRIMARY
             5 2021-02-27 00:00:00 2021-03-27 00:00:00 REVISIT
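The rule encoded in the DEFINE clause (a row is a revisit while its start date falls within 90 days of the end date of the first row of the current match) can be sketched in plain Python. The function name classify_visits and the window_days parameter are assumptions made for this illustration only.

```python
from datetime import date, timedelta

# Sample rows from the answer: (transaction_id, start_date, end_date)
visits = [
    (1, date(2020, 8, 5), date(2020, 9, 7)),
    (2, date(2020, 9, 19), date(2020, 10, 27)),
    (3, date(2020, 11, 1), date(2020, 12, 19)),
    (4, date(2021, 1, 23), date(2021, 1, 26)),
    (5, date(2021, 2, 27), date(2021, 3, 27)),
]


def classify_visits(rows, window_days=90):
    """Label each visit PRIMARY or REVISIT (hypothetical helper).

    The window is always measured from the end date of the most recent
    PRIMARY visit, not from the immediately preceding row.
    """
    labelled = []
    anchor_end = None  # end_date of the current PRIMARY visit
    for transaction_id, start_date, end_date in sorted(rows, key=lambda r: r[1]):
        within_window = (
            anchor_end is not None
            and start_date <= anchor_end + timedelta(days=window_days)
        )
        if within_window:
            visit_type = "REVISIT"
        else:
            visit_type = "PRIMARY"
            anchor_end = end_date  # a new match starts here
        labelled.append((transaction_id, visit_type))
    return labelled


for transaction_id, visit_type in classify_visits(visits):
    print(transaction_id, visit_type)
```

This is why a comparison against only the previous row is not enough: the anchor can be several rows back.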
I have one table, and the query looks like this:
SELECT
cost_center_name,
person_number,
person_full_name,
TO_DATE(TO_CHAR(wfc_start_date,'DD-MON-YYYY HH:MI:SS AM'),'DD-MON-YYYY HH:MI:SS AM') start_date,
TO_DATE(TO_CHAR(wfc_end_date,'DD-MON-YYYY HH:MI:SS AM'),'DD-MON-YYYY HH:MI:SS AM') end_date,
TO_CHAR(wfc_start_date,'DD-MON-YYYY HH:MI:SS AM') start_date_hours,
TO_CHAR(wfc_end_date,'DD-MON-YYYY HH:MI:SS AM') end_date_hours,
pay_code_name,
duration_dd_hh_mi_ss,
wage_amount
FROM
XX_pay_type a
WHERE
person_number IN (
'102',
'103'
)
AND pay_period_ending_date = '20-APR-2019'
The duration_dd_hh_mi_ss column has values like 00:4:32:00 and 00:3:20:00.
I want sum(wage_amount) and sum(duration_dd_hh_mi_ss) from the SQL query for each person_number.
I also want a grand total of wage_amount.
In this case sum(duration_dd_hh_mi_ss) would be 00:7:52:00.
I tried sum(wage_amount) over (partition by person_number),
but I cannot get sum(duration_dd_hh_mi_ss) for each person_number, or the grand total of wage_amount.
If the duration is just end_date minus start_date, then use subtraction and skip the first step.
1. Otherwise use to_dsinterval(). To be able to use it we need to replace the first : in the duration with a space, because to_dsinterval does not accept the original format.
select regexp_replace('00:2:32:00', ':', ' ', 1, 1) from dual; --> 00 2:32:00
select to_dsinterval(regexp_replace('00:2:32:00', ':', ' ', 1, 1)) from dual;
Now we have intervals, which can be summed with some effort. It is not trivial, but it is possible. Strings cannot be summed.
2. To get partial sums and grand totals in one query, use aggregation with cube, rollup or grouping sets. Here is an example with rollup:
-- sample data
with t(center, id, name, duration, wage_amount) as (
select 'C1', '102', 'Tim', '01:2:00:00', 80 from dual union all
select 'C1', '102', 'Tim', '00:0:32:00', 70 from dual union all
select 'C2', '102', 'Tim', '00:2:00:00', 100 from dual union all
select 'C2', '103', 'Bob', '00:2:00:00', 120 from dual )
-- end of sample data
query:
select center, id, name, sum(wage_amount) amt,
numtodsinterval(sum(sysdate + to_dsinterval(regexp_replace(duration, ':', ' ', 1, 1))
- sysdate ), 'day') duration
from t
group by rollup((id, name), center)
Result:
CENTER ID  NAME        AMT DURATION
------ --- ---- ---------- -------------------
C1     102 Tim         150 +000000001 02:32:00
C2     102 Tim         100 +000000000 02:00:00
       102 Tim         250 +000000001 04:32:00
C2     103 Bob         120 +000000000 02:00:00
       103 Bob         120 +000000000 02:00:00
                       370 +000000001 06:32:00
You can also do it as you started, using analytical sums which will give you values in additional columns. Or make a union (union all) of your data and grouping query.
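To make the arithmetic easy to check by hand, here is a small plain-Python sketch of the same idea: turn each 'dd:hh:mi:ss' string into a real duration, then sum per person and overall. The helper names parse_duration and format_duration are invented for this example, and the output format (unpadded hours) is only an assumption based on the values shown in the question.

```python
from collections import defaultdict
from datetime import timedelta

# Sample rows from the answer: (center, id, name, duration, wage_amount)
rows = [
    ("C1", "102", "Tim", "01:2:00:00", 80),
    ("C1", "102", "Tim", "00:0:32:00", 70),
    ("C2", "102", "Tim", "00:2:00:00", 100),
    ("C2", "103", "Bob", "00:2:00:00", 120),
]


def parse_duration(text):
    """Convert a 'dd:hh:mi:ss' string into a timedelta (hypothetical helper)."""
    days, hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def format_duration(delta):
    """Render a timedelta back as 'dd:h:mi:ss', assumed from the question's values."""
    total = int(delta.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days:02d}:{hours}:{minutes:02d}:{seconds:02d}"


# Per-person totals: id -> [wage total, duration total]
per_person = defaultdict(lambda: [0, timedelta()])
for center, person_id, name, duration, wage in rows:
    per_person[person_id][0] += wage
    per_person[person_id][1] += parse_duration(duration)

grand_wage = sum(wage for wage, _ in per_person.values())

for person_id, (wage, duration) in per_person.items():
    print(person_id, wage, format_duration(duration))
print("grand total", grand_wage)
```

The totals agree with the rollup rows above (250 and 120 per person, 370 overall), and the two durations from the question add up to 00:7:52:00.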
What is the data type of 'duration_dd_hh_mi_ss' column?
I have a log table that contains (to be simple), user, operation, date.
There are two operations: search and view (search may return a hundred records; the user may view zero or more).
I need to have the basic output sorted by date, but I also need to have all of the views for one search together. Something like
name  operation date
john  search    1/1 1pm
john  view      1/1 2pm
john  view      1/1 3pm
james search    1/1 230pm
james view      1/1 315pm
john  search    1/1 310pm
It seems I need to use the results of a subquery to perform the query, but I'm not sure how that would look. I'm OK with SQL but I kind of hit the ceiling with JOINs and UNIONs. :-/
You can identify the groups by using a window function. And you can include the window function in the order by, so no subqueries are needed.
select *
from log_table l
order by max(case when l.operation = 'search' then l.log_date end) over (partition by l.name order by l.log_date),
l.name,
l.log_date;
Here is a db<>fiddle.
You can use a conditional lag() call to find the most recent search date/time for each view row, per user; with search rows getting their own date/time:
-- CTE for sample data
with log_table (name, operation, log_date) as (
select 'john', 'search', timestamp '2019-01-01 13:00:00' from dual
union all select 'john', 'view', timestamp '2019-01-01 14:00:00' from dual
union all select 'john', 'view', timestamp '2019-01-01 15:00:00' from dual
union all select 'james', 'search', timestamp '2019-01-01 14:30:00' from dual
union all select 'james', 'view', timestamp '2019-01-01 15:15:00' from dual
union all select 'john', 'search', timestamp '2019-01-01 15:10:00' from dual
)
-- actual query
select name, operation, log_date,
case when operation = 'search' then log_date
else lag(case when operation = 'search' then log_date end ignore nulls)
over (partition by name order by log_date)
end as search_date
from log_table
order by log_date;
NAME  OPERATION LOG_DATE            SEARCH_DATE
----- --------- ------------------- -------------------
john  search    2019-01-01 13:00:00 2019-01-01 13:00:00
john  view      2019-01-01 14:00:00 2019-01-01 13:00:00
james search    2019-01-01 14:30:00 2019-01-01 14:30:00
john  view      2019-01-01 15:00:00 2019-01-01 13:00:00
john  search    2019-01-01 15:10:00 2019-01-01 15:10:00
james view      2019-01-01 15:15:00 2019-01-01 14:30:00
You can then use that as a CTE or inline view, and use the generated search_date to order first, then order the records with the same search date by their actual log date:
-- CTE for sample data
with log_table (name, operation, log_date) as (
select 'john', 'search', timestamp '2019-01-01 13:00:00' from dual
union all select 'john', 'view', timestamp '2019-01-01 14:00:00' from dual
union all select 'john', 'view', timestamp '2019-01-01 15:00:00' from dual
union all select 'james', 'search', timestamp '2019-01-01 14:30:00' from dual
union all select 'james', 'view', timestamp '2019-01-01 15:15:00' from dual
union all select 'john', 'search', timestamp '2019-01-01 15:10:00' from dual
)
-- actual query
select name, operation, log_date
from (
select name, operation, log_date,
case when operation = 'search' then log_date
else lag(case when operation = 'search' then log_date end ignore nulls)
over (partition by name order by log_date)
end as search_date
from log_table
)
order by search_date, log_date;
NAME  OPERATION LOG_DATE
----- --------- -------------------
john  search    2019-01-01 13:00:00
john  view      2019-01-01 14:00:00
john  view      2019-01-01 15:00:00
james search    2019-01-01 14:30:00
james view      2019-01-01 15:15:00
john  search    2019-01-01 15:10:00
As you could potentially get simultaneous searches from two users, you might want to include the user in the final order-by clause too:
...
order by search_date, name, log_date;
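The same two-step idea (attach the most recent search time to every row, then sort on it) can be sketched in plain Python. The function name order_by_search is made up for this illustration. One deliberate difference, stated as an assumption: a view with no earlier search falls back to its own log date here, whereas the SQL above would leave search_date null for such a row.

```python
from datetime import datetime

# Sample rows from the answer: (name, operation, log_date)
log_rows = [
    ("john", "search", datetime(2019, 1, 1, 13, 0)),
    ("john", "view", datetime(2019, 1, 1, 14, 0)),
    ("john", "view", datetime(2019, 1, 1, 15, 0)),
    ("james", "search", datetime(2019, 1, 1, 14, 30)),
    ("james", "view", datetime(2019, 1, 1, 15, 15)),
    ("john", "search", datetime(2019, 1, 1, 15, 10)),
]


def order_by_search(rows):
    """Group each user's views under the search that preceded them (hypothetical helper)."""
    last_search = {}  # name -> log_date of that user's most recent search
    keyed = []
    for name, operation, log_date in sorted(rows, key=lambda r: r[2]):
        if operation == "search":
            last_search[name] = log_date
        # Assumption: rows without a prior search sort by their own timestamp.
        search_date = last_search.get(name, log_date)
        keyed.append((search_date, name, log_date, operation))
    # Same ordering as: order by search_date, name, log_date
    keyed.sort(key=lambda k: (k[0], k[1], k[2]))
    return [(name, operation, log_date) for _, name, log_date, operation in keyed]


for name, operation, log_date in order_by_search(log_rows):
    print(name, operation, log_date)
```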
So I have a table called Value with columns VALUE_ID, VALUE, HR and VALUE_TYPE. I am trying to get not only the maximum value but also the HR (and ultimately the day) at which the maximum value occurred.
Below is some sample data:
VALUE_ID VALUE HR                      VALUE_TYPE
1        75    DEC-25-2018 01:00:00 AM Bananas
2        10    DEC-25-2018 01:00:00 AM Bananas
3        787   DEC-25-2018 05:00:00 PM Bananas
I want:
(For Hourly)
MAX(Value) HR                      Value_Type
75         DEC-25-2018 01:00:00 AM Bananas
787        DEC-25-2018 05:00:00 PM Bananas
(For Day)
MAX(Value) HR(Day)                 Value_Type
787        DEC-25-2018 05:00:00 PM Bananas
I've tried the following (this is probably completely wrong, but I'm not sure how to combine columns from two separate queries into one table):
select max(value) as max_value
, Value_Type
from value
group by value_type
UNION
select HR from
from value
where value = (select max(value) as max_value
, Value_Type
from value
group by value_type;
Thanks in advance.
Analytic functions are perfect for this kind of question. They allow the base data to be read just once (rather than multiple times, as in solutions that aggregate, then compare to the original data).
In the sample session below, notice a few things. I create some sample data in a WITH clause (which you don't need, you have the actual table). I use the TO_DATE function to create dates. So that I don't need to write the format model multiple times, I first alter my session to set the default date format to the one you used.
The query is written for the hourly intervals; you can modify it easily for daily, by changing the argument to TRUNC() in the inner query, from 'hh' to 'dd'. If you are new to analytic functions, select and run the inner query by itself first, to see what it produces. Then it will be trivial to understand what the outer query does.
alter session set nls_date_format = 'MON-dd-yyyy hh:mi:ss AM';
with simulated_table (VALUE_ID, VALUE, HR, VALUE_TYPE) as (
select 1, 75, to_date('DEC-25-2018 01:00:00 AM'), 'Bananas' from dual union all
select 2, 10, to_date('DEC-25-2018 01:00:00 AM'), 'Bananas' from dual union all
select 3, 787, to_date('DEC-25-2018 05:00:00 PM'), 'Bananas' from dual
)
select value_id, value, hr, value_type
from (
select s.*,
max(value) over (partition by value_type, trunc(hr, 'hh')) maxval
from simulated_table s
)
where value = maxval
;
  VALUE_ID      VALUE HR                      VALUE_TYPE
---------- ---------- ----------------------- ----------
         1         75 DEC-25-2018 01:00:00 AM Bananas
         3        787 DEC-25-2018 05:00:00 PM Bananas
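If analytic functions are new to you, the following plain-Python sketch may help show what the query does: compute the maximum per (value_type, truncated time) bucket, then keep only the rows whose value equals that maximum. The function name max_rows and its bucket parameter are assumptions for this illustration; switching bucket from "hour" to "day" corresponds to changing 'hh' to 'dd' in TRUNC().

```python
from datetime import datetime

# Sample rows from the question: (value_id, value, hr, value_type)
rows = [
    (1, 75, datetime(2018, 12, 25, 1, 0), "Bananas"),
    (2, 10, datetime(2018, 12, 25, 1, 0), "Bananas"),
    (3, 787, datetime(2018, 12, 25, 17, 0), "Bananas"),
]


def max_rows(rows, bucket="hour"):
    """Return the rows holding the maximum value per type and time bucket (hypothetical helper)."""

    def truncate(ts):
        # Rough equivalent of TRUNC(hr, 'hh') or TRUNC(hr, 'dd').
        if bucket == "hour":
            return ts.replace(minute=0, second=0, microsecond=0)
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)

    # Step 1: the per-bucket maximum, like MAX(value) OVER (PARTITION BY ...).
    best = {}
    for _, value, hr, value_type in rows:
        key = (value_type, truncate(hr))
        best[key] = max(value, best.get(key, value))

    # Step 2: the outer filter, WHERE value = maxval.
    return [row for row in rows if row[1] == best[(row[3], truncate(row[2]))]]


print([row[0] for row in max_rows(rows, "hour")])
print([row[0] for row in max_rows(rows, "day")])
```

As in the SQL version, ties are kept: if two rows share the maximum in a bucket, both come back.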
You could do:
select v.*
from value v
where v.value = (select max(v2.value)
from value v2
where trunc(v2.hr, 'HH')= trunc(v.hr, 'HH')
) or
v.value = (select max(v2.value)
from value v2
where trunc(v2.hr, 'DD')= trunc(v.hr, 'DD')
) ;
This gets the maximum value rows for both the hour and the day. You can use one clause or the other for just hours or just days.
I am trying to find all records in a database with an admission date which is older than a certain time frame (in this case, all admission dates older than 4 days old).
I have:
select memberid, admitdate
from membertable
where admitdate < (sysdate-4)
As a result, I'm getting a lot of admission dates which match this, but I'm ALSO getting dates which are from only 2 days ago, so that doesn't match my code. What am I doing wrong?
If it helps, the admit dates have a format of mm/dd/yyyy.
Dates, including sysdate, have a time component. Even if all your admitdate values are at midnight that is still a time, and sysdate is only going to be at midnight if you run your query then.
select sysdate, sysdate-4, trunc(sysdate), trunc(sysdate)-4 from dual;
SYSDATE SYSDATE-4 TRUNC(SYSDATE) TRUNC(SYSDATE)-4
------------------- ------------------- ------------------- -------------------
2018-06-21 16:44:53 2018-06-17 16:44:53 2018-06-21 00:00:00 2018-06-17 00:00:00
If you filter your records on sysdate-4 then that will include any admitdate values up to, in this example, 2018-06-17 16:44:53; so presumably all the records for the 17th if they are actually all midnight.
with membertable (memberid, admitdate) as (
select 1, date '2018-06-15' from dual
union all select 2, date '2018-06-16' from dual
union all select 3, date '2018-06-17' from dual
union all select 4, date '2018-06-18' from dual
union all select 5, date '2018-06-19' from dual
union all select 6, date '2018-06-20' from dual
union all select 7, date '2018-06-21' from dual
)
select memberid, admitdate
from membertable
where admitdate < (sysdate-4);
  MEMBERID ADMITDATE
---------- -------------------
         1 2018-06-15 00:00:00
         2 2018-06-16 00:00:00
         3 2018-06-17 00:00:00
If you truncate the value you're comparing against then its time portion will also be treated as midnight, so you'll only match records up to, but not including, that point in time, 2018-06-17 00:00:00:
with membertable (memberid, admitdate) as (
select 1, date '2018-06-15' from dual
union all select 2, date '2018-06-16' from dual
union all select 3, date '2018-06-17' from dual
union all select 4, date '2018-06-18' from dual
union all select 5, date '2018-06-19' from dual
union all select 6, date '2018-06-20' from dual
union all select 7, date '2018-06-21' from dual
)
select memberid, admitdate
from membertable
where admitdate < trunc(sysdate)-4;
  MEMBERID ADMITDATE
---------- -------------------
         1 2018-06-15 00:00:00
         2 2018-06-16 00:00:00
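The difference between the two cutoffs is easy to reproduce outside the database. This plain-Python sketch fixes "now" at the timestamp from the example output above (an assumption, so that the result is repeatable) and compares the two filters; the helper name admitted_before is invented for illustration.

```python
from datetime import datetime, timedelta

# Fixed stand-in for sysdate, taken from the example output above.
now = datetime(2018, 6, 21, 16, 44, 53)

# Members 1..7 admitted at midnight on 2018-06-15 .. 2018-06-21.
admit_dates = {member_id: datetime(2018, 6, 14 + member_id) for member_id in range(1, 8)}


def admitted_before(cutoff):
    """Member ids whose admit date is strictly before the cutoff (hypothetical helper)."""
    return [member_id for member_id, admitted in admit_dates.items() if admitted < cutoff]


# Like: admitdate < sysdate - 4  (cutoff keeps the time of day, 16:44:53)
with_time = admitted_before(now - timedelta(days=4))

# Like: admitdate < trunc(sysdate) - 4  (cutoff is midnight)
midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
truncated = admitted_before(midnight - timedelta(days=4))

print(with_time)
print(truncated)
```

The untruncated cutoff lets the midnight row for the 17th through, because midnight on the 17th is earlier than 16:44:53 on the 17th.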
admitdate should be a date. You seem to be suggesting it is a string. You can try:
where to_date(admitdate, 'MM/DD/YYYY') < trunc(sysdate) - 4;
You can then fix the data in the table, so it is stored as a date.