I have a table containing Dates and Statuses. I wish to get the date that the status changed to the most recent status. Sample data:
DATE STATUS
01/01/2000 P
02/01/2000 A
03/01/2000 C
04/01/2000 A
05/01/2000 A
06/01/2000 A
So in this instance the most recent status is A and it changed to this on 04/01/2000. (The 02/01/2000 row should be ignored in this situation)
Any suggestions for how to go about selecting this row?
At first, I misunderstood the question. You need to get the earliest date of the last status.
You can group sequences of like statuses using a trick -- a difference of row numbers. The difference (in the query below) is constant for sequences that are the same. Then you can use aggregation to get the minimum date and select the latest one:
select mindate
from (select min(date) as mindate
from (select t.*,
row_number() over (order by date) as seqnum1,
row_number() over (partition by status order by date) as seqnum2
from table t
) t
group by status, (seqnum1 - seqnum2)
order by mindate desc
) t
where rownum = 1
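To see why the difference of row numbers stays constant within a run, here is a minimal Python sketch of the same logic applied to the sample data from the question. The function name `last_status_start` and the in-memory list are assumptions for illustration only; the query above is still what you would run in the database.

```python
from collections import defaultdict
from datetime import date

def last_status_start(rows):
    """rows: list of (date, status) pairs.
    Returns (status, first date of the most recent run of that status)."""
    per_status = defaultdict(int)   # plays the role of seqnum2
    islands = {}                    # (status, seqnum1 - seqnum2) -> earliest date
    for seqnum1, (d, status) in enumerate(sorted(rows), start=1):
        per_status[status] += 1
        seqnum2 = per_status[status]
        # rows arrive in date order, so the first date stored is the minimum
        islands.setdefault((status, seqnum1 - seqnum2), d)
    # "order by mindate desc" plus "rownum = 1": keep the latest island
    (status, _), start = max(islands.items(), key=lambda item: item[1])
    return status, start

sample = [
    (date(2000, 1, 1), "P"),
    (date(2000, 1, 2), "A"),
    (date(2000, 1, 3), "C"),
    (date(2000, 1, 4), "A"),
    (date(2000, 1, 5), "A"),
    (date(2000, 1, 6), "A"),
]
print(last_status_start(sample))
```

For the sample data the differences are 0, 1, 2, 2, 2, 2, so the isolated A on 02/01/2000 lands in its own group and the final three A rows share one.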
EDIT:
In any case, the right way to do this is using lag():
select max(date)
from (select t.*, lag(status) over (order by date) as prev_status
from table t
)
where prev_status <> status or prev_status is null;
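The lag() version can be sketched the same way in Python: keep every date whose status differs from the previous row (or that has no previous row) and take the latest. `last_change_date` is a hypothetical helper name, and the sketch assumes at least one row.

```python
from datetime import date

def last_change_date(rows):
    """rows: non-empty list of (date, status) pairs.
    Returns the latest date on which the status differs from the previous row."""
    ordered = sorted(rows)
    change_dates = [
        d
        for i, (d, status) in enumerate(ordered)
        # i == 0 corresponds to "prev_status is null"
        if i == 0 or ordered[i - 1][1] != status
    ]
    return max(change_dates)

sample = [
    (date(2000, 1, 1), "P"),
    (date(2000, 1, 2), "A"),
    (date(2000, 1, 3), "C"),
    (date(2000, 1, 4), "A"),
    (date(2000, 1, 5), "A"),
    (date(2000, 1, 6), "A"),
]
print(last_change_date(sample))
```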
You can do this using lag or lead. Here I'm using lead, ordering by date descending to find the previous status date (if it's null I'm just supplying the date, which is needed in case there's only one record).
select max(date)
from (
select status, date, nvl(lead(status) over (order by date desc),date) as previous_status
from t
order by date desc
)
where status <> previous_status;
Something like this ought to do the trick:
with sample_data as (select to_date('01/01/2000', 'dd/mm/yyyy') dt, 'P' status from dual union all
select to_date('02/01/2000', 'dd/mm/yyyy') dt, 'A' status from dual union all
select to_date('03/01/2000', 'dd/mm/yyyy') dt, 'C' status from dual union all
select to_date('04/01/2000', 'dd/mm/yyyy') dt, 'A' status from dual union all
select to_date('05/01/2000', 'dd/mm/yyyy') dt, 'A' status from dual union all
select to_date('06/01/2000', 'dd/mm/yyyy') dt, 'A' status from dual),
results1 as (select dt,
status,
row_number() over (order by dt) - row_number() over (partition by status order by dt) grp
from sample_data),
results2 as (select status, min(dt) min_dt, grp, max(min(dt)) over () max_min_dt
from results1
group by status, grp)
select status, min_dt
from results2
where min_dt = max_min_dt;
STATUS MIN_DT
------ ----------
A 04/01/2000
I need to get, for each row, the max date over the other ids. Of course I can do this with CROSS JOIN and JOIN.
Like this:
WITH t AS (
SELECT 1 AS id, rep_date FROM UNNEST(GENERATE_DATE_ARRAY('2021-09-01','2021-09-09', INTERVAL 1 DAY)) rep_date
UNION ALL
SELECT 2 AS id, rep_date FROM UNNEST(GENERATE_DATE_ARRAY('2021-08-20','2021-09-03', INTERVAL 1 DAY)) rep_date
UNION ALL
SELECT 3 AS id, rep_date FROM UNNEST(GENERATE_DATE_ARRAY('2021-08-25','2021-09-05', INTERVAL 1 DAY)) rep_date
)
SELECT id, rep_date, MAX(rep_date) OVER (PARTITION BY id) max_date, max_date_over_others FROM t
JOIN (
SELECT t.id, MAX(max_date) max_date_over_others FROM t
CROSS JOIN (
SELECT id, MAX(rep_date) max_date FROM t
GROUP BY 1
) t1
WHERE t1.id <> t.id
GROUP BY 1
) USING (id)
But it's too weird for huge tables, so I'm looking for a simpler way to do this. Any ideas?
Your version is good enough, I think. But if you want to try other options, consider the approach below. It might look more verbose at first glance, but it should be more efficient and cheaper than your version with the cross join.
-- continues the WITH clause from the question: WITH t AS (...),
temp as (
select id,
greatest(
ifnull(max(max_date_for_id) over preceding_ids, '1970-01-01'),
ifnull(max(max_date_for_id) over following_ids, '1970-01-01')
) as max_date_for_rest_ids
from (
select id, max(rep_date) max_date_for_id
from t
group by id
)
window
preceding_ids as (order by id rows between unbounded preceding and 1 preceding),
following_ids as (order by id rows between 1 following and unbounded following)
)
select *
from t
join temp
using (id)
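The two named windows amount to a running maximum over the ids before the current one and another over the ids after it. A minimal Python sketch of that idea follows; the function name `max_over_other_ids` and the dict input (id mapped to that id's own max date) are assumptions for illustration.

```python
from datetime import date

def max_over_other_ids(max_by_id):
    """max_by_id: dict mapping id -> max date for that id.
    Returns dict mapping id -> max date among all *other* ids (None if none)."""
    ids = sorted(max_by_id)

    # max over ids strictly before each id ("unbounded preceding and 1 preceding")
    before, running = {}, None
    for key in ids:
        before[key] = running
        if running is None or max_by_id[key] > running:
            running = max_by_id[key]

    # max over ids strictly after each id ("1 following and unbounded following")
    after, running = {}, None
    for key in reversed(ids):
        after[key] = running
        if running is None or max_by_id[key] > running:
            running = max_by_id[key]

    result = {}
    for key in ids:
        candidates = [v for v in (before[key], after[key]) if v is not None]
        result[key] = max(candidates) if candidates else None
    return result

per_id_max = {1: date(2021, 9, 9), 2: date(2021, 9, 3), 3: date(2021, 9, 5)}
print(max_over_other_ids(per_id_max))
```

This makes two linear passes over the per-id maxima instead of comparing every id with every other id, which is where the saving over the cross join comes from.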
Assuming your original table data just has columns id and dt - wouldn't this solve it? I'm using the fact that if an id has the max dt of everything, then it gets the second-highest over the other id values.
WITH max_dates AS
(
SELECT
id,
MAX(dt) AS max_dt
FROM
data
GROUP BY
id
),
with_top1_value AS
(
SELECT
*,
MAX(max_dt) OVER () AS max_overall_dt1,
MIN(max_dt) OVER () AS min_overall_dt
FROM
max_dates
),
with_top2_values AS
(
SELECT
*,
-- second-highest per-id maximum: mask the top value, then take the max again
MAX(CASE WHEN max_dt = max_overall_dt1 THEN min_overall_dt ELSE max_dt END) OVER () AS max_overall_dt2
FROM
with_top1_value
)
SELECT
*,
CASE WHEN max_dt = max_overall_dt1 THEN max_overall_dt2 ELSE max_overall_dt1 END AS max_dt_of_others
FROM
with_top2_values
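The "top two values" idea is short enough to sketch in Python. Here `max_dt_of_others` is a hypothetical function taking a dict of id to that id's max dt. Note one deliberate difference from the query: this sketch keeps duplicate values when picking the runner-up, so two ids sharing the overall maximum each see that maximum as the best of the others.

```python
from datetime import date

def max_dt_of_others(max_by_id):
    """max_by_id: dict mapping id -> max dt for that id.
    Returns dict mapping id -> max dt among the other ids (None if there are none)."""
    if not max_by_id:
        return {}
    ordered = sorted(max_by_id.values(), reverse=True)
    top1 = ordered[0]
    top2 = ordered[1] if len(ordered) > 1 else None
    # the id holding the overall max gets the runner-up; everyone else gets the max
    return {key: (top2 if value == top1 else top1) for key, value in max_by_id.items()}

per_id_max = {1: date(2021, 9, 9), 2: date(2021, 9, 3), 3: date(2021, 9, 5)}
print(max_dt_of_others(per_id_max))
```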
I have date, id, and flag columns in this table. How can I get a value column that is an incrementing number and resets to 1 whenever the flag column changes?
Consider the approach below:
select * except(changed, grp),
row_number() over(partition by id, grp order by date) value
from (
select *, countif(changed) over(partition by id order by date) grp
from (
select *,
ifnull(flag != lag(flag) over(partition by id order by date), true) changed
from `project.dataset.table`
))
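The query marks each row where the flag changes, turns the running count of those marks into a group number, and then numbers rows within each group. A minimal Python sketch of the same result, using consecutive grouping instead of the explicit changed/grp columns, is below. The function name `run_counters` and the sample rows are made up for illustration, since the question's sample data is not reproduced here.

```python
from datetime import date
from itertools import groupby

def run_counters(rows):
    """rows: list of (id, date, flag) tuples.
    Returns (id, date, flag, value) where value restarts at 1 on every flag change."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    out = []
    for _id, id_rows in groupby(ordered, key=lambda r: r[0]):
        # consecutive rows with the same flag form one group (the query's grp)
        for _flag, run in groupby(id_rows, key=lambda r: r[2]):
            for value, (row_id, d, flag) in enumerate(run, start=1):
                out.append((row_id, d, flag, value))
    return out

# hypothetical sample data
rows = [
    (1, date(2021, 1, 1), True),
    (1, date(2021, 1, 2), True),
    (1, date(2021, 1, 3), False),
    (1, date(2021, 1, 4), False),
    (1, date(2021, 1, 5), False),
    (1, date(2021, 1, 6), True),
    (2, date(2021, 1, 1), False),
]
for row in run_counters(rows):
    print(row)
```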
You seem to want to count the number of falses since the last true. You can use:
select t.* except (grp),
(case when flag
then 1
else row_number() over (partition by id, grp order by date) - 1
end)
from (select t.*,
countif(flag) over (partition by id order by date) as grp
from t
) t;
If you know that the dates have no gaps, you can actually do this without a subquery:
select t.*,
(case when flag then 1
else date_diff(date,
max(case when flag then date end) over (partition by id order by date),
day)
end)
from t;
I want to fetch customers' balances at the maximum date of every month, for every year in the database. The Balance table holds the end-of-day balance for each day on which a customer makes a transaction.
I just want to pick the balance at the maximum date of every month. Any help?
Below is a snippet of my dataset.
You can try using the window function row_number():
select * from
(
SELECT *, row_number() over(partition by extract(YEAR FROM Date), extract(MONTH FROM Date) order by date desc) as rn
FROM t
) t
where rn = 1
You can do it also without a sub-query:
WITH b(ID, "date",bal) AS
(
SELECT 'CUST_I',DATE '2013-07-27', 14777.44 FROM dual UNION ALL
SELECT 'CUST_H',DATE '2013-07-26', 71085.13 FROM dual UNION ALL
SELECT 'CUST_I',DATE '2013-08-27', 66431.35656 FROM dual UNION ALL
SELECT 'CUST_H',DATE '2013-08-26', 63102.68622 FROM dual UNION ALL
SELECT 'CUST_H',DATE '2013-08-20', 6310.68622 FROM dual UNION ALL
SELECT 'CUST_H',DATE '2013-08-10', 630.68622 FROM dual UNION ALL
SELECT 'CUST_G',DATE '2013-09-25', 89732.04889 FROM dual UNION ALL
SELECT 'CUST_E',DATE '2013-09-23', 83074.70822 FROM dual
)
SELECT ID,
MAX("date") KEEP (DENSE_RANK FIRST ORDER BY "date" desc) AS MAX_DATE,
MAX(bal) KEEP (DENSE_RANK FIRST ORDER BY "date" desc) AS MAX_BAL
FROM b
GROUP BY ID, TRUNC("date", 'MM');
+-----------------------------+
|ID |MAX_DATE |MAX_BAL |
+-----------------------------+
|CUST_E|23.09.2013|83074.70822|
|CUST_G|25.09.2013|89732.04889|
|CUST_H|26.07.2013|71085.13 |
|CUST_H|26.08.2013|63102.68622|
|CUST_I|27.07.2013|14777.44 |
|CUST_I|27.08.2013|66431.35656|
+-----------------------------+
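For comparison, the same "latest row per customer per month" selection can be sketched in Python with a dictionary keyed on (customer, year, month). The function name `latest_per_month` is an assumption for illustration; the rows are the ones from the WITH clause above.

```python
from datetime import date

def latest_per_month(rows):
    """rows: list of (customer, date, balance).
    Keeps, for each customer and calendar month, the row with the latest date."""
    latest = {}
    for cust, d, bal in rows:
        key = (cust, d.year, d.month)   # same grouping as ID, TRUNC("date", 'MM')
        if key not in latest or d > latest[key][0]:
            latest[key] = (d, bal)
    return sorted((cust, d, bal) for (cust, _y, _m), (d, bal) in latest.items())

balances = [
    ("CUST_I", date(2013, 7, 27), 14777.44),
    ("CUST_H", date(2013, 7, 26), 71085.13),
    ("CUST_I", date(2013, 8, 27), 66431.35656),
    ("CUST_H", date(2013, 8, 26), 63102.68622),
    ("CUST_H", date(2013, 8, 20), 6310.68622),
    ("CUST_H", date(2013, 8, 10), 630.68622),
    ("CUST_G", date(2013, 9, 25), 89732.04889),
    ("CUST_E", date(2013, 9, 23), 83074.70822),
]
for row in latest_per_month(balances):
    print(row)
```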
You may use a self join on your table, called cust_balances here:
select c1.*
from cust_balances c1
join
(
select max("date") max_date
from cust_balances
group by to_char("date",'yyyymm')
) c2 on ( c1."date" = c2.max_date );
How can I, in Oracle SQL, retrieve the first row (columns A and B) each time column B changes value, with rows ordered by A?
Assume I have a table with a date and a value:
DATE;VALUE
01-2015;1
02-2015;1
01-2016;2
01-2016;2
01-2017;1
What I want is the first row each time the value changes (based on a certain ordering, here DATE), so from this set I want:
DATE;VALUE
01-2015;1
01-2016;2
01-2017;1
I cannot simply use GROUP BY VALUE, because the value can flip back again (in this case it is 1 in both 2015 and 2017), and MIN(DATECOL) with GROUP BY VALUECOL will not report the 2017 row.
So I was looking into analytic functions, something like:
SELECT FIRST_VALUE(DATECOL),FIRST_VALUE(VALUECOL) OVER (PARTITION BY
VALUECOL ORDER BY DATECOL) FROM DATATABLE
But I cannot get this working!
Tabibtosan makes this easy:
with table1 as (select to_date('01/01/2015', 'dd/mm/yyyy') dt, 1 val from dual union all
select to_date('01/02/2015', 'dd/mm/yyyy') dt, 1 val from dual union all
select to_date('01/01/2016', 'dd/mm/yyyy') dt, 2 val from dual union all
select to_date('01/01/2016', 'dd/mm/yyyy') dt, 2 val from dual union all
select to_date('01/01/2017', 'dd/mm/yyyy') dt, 1 val from dual)
-- end of mimicking a table "table1" with data in it. See sql below:
select min(dt) dt,
val
from (select dt,
val,
dense_rank() over (order by dt)
- dense_rank() over (partition by val order by dt) grp
from table1)
group by val,
grp;
DT VAL
---------- ----------
01/01/2015 1
01/01/2016 2
01/01/2017 1
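To show why dense_rank (rather than row_number) copes with the duplicate 01/01/2016 rows, here is a small Python sketch of the same grouping. The helper names `dense_rank` and `run_starts` are assumptions for illustration; the data mirrors the table1 rows above.

```python
from datetime import date

def dense_rank(values):
    """Map each distinct value to its 1-based rank in sorted order (ties share a rank)."""
    return {v: rank for rank, v in enumerate(sorted(set(values)), start=1)}

def run_starts(rows):
    """rows: list of (dt, val). Returns the first dt of each consecutive run of val."""
    overall = dense_rank(dt for dt, _ in rows)
    per_val = {
        val: dense_rank(dt for dt, v in rows if v == val)
        for val in {v for _, v in rows}
    }
    groups = {}
    for dt, val in rows:
        # the difference of the two ranks is constant within a run
        key = (val, overall[dt] - per_val[val][dt])
        groups[key] = min(dt, groups.get(key, dt))
    return sorted((dt, val) for (val, _grp), dt in groups.items())

table1 = [
    (date(2015, 1, 1), 1),
    (date(2015, 2, 1), 1),
    (date(2016, 1, 1), 2),
    (date(2016, 1, 1), 2),
    (date(2017, 1, 1), 1),
]
for row in run_starts(table1):
    print(row)
```

Both duplicate rows get the same pair of ranks, so they fall into the same group instead of splitting it.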
I think LAG() is the appropriate function, along with some other logic:
select t.*
from (select t.*, lag(value) over (order by date) as prev_value
from datatable t
) t
where prev_value is null or prev_value <> value;
The only issue with your data is that the rows are not unique. This can cause a problem, because sorting in databases is not stable (that is, two rows can be in either order). Hopefully, in your actual data, the dates are unique or you have another id you can add to the order by to make the sort stable.
One brute force way of doing this is:
with dt as (
select dt.*, rownum as rn
from datatable dt
)
select t.*
from (select dt.*, lag(value) over (order by date, rn) as prev_value
from dt
) t
where prev_value is null or prev_value <> value;
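The effect of adding a row number as a tie-breaker can be sketched in Python as follows. `first_rows_of_runs` is a hypothetical helper, and the input position stands in for rownum; the data is the sample from the question with the duplicate 01-2016 row.

```python
from datetime import date

_NO_PREVIOUS = object()  # stands in for a NULL prev_value on the first row

def first_rows_of_runs(rows):
    """rows: list of (date, value). Returns the first row of each run of equal values,
    ordering by date and breaking ties by the row's original position."""
    numbered = [(d, rn, value) for rn, (d, value) in enumerate(rows, start=1)]
    numbered.sort(key=lambda r: (r[0], r[1]))   # order by date, rn
    result = []
    previous = _NO_PREVIOUS
    for d, _rn, value in numbered:
        if previous is _NO_PREVIOUS or previous != value:
            result.append((d, value))
        previous = value
    return result

datatable = [
    (date(2015, 1, 1), 1),
    (date(2015, 2, 1), 1),
    (date(2016, 1, 1), 2),
    (date(2016, 1, 1), 2),
    (date(2017, 1, 1), 1),
]
for row in first_rows_of_runs(datatable):
    print(row)
```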
I have a table wherein I have to report the present status and the date from which this status is applicable.
Example:
Status date
1 26 July
1 24 July
1 22 July
2 21 July
2 19 July
1 16 July
0 14 July
Given this, I want to display the current status as 1 and the date as 22 July. I am not sure how to go about this.
Status date
1 25 July
1 24 July
1 20 July
In this case, I want to show the status as 1 and date as 20th July
This should pull what you need using very standard SQL:
-- Get the oldest date that is the current Status
select Status, min(date) as date
from MyTable
where date > (
-- Get the most recent date that isn't the current Status
select max(date)
from MyTable
where Status != (
-- Get the current Status
select Status -- May need max/min here for multiple statuses on same date
from MyTable
where date = (
-- Get the most recent date
select max(date)
from MyTable
)
)
)
group by Status
I'm assuming that the date column is of a data type suitable for sorting properly (as in, not a string, unless you can cast it).
This is a little inelegant, but it should work
SELECT status, date
FROM my_table t
WHERE status = ALL (SELECT status
FROM my_table
WHERE date = ALL(SELECT MAX(date) FROM my_table))
AND date = ALL (SELECT MIN(date)
FROM my_table t1
WHERE t1.status = t.status
AND NOT EXISTS (SELECT *
FROM my_table t2
WHERE t2.date > t1.date AND t2.status <> t1.status))
Another option is to use a window function like LEAD (or LAG depending on how you order your results). In this example we mark the row when the status changes with the date, order the results and exclude rows other than the first one:
with test_data as (
select 1 status, date '2012-07-26' status_date from dual union all
select 1 status, date '2012-07-24' status_date from dual union all
select 1 status, date '2012-07-22' status_date from dual union all
select 2 status, date '2012-07-21' status_date from dual union all
select 2 status, date '2012-07-19' status_date from dual union all
select 1 status, date '2012-07-16' status_date from dual union all
select 0 status, date '2012-07-14' status_date from dual)
select status, as_of
from (
select status
, case when status != lead(status) over (order by status_date desc) then status_date else null end as_of
from test_data
order by as_of desc nulls last
)
where rownum = 1;
Addendum:
The LEAD and LAG functions accept two more parameters: offset and default. The offset defaults to 1, and default defaults to null. The default allows you to determine what value to consider when you are at the beginning or end of the result set. In your case when the status has never changed, a default is needed. In this example I supplied -1 as a status default because I am assuming that status value is not part of your expected set:
with test_data as (
select 1 status, date '2012-07-25' status_date from dual union all
select 1 status, date '2012-07-24' status_date from dual union all
select 1 status, date '2012-07-20' status_date from dual)
select status, as_of
from (
select status
, case when status != lead(status,1,-1) over (order by status_date desc) then status_date else null end as_of
from test_data
order by as_of desc nulls last
)
where rownum = 1;
You can play around with the case condition (equals/not equals), the order by clause in the lead function, and the desired default to accomplish your needs.
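A rough Python sketch of the LEAD-with-default logic may help when experimenting with those choices. The function name `current_status_since` is an assumption for illustration, and -1 is the same assumed "never a real status" default used in the query.

```python
from datetime import date

def current_status_since(rows, default=-1):
    """rows: list of (status, status_date).
    Walks from newest to oldest and returns (status, date) for the first row whose
    next-older status differs; `default` is compared against when no older row exists."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    for i, (status, status_date) in enumerate(ordered):
        older = ordered[i + 1][0] if i + 1 < len(ordered) else default
        if status != older:
            return status, status_date
    return None  # only reachable when rows is empty

changed = [
    (1, date(2012, 7, 26)), (1, date(2012, 7, 24)), (1, date(2012, 7, 22)),
    (2, date(2012, 7, 21)), (2, date(2012, 7, 19)),
    (1, date(2012, 7, 16)), (0, date(2012, 7, 14)),
]
never_changed = [(1, date(2012, 7, 25)), (1, date(2012, 7, 24)), (1, date(2012, 7, 20))]

print(current_status_since(changed))
print(current_status_since(never_changed))
```

Without the default, the oldest row would have nothing to compare against and the never-changed case would return no row at all, which is the gap the addendum closes.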