Oracle optimise SQL query - Multiple Max() - sql

I have a table from which I first need to select rows by max(event_date), then
filter those rows by max(event_sequence), and then filter again by max(event_number).
I wrote the following query, which works but is slow.
Here is the query:
SELECT DISTINCT a.stuid,
a.prog,
a.stu_prog_id,
a.event_number,
a.event_date,
a.event_sequence,
a.prog_status
FROM table1 a
WHERE a.event_date=
(SELECT max(b.event_date)
FROM table1 b
WHERE a.stuid=b.stuid
AND a.prog=b.prog
AND a.stu_prog_id=b.stu_prog_id)
AND a.event_sequence=
(SELECT max(b.event_sequence)
FROM table1 b
WHERE a.stuid=b.stuid
AND a.prog=b.prog
AND a.stu_prog_id=b.stu_prog_id
AND a.event_date=b.event_date)
AND a.event_number=
(SELECT max(b.event_number)
FROM table1 b
WHERE a.stuid=b.stuid
AND a.prog=b.prog
AND a.stu_prog_id=b.stu_prog_id
AND a.event_date=b.event_date
AND a.event_sequence=b.event_sequence);
I was wondering, is there a faster way to get the data?
I am using Oracle 12c.

You could try rephrasing your query using analytic functions:
SELECT
stuid,
prog,
stu_prog_id,
event_number,
event_date,
event_sequence,
prog_status
FROM
(
SELECT t.*,
RANK() OVER (PARTITION BY stuid, prog, stu_prog_id
ORDER BY event_date DESC) rnk1,
RANK() OVER (PARTITION BY stuid, prog, stu_prog_id, event_date
ORDER BY event_sequence DESC) rnk2,
RANK() OVER (PARTITION BY stuid, prog, stu_prog_id, event_date, event_sequence
ORDER BY event_number DESC) rnk3
FROM table1 t
) t
WHERE rnk1 = 1 AND rnk2 = 1 AND rnk3 = 1;
Note: I don't actually know if you really need all three subqueries there. Adding sample data to your question might help someone else improve upon the solution I have given above.

I think you want a simple row_number() or rank():
select t1.*
from (select t1.*,
rank() over (partition by stuid, prog, stu_prog_id
order by event_date desc, event_sequence desc, event_number desc
) as seqnum
from table1 t1
) t1
where seqnum = 1;
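One way to check that the single-window form picks the same rows as the three nested MAX() subqueries is to run both on a small data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle (SQLite 3.25 or later is assumed, for window function support), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for table1 (not Oracle; needs SQLite 3.25+).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE table1 (
           stuid INTEGER, prog TEXT, stu_prog_id INTEGER,
           event_number INTEGER, event_date TEXT,
           event_sequence INTEGER, prog_status TEXT)"""
)
# Hypothetical rows:
# (stuid, prog, stu_prog_id, event_number, event_date, event_sequence, prog_status)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "BSC", 10, 1, "2020-01-01", 1, "A"),
        (1, "BSC", 10, 2, "2020-02-01", 1, "B"),
        (1, "BSC", 10, 1, "2020-02-01", 2, "C"),
        (1, "BSC", 10, 2, "2020-02-01", 2, "D"),  # latest date, then seq, then number
        (2, "MSC", 20, 5, "2019-05-01", 3, "E"),  # latest date wins over row F
        (2, "MSC", 20, 9, "2019-04-01", 9, "F"),
    ],
)

# Shared correlation predicate for the three subqueries.
group_match = "b.stuid = a.stuid AND b.prog = a.prog AND b.stu_prog_id = a.stu_prog_id"

# The question's approach: three correlated MAX() subqueries.
nested_sql = f"""
    SELECT a.stuid, a.prog_status
    FROM table1 a
    WHERE a.event_date = (SELECT MAX(b.event_date) FROM table1 b
                          WHERE {group_match})
      AND a.event_sequence = (SELECT MAX(b.event_sequence) FROM table1 b
                              WHERE {group_match}
                                AND b.event_date = a.event_date)
      AND a.event_number = (SELECT MAX(b.event_number) FROM table1 b
                            WHERE {group_match}
                              AND b.event_date = a.event_date
                              AND b.event_sequence = a.event_sequence)
    ORDER BY a.stuid
"""

# The answer's approach: one RANK() ordered by all three columns.
ranked_sql = """
    SELECT stuid, prog_status
    FROM (SELECT t.*,
                 RANK() OVER (PARTITION BY stuid, prog, stu_prog_id
                              ORDER BY event_date DESC,
                                       event_sequence DESC,
                                       event_number DESC) AS seqnum
          FROM table1 t)
    WHERE seqnum = 1
    ORDER BY stuid
"""

nested = conn.execute(nested_sql).fetchall()
ranked = conn.execute(ranked_sql).fetchall()
print(nested == ranked, ranked)  # True [(1, 'D'), (2, 'E')]
```

Both forms agree here because ordering by the three columns in sequence is the same "latest date, then highest sequence, then highest number" comparison that the nested subqueries perform one level at a time.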

If several records share the maximum EVENT_DATE, EVENT_SEQUENCE and EVENT_NUMBER, you can use DENSE_RANK in Tim's solution, or use the following query, which computes each maximum with an analytic MAX() and compares it with the original column.
SELECT DISTINCT
A.STUID,
A.PROG,
A.STU_PROG_ID,
A.EVENT_NUMBER,
A.EVENT_DATE,
A.EVENT_SEQUENCE,
A.PROG_STATUS
FROM
(
SELECT
A.STUID,
A.PROG,
A.STU_PROG_ID,
A.EVENT_NUMBER,
A.EVENT_DATE,
A.EVENT_SEQUENCE,
A.PROG_STATUS,
MAX(A.EVENT_DATE) OVER(
PARTITION BY A.STUID, A.PROG, A.STU_PROG_ID
) AS MAX_EVENT_DATE,
MAX(A.EVENT_SEQUENCE) OVER(
PARTITION BY A.STUID, A.PROG, A.STU_PROG_ID, A.EVENT_DATE
) AS MAX_EVENT_SEQUENCE,
MAX(A.EVENT_NUMBER) OVER(
PARTITION BY A.STUID, A.PROG, A.STU_PROG_ID, A.EVENT_DATE, A.EVENT_SEQUENCE
) AS MAX_EVENT_NUMBER
FROM
TABLE1 A
) A
WHERE
A.MAX_EVENT_DATE = A.EVENT_DATE
AND A.MAX_EVENT_SEQUENCE = A.EVENT_SEQUENCE
AND A.MAX_EVENT_NUMBER = A.EVENT_NUMBER;
Cheers!!

Since you are using Oracle 12c, you can use the
[ OFFSET offset { ROW | ROWS } ]
[ FETCH { FIRST | NEXT } [ { rowcount | percent PERCENT } ]
{ ROW | ROWS } { ONLY | WITH TIES } ]
syntax, like this:
SELECT DISTINCT a.stuid,
a.prog,
a.stu_prog_id,
a.event_number,
a.event_date,
a.event_sequence,
a.prog_status
FROM table1 a
ORDER BY event_date DESC, event_sequence DESC, event_number DESC
FETCH FIRST 1 ROW ONLY;
The WITH TIES clause is not needed in your case, since you are selecting DISTINCT rows, and OFFSET is not needed either, since the starting point is simply the first row of the descending order. The keywords ROW and ROWS are interchangeable, even for a plural count, so FETCH FIRST 5 ROW ONLY (ROW without the S) is also valid. Note that this query returns a single row for the whole table, not one row per stuid, prog, stu_prog_id group.

Related

Select rows based on distinct values of nested field in BigQuery

I have a table in BigQuery which looks like this:
The sequence field is a repeated RECORD. I want to select one row per stepName but if there are multiple rows per step name, I want to choose the one where sequence.step.elapsedSeconds and sequence.step.elapsedMinutes are not null, otherwise select the rows where these columns are null.
As shown in the image above, I want to select row no. 2, 4 and 5. I have calculated ROW_NUMBER like this: ROW_NUMBER() OVER(PARTITION BY step.stepName) AS RowNum.
Here's my query so far in trying to filter out the unwanted rows:
WITH DistinctRows AS
(
select timestamp,
ARRAY (
SELECT
STRUCT(
STRUCT(
step.elapsedSeconds,
step.elapsedMinutes,
) as step
)
FROM
UNNEST(source_table.sequence) AS sequence
) AS sequence,
ROW_NUMBER() OVER(PARTITION BY step.stepName) AS RowNum
from source_table,
unnest(sequence) as previousCalls
order by timestamp asc
)
SELECT *
FROM DistinctRows,
unnest(sequence) as sequence
where (rowNum = 1 and (step.elapsedSeconds is null and step.elapsedMinutes is null))
or (RowNum > 1 and step.elapsedSeconds is not null and step.elapsedMinutes is not null)
order by timestamp asc
I need help in figuring out how to filter out the rows like no. 1 and 3 and would appreciate some help.
Thanks in advance.
Hmmm . . . Assuming that stepname is not part of the repeated column:
SELECT dr.* EXCEPT (sequence),
(SELECT seq
FROM unnest(dr.sequence) seq
ORDER BY seq.step.elapsedSeconds DESC NULLS LAST,
seq.step.elapsedMinutes DESC NULLS LAST
LIMIT 1
) as sequence
FROM DistinctRows dr
ORDER BY timestamp asc;
If stepname is part of sequence, then the subquery would reaggregate:
SELECT dr.* EXCEPT (sequence),
(SELECT ARRAY_AGG(s.seq ORDER BY s.seq.stepName)
FROM (SELECT seq,
ROW_NUMBER() OVER (PARTITION BY seq.stepName
ORDER BY seq.step.elapsedSeconds DESC NULLS LAST, seq.step.elapsedMinutes DESC NULLS LAST
) as seqnum
FROM unnest(dr.sequence) seq
) s
WHERE seqnum = 1
) as sequence
FROM DistinctRows dr
ORDER BY timestamp asc

Get the last time a value has changed in Google BigQuery

I have an employee database which contains records about employees. The fields are :
employee_identifier
employee_salary
date_of_the_record
I would like to get, for each record, the date of the last change in employee_salary. Which SQL query would do this?
I have tried multiple subqueries, but could not get them to work.
Below is for BigQuery Standard SQL
#standardSQL
SELECT * EXCEPT(arr),
(SELECT MAX(date_of_the_record) FROM UNNEST(arr)
WHERE employee_salary != t.employee_salary
) AS last_change_in_employee_salary
FROM (
SELECT *, ARRAY_AGG(STRUCT(employee_salary, date_of_the_record)) OVER(win) arr
FROM `project.dataset.employee_database`
WINDOW win AS (PARTITION BY employee_identifier ORDER BY date_of_the_record)
) t
Use row_number():
with cte as
(
select *,
row_number() over (partition by employee_identifier order by date_of_the_record desc) rn
from table_name
)
select * from cte where rn = 1;
You can also do this without a subquery. If you want all the columns:
SELECT as value ARRAY_AGG(t ORDER BY date_of_the_record DESC LIMIT 1)[ordinal(1)]
FROM t t
GROUP BY employee_identifier;
If you just want the date, use GROUP BY:
SELECT employee_identifier, MAX(date_of_the_record)
FROM t t
GROUP BY employee_identifier;
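If "last change" is read as the most recent date, at or before each record, on which the salary differed from the employee's previous record, a LAG()-based query is another option. This is an alternative to the answers above, not a restatement of them. The sketch below is a minimal illustration that runs on Python's built-in sqlite3 module (SQLite 3.25+ assumed) rather than BigQuery; the table name follows the first answer and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite stand-in (not BigQuery); sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE employee_database (
           employee_identifier INTEGER,
           employee_salary INTEGER,
           date_of_the_record TEXT)"""
)
conn.executemany(
    "INSERT INTO employee_database VALUES (?, ?, ?)",
    [
        (1, 1000, "2020-01-01"),
        (1, 1000, "2020-02-01"),
        (1, 1200, "2020-03-01"),  # salary changes here
        (1, 1200, "2020-04-01"),
        (2, 900, "2020-01-01"),   # single record, never changes
    ],
)

last_change_sql = """
    WITH flagged AS (
        -- Attach each record's previous salary for the same employee.
        SELECT *,
               LAG(employee_salary) OVER (
                   PARTITION BY employee_identifier
                   ORDER BY date_of_the_record) AS prev_salary
        FROM employee_database
    )
    SELECT employee_identifier,
           date_of_the_record,
           employee_salary,
           -- Running maximum of the dates on which the salary changed;
           -- stays NULL until the first change is seen.
           MAX(CASE WHEN prev_salary <> employee_salary
                    THEN date_of_the_record END) OVER (
               PARTITION BY employee_identifier
               ORDER BY date_of_the_record) AS last_change
    FROM flagged
    ORDER BY employee_identifier, date_of_the_record
"""

last_changes = conn.execute(last_change_sql).fetchall()
for row in last_changes:
    print(row)
```

Records before an employee's first salary change get NULL (None in Python) in last_change, because the comparison with a missing previous salary is never true.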

How to get single closest value for each column type in DB2

I have this query:
SELECT * FROM TABLE1 WHERE KEY_COLUMN='NJCRF' AND TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3') AND DATE_EFFECTIVE_COLUMN<='2016-09-17'
I get about 12 rows as a result.
How can I get the row closest to DATE_EFFECTIVE_COLUMN for each TYPE_COLUMN? In this case, how do I get three rows, one per type, each being the closest to the effective date?
UPDATE: I could use TOP if I had to go over only single type, but I have three at this moment and for each of them I need to get closest time result.
Hope I made it clear, let me know if you need more info.
If I understand correctly, you can use ROW_NUMBER():
SELECT t.*
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY TYPE_COLUMN ORDER BY DATE_EFFECTIVE_COLUMN DESC) as seqnum
FROM TABLE1 t
WHERE KEY_COLUMN = 'NJCRF' AND
TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3') AND
DATE_EFFECTIVE_COLUMN <= '2016-09-17'
) t
WHERE seqnum = 1;
If you want three records per type, just use seqnum <= 3.
I like ROW_NUMBER() for this. You want to partition by TYPE, which will start the row count over for each type, then order by DATE_EFFECTIVE desc, and take only the highest date (the first row):
SELECT *
FROM (
SELECT T.*,
ROW_NUMBER() over (PARTITION BY TYPE_COLUMN ORDER BY DATE_EFFECTIVE_COLUMN desc) RN
FROM TABLE1 T
WHERE KEY_COLUMN = 'NJCRF'
AND TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3')
AND DATE_EFFECTIVE_COLUMN <= '2016-09-17'
) A
WHERE RN = 1
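The ROW_NUMBER() pattern from these answers can be tried on a toy table. The sketch below uses Python's built-in sqlite3 module instead of DB2 (SQLite 3.25+ assumed); VALUE_COLUMN, the helper closest_per_type and the sample rows are hypothetical, added only so there is something to look at.

```python
import sqlite3

# In-memory SQLite stand-in for TABLE1 (not DB2); VALUE_COLUMN is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE TABLE1 (
           KEY_COLUMN TEXT, TYPE_COLUMN TEXT,
           DATE_EFFECTIVE_COLUMN TEXT, VALUE_COLUMN INTEGER)"""
)
conn.executemany(
    "INSERT INTO TABLE1 VALUES (?, ?, ?, ?)",
    [
        ("NJCRF", "SCORE1", "2016-09-01", 10),
        ("NJCRF", "SCORE1", "2016-09-15", 11),
        ("NJCRF", "SCORE1", "2016-09-20", 12),  # after the cutoff, filtered out
        ("NJCRF", "SCORE2", "2016-08-30", 20),
        ("NJCRF", "SCORE3", "2016-09-17", 30),  # exactly on the cutoff, kept
        ("NJCRF", "SCORE3", "2016-07-01", 31),
        ("OTHER", "SCORE1", "2016-09-16", 99),  # different key, filtered out
    ],
)


def closest_per_type(conn, cutoff, per_type=1):
    """Return the per_type most recent rows on or before cutoff for each type."""
    sql = """
        SELECT TYPE_COLUMN, DATE_EFFECTIVE_COLUMN, VALUE_COLUMN
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (PARTITION BY TYPE_COLUMN
                                        ORDER BY DATE_EFFECTIVE_COLUMN DESC) AS seqnum
              FROM TABLE1 t
              WHERE KEY_COLUMN = 'NJCRF'
                AND TYPE_COLUMN IN ('SCORE1', 'SCORE2', 'SCORE3')
                AND DATE_EFFECTIVE_COLUMN <= ?)
        WHERE seqnum <= ?
        ORDER BY TYPE_COLUMN, DATE_EFFECTIVE_COLUMN DESC
    """
    return conn.execute(sql, (cutoff, per_type)).fetchall()


print(closest_per_type(conn, "2016-09-17"))
# [('SCORE1', '2016-09-15', 11), ('SCORE2', '2016-08-30', 20), ('SCORE3', '2016-09-17', 30)]
```

Passing per_type=3 corresponds to the seqnum <= 3 variant mentioned above; types with fewer qualifying rows simply return fewer.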

How to Select Top 100 rows in Oracle?

My requirement is to get each client's latest order, and then get top 100 records.
I wrote one query as below to get latest orders for each client. Internal query works fine. But I don't know how to get first 100 based on the results.
SELECT * FROM (
SELECT id, client_id, ROW_NUMBER() OVER(PARTITION BY client_id ORDER BY create_time DESC) rn
FROM order
) WHERE rn=1
Any ideas? Thanks.
Assuming that create_time contains the time the order was created, and you want the 100 clients with the latest orders, you can:
add the create_time in your innermost query
order the results of your outer query by the create_time desc
add an outermost query that filters the first 100 rows using ROWNUM
Query:
SELECT * FROM (
SELECT * FROM (
SELECT
id,
client_id,
create_time,
ROW_NUMBER() OVER(PARTITION BY client_id ORDER BY create_time DESC) rn
FROM order
)
WHERE rn=1
ORDER BY create_time desc
) WHERE rownum <= 100
UPDATE for Oracle 12c
With release 12.1, Oracle introduced "real" Top-N queries. Using the new FETCH FIRST... syntax, you can also use:
SELECT * FROM (
SELECT
id,
client_id,
create_time,
ROW_NUMBER() OVER(PARTITION BY client_id ORDER BY create_time DESC) rn
FROM order
)
WHERE rn = 1
ORDER BY create_time desc
FETCH FIRST 100 ROWS ONLY;
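The two stages above (latest order per client, then the most recent of those) can be sketched on a small data set. This example runs on Python's built-in sqlite3 module rather than Oracle, so LIMIT stands in for ROWNUM / FETCH FIRST; the table is named orders here because ORDER is a reserved word, and both that name and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stand-in (not Oracle); table name "orders" is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, client_id TEXT, create_time TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "A", "2021-01-01"),
        (2, "A", "2021-03-01"),  # latest order for client A
        (3, "B", "2021-02-01"),  # latest order for client B
        (4, "C", "2021-04-01"),  # latest order for client C
        (5, "C", "2021-01-15"),
        (6, "B", "2021-01-20"),
    ],
)


def latest_orders(conn, top_n):
    """Latest order per client, then the top_n most recent of those."""
    sql = """
        SELECT id, client_id, create_time
        FROM (SELECT id, client_id, create_time,
                     ROW_NUMBER() OVER (PARTITION BY client_id
                                        ORDER BY create_time DESC) AS rn
              FROM orders)
        WHERE rn = 1                 -- stage 1: one row per client
        ORDER BY create_time DESC    -- stage 2: most recent clients first
        LIMIT ?                      -- SQLite's counterpart of FETCH FIRST n ROWS ONLY
    """
    return conn.execute(sql, (top_n,)).fetchall()


print(latest_orders(conn, 2))  # [(4, 'C', '2021-04-01'), (2, 'A', '2021-03-01')]
```

The important detail is that the row limit is applied after the ORDER BY, which is what the extra wrapping query achieves in the ROWNUM version.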
You should use ROWNUM in Oracle to do what you want:
where rownum <= 100
See also these answers:
limit in oracle
select top in oracle
select top in oracle 2
As Moneer Kamal said, you can do that simply:
SELECT id, client_id FROM order
WHERE rownum <= 100
ORDER BY create_time DESC;
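Note that the ordering is applied after the 100 rows have been selected, so this returns 100 arbitrary rows in sorted order rather than the 100 most recent ones. It may still be useful if you do not care which rows you get.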
Update:
To use order by with rownum you have to write something like this:
SELECT * from (SELECT id, client_id FROM order ORDER BY create_time DESC) WHERE rownum <= 100;
First 10 customers inserted into the db (table customers), assuming customer_id values are consecutive:
select * from customers where customer_id <
(select min(customer_id)+10 from customers)
Last 10 customers inserted into the db (table customers), under the same assumption:
select * from customers where customer_id >
(select max(customer_id)-10 from customers)
Hope this helps....
To select top n rows updated recently
SELECT *
FROM (
SELECT *
FROM table
ORDER BY UpdateDateTime DESC
)
WHERE ROWNUM < 101;
Try this:
SELECT *
FROM (SELECT * FROM (
SELECT
id,
client_id,
create_time,
ROW_NUMBER() OVER(PARTITION BY client_id ORDER BY create_time DESC) rn
FROM order
)
WHERE rn=1
ORDER BY create_time desc) alias_name
WHERE rownum <= 100
ORDER BY rownum;
Or TOP:
SELECT TOP 2 * FROM Customers; -- but this is not supported in Oracle
NOTE: I suppose that your internal query is fine. Please share your output of this.

How to select distinct records based on condition

I have a table containing duplicate records.
Now I want only one record from each set of duplicates: the one with the latest created date. How can I do it?
use row_number():
select EnquiryId, Name, . . .
from (select t.*,
row_number() over (partition by enquiryID order by CreatedDate desc) as seqnum
from table t
) t
where seqnum = 1;
Use ROW_NUMBER function to tag the duplicate records ordered by CreatedDate, like this:
;with CTE AS (
select *, row_NUMBER() over(
partition by EnquiryID -- add columns on which you want to identify duplicates
ORDER BY CreatedDate DESC) as rn
FROM TABLE
)
select * from CTE
where rn = 1