Ranking over several columns - sql

While optimizing a query, I arrived at the following SQL:
select r.*
from
(
select id, DATA, update_dt, inspection_dt, check_dt,
RANK() OVER
(PARTITION by ID
ORDER BY update_dt DESC, DATA) rank
FROM TABLE
where update_dt < inspection_dt or update_dt < check_dt
) r
where r.rank = 1
Query returns the DATA that corresponds to the latest check_dt.
However, what I want to get is:
1. DATA corresponding to latest check_dt
2. DATA corresponding to latest inspection_dt.
One trivial solution is to write two separate queries, each with a single WHERE condition: one for inspection_dt and one for check_dt. However, that defeats the original intent, which is to shorten the running time.
By observing the source data I noticed a way to implement it: the check date is always later than the inspection date. Knowing that, I could take the record with rank = 1, which gives me the DATA corresponding to the latest CHECK_DT, while the record with the largest rank would correspond to INSPECTION.
However, I'm afraid the data will not always be consistent, so I am looking for a more general solution.

How about this?
select r.*
from (select id, DATA, update_dt, inspection_dt, check_dt,
RANK() OVER (PARTITION by ID
ORDER BY update_dt DESC, DATA
) as rank_upd,
RANK() OVER (PARTITION by ID
ORDER BY inspection_dt DESC, DATA
) as rank_insp
FROM TABLE
) r
where r.rank_upd = 1 or r.rank_insp = 1;
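To see how two independent rankings behave in a single pass, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The table name `t`, the sample rows, and the function name are made up for illustration, and SQLite (3.25 or newer, for window functions) only stands in for whatever database the question actually uses.

```python
import sqlite3

def latest_per_column(rows):
    """Return rows that rank first by update_dt or by inspection_dt per id."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table t (id integer, data text, update_dt text, "
        "inspection_dt text, check_dt text)"
    )
    con.executemany("insert into t values (?, ?, ?, ?, ?)", rows)
    query = """
        select id, data, rank_upd, rank_insp
        from (select id, data,
                     rank() over (partition by id
                                  order by update_dt desc, data) as rank_upd,
                     rank() over (partition by id
                                  order by inspection_dt desc, data) as rank_insp
              from t) r
        where r.rank_upd = 1 or r.rank_insp = 1
        order by id, data
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical sample: ISO date strings sort correctly as text.
sample = [
    (1, "a", "2020-01-03", "2020-01-01", "2020-01-05"),
    (1, "b", "2020-01-02", "2020-01-04", "2020-01-06"),
    (1, "c", "2020-01-01", "2020-01-02", "2020-01-07"),
]

for row in latest_per_column(sample):
    print(row)
```

Each window is evaluated over the same scan of the table, so the row with the latest update_dt and the row with the latest inspection_dt come back from one query. The same pattern should extend to a third ranking on check_dt if that is what is needed.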

Related

SQL Max or empty value grouped by conditions

I have a table like this
and i want my output to look like this
I need to look at the ID and then take the max created date and max completed date for that ID. There are also some cases where the completed date is still empty, so in that case I just need to look at the max created date. I'm not sure how to tackle this; doing a GROUP BY doesn't account for my multiple scenarios.
Use ROW_NUMBER:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY QUOTE_NUMBER
ORDER BY WORKBOOK_CREATED_DATE DESC) rn
FROM yourTable
)
SELECT *
FROM cte
WHERE rn = 1;
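A small runnable sketch of the ROW_NUMBER approach, using Python's sqlite3 module as a stand-in database. Only QUOTE_NUMBER and WORKBOOK_CREATED_DATE come from the answer; the sample values and the function name are assumptions, since the original table was posted as an image.

```python
import sqlite3

def latest_workbook_per_quote(rows):
    """Keep only the most recently created row for each QUOTE_NUMBER."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table yourTable (QUOTE_NUMBER text, WORKBOOK_CREATED_DATE text)"
    )
    con.executemany("insert into yourTable values (?, ?)", rows)
    query = """
        with cte as (
            select *, row_number() over (partition by QUOTE_NUMBER
                                         order by WORKBOOK_CREATED_DATE desc) as rn
            from yourTable
        )
        select QUOTE_NUMBER, WORKBOOK_CREATED_DATE
        from cte          -- filter on the CTE, which is where rn exists
        where rn = 1
        order by QUOTE_NUMBER
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical sample data.
quotes = [
    ("Q1", "2021-01-01"),
    ("Q1", "2021-03-01"),
    ("Q2", "2021-02-01"),
]

print(latest_workbook_per_quote(quotes))
```

Note that the outer SELECT has to read from the CTE: the rn column does not exist in the base table.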

How to get nth record in a sql server table without changing the order?(sql server)

For example, I have data like this (SQL Server):
id name
4 anu
3 lohi
1 pras
2 chand
I want the 2nd record in the table (that is, 3 lohi).
If I use the row_number() function, it changes the order and I get (2 chand).
I want the 2nd record as it appears in the table data.
Can anyone please give me a query for the above scenario?
There is no such thing as the nth row in a table. And for a simple reason: SQL tables represent unordered sets (technically multi-sets because they allow duplicates).
You can do what you want using offset/fetch:
select t.*
from t
order by id desc
offset 1 row fetch first 1 row only;
This assumes that the descending ordering on id is what you want, based on your example data.
You can also do this using row_number():
select t.*
from (select t.*,
row_number() over (order by id desc) as seqnum
from t
) t
where seqnum = 2;
I should note that SQL Server allows you to assign row_number() without having an effective sort using something like this:
select t.*
from (select t.*,
row_number() over (order by (select NULL)) as seqnum
from t
) t
where seqnum = 2;
However, this returns an arbitrary row. There is no guarantee it returns the same row each time it runs, nor that the row is "second" in any meaningful use of the term.
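As a quick check of the two ordered variants on the question's sample data, here is a sketch using Python's sqlite3 module. SQLite is only a stand-in: it has no OFFSET/FETCH, so the sketch uses LIMIT/OFFSET for the same purpose, and the function name is made up for illustration.

```python
import sqlite3

def second_row_by_id_desc(rows):
    """Fetch the 2nd row under an explicit 'order by id desc', two ways."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (id integer, name text)")
    con.executemany("insert into t values (?, ?)", rows)

    # Variant 1: number the rows with an explicit ordering, then filter.
    by_row_number = con.execute("""
        select id, name
        from (select t.*, row_number() over (order by id desc) as seqnum
              from t) t
        where seqnum = 2
    """).fetchone()

    # Variant 2: skip one row and take one (SQLite spelling of offset/fetch).
    by_offset = con.execute(
        "select id, name from t order by id desc limit 1 offset 1"
    ).fetchone()

    con.close()
    return by_row_number, by_offset

# Sample data from the question, in the order it was listed.
rows = [(4, "anu"), (3, "lohi"), (1, "pras"), (2, "chand")]

print(second_row_by_id_desc(rows))
```

Both variants depend on the explicit ORDER BY. The insertion order of the rows plays no part in the result.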

Efficient way to associate each row to latest previous row with condition.(PostgreSQL)

I have a table in which two different kinds of rows are inserted:
Some rows represent a datapoint, a key-value pair at a specific point in time
Other rows represent a new status, which persists into the future until the next status
In the real problem, I have a timestamp column which stores the order of the events. In the SQL Fiddle example I am using a SERIAL integer field, but it is the same idea.
Here is the example:
http://www.sqlfiddle.com/#!17/a0823/6
I am looking for an efficient way to retrieve each row of the first type together with its status (which is given by the latest status row before the current row).
The query on the sqlfiddle link is an example, but uses two subqueries which is very inefficient.
I cannot change the structure of the table nor create other tables, but I can create any necessary index on the table.
I am using PostgreSQL 11.4
The most efficient method is probably to use window functions:
select p.*
from (select p.*,
max(attrvalue) filter (where attrname = 'status_t1') over (partition by grp_1 order by id) as status_t1,
max(attrvalue) filter (where attrname = 'status_t2') over (partition by grp_2 order by id) as status_t2
from (select p.*,
count(*) filter (where attrname = 'status_t1') over (order by id) as grp_1,
count(*) filter (where attrname = 'status_t2') over (order by id) as grp_2
from people p
) p
) p
where attrname not in ('status_t1', 'status_t2');
Here is a db<>fiddle.
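The two-step window logic can be tried on a toy table. The sketch below is a simplified version with a single status type, run through Python's sqlite3 module (SQLite 3.25+ accepts the same FILTER ... OVER syntax for aggregate window functions). The sample rows and the function name are assumptions; only the people table and its id, attrname and attrvalue columns are taken from the query above.

```python
import sqlite3

def datapoints_with_status(rows):
    """Attach to each datapoint the latest status_t1 row that precedes it."""
    con = sqlite3.connect(":memory:")
    con.execute("create table people (id integer, attrname text, attrvalue text)")
    con.executemany("insert into people values (?, ?, ?)", rows)
    query = """
        select id, attrname, attrvalue, status_t1
        from (select p.*,
                     -- within a group, the only status row is the first one
                     max(attrvalue) filter (where attrname = 'status_t1')
                         over (partition by grp_1 order by id) as status_t1
              from (select p.*,
                           -- running count of status rows = group number
                           count(*) filter (where attrname = 'status_t1')
                               over (order by id) as grp_1
                    from people p) p
             ) p
        where attrname <> 'status_t1'
        order by id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical events: a datapoint before any status, then two statuses.
events = [
    (1, "temp", "19"),
    (2, "status_t1", "open"),
    (3, "temp", "20"),
    (4, "temp", "21"),
    (5, "status_t1", "closed"),
    (6, "temp", "22"),
]

for row in datapoints_with_status(events):
    print(row)
```

The inner running count increases by one at every status row, so each status and the datapoints that follow it share a group number. Datapoints that come before the first status fall into group 0 and get a NULL status.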

How to get first and last record from same group in SQL Server?

I'm a new SQL user and need help.
Let's say I have a vehicle number 123 and I've traveled from Region 3 to final destination Region 4. In between, I've visited Region 1 and 5 as well but that's not my concern.
Simple example would be as follow.
Original Table
Desired Output
How can this be done in SQL query?
You have a sequence number so you can use some form of aggregation. One method is:
select records,
max(case when sequence = 1 then fromregion end) as fromregion,
max(case when sequence = max_sequence then toregion end) as toregion
from (select t.*, max(sequence) over (partition by records) as max_sequence
from t
) t
group by records;
Unfortunately, SQL Server doesn't offer "first()" or "last()" as aggregation functions. But it does support first_value() as a window function. This allows you to do the logic without a subquery:
select distinct records,
first_value(fromRegion) over (partition by records order by sequence) as fromregion,
first_value(toRegion) over (partition by records order by sequence desc) as toregion
from t;
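Here is a minimal sketch of the first_value() variant, run against SQLite through Python's sqlite3 module as a stand-in for SQL Server. The column names follow the query above; the sample trips and the function name are invented, since the original table was posted as an image.

```python
import sqlite3

def first_and_last_region(rows):
    """Return (records, first fromregion, last toregion) per vehicle."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table t (records integer, sequence integer, "
        "fromregion integer, toregion integer)"
    )
    con.executemany("insert into t values (?, ?, ?, ?)", rows)
    query = """
        select distinct records,
               first_value(fromregion) over (partition by records
                                             order by sequence) as fromregion,
               first_value(toregion) over (partition by records
                                           order by sequence desc) as toregion
        from t
        order by records
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical trips: vehicle 123 goes 3 -> 1 -> 5 -> 4, vehicle 456 goes 2 -> 7.
trips = [
    (123, 1, 3, 1),
    (123, 2, 1, 5),
    (123, 3, 5, 4),
    (456, 1, 2, 7),
]

print(first_and_last_region(trips))
```

Sorting descending and taking first_value() is used instead of last_value() because the default window frame ends at the current row, so last_value() with a plain ORDER BY would just return the current row's value.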

Find the second largest value with Groupings

In SQL Server, I am attempting to pull the second latest NOTE_ENTRY_DT_TIME (items highlighted in screenshot). With the query written below it still pulls the latest date (I believe it's because of the grouping but the grouping is required to join later). What is the best method to achieve this?
SELECT
hop.ACCOUNT_ID,
MAX(hop.NOTE_ENTRY_DT_TIME) AS latest_noteid
FROM
NOTES hop
WHERE
hop.GEN_YN IS NULL
AND hop.NOTE_ENTRY_DT_TIME < (SELECT MAX(hope.NOTE_ENTRY_DT_TIME)
FROM NOTES hope
WHERE hop.GEN_YN IS NULL)
GROUP BY
hop.ACCOUNT_ID
Data sample in the table:
One of the "easier" ways to get the Nth row in a group is to use a CTE and ROW_NUMBER:
WITH CTE AS(
SELECT Account_ID,
Note_Entry_Dt_Time,
ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Note_Entry_Dt_Time DESC) AS RN
FROM dbo.YourTable)
SELECT Account_ID,
Note_Entry_Dt_Time
FROM CTE
WHERE RN = 2;
Of course, if an ACCOUNT_ID only has 1 row, then it will not be returned in the result set.
The OP's statement "The row will not always be 2." from the comments conflicts with their statement "I am attempting to pull the second latest NOTE_ENTRY_DT_TIME" in the question. At a best guess, this means that the OP has several rows sharing the same date, any of which could be the "latest". If so, they would simply need to replace ROW_NUMBER with DENSE_RANK. Their sample data, however, doesn't suggest this is the case.
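The difference between ROW_NUMBER and DENSE_RANK only shows up when timestamps tie. The sketch below illustrates it with Python's sqlite3 module as a stand-in for SQL Server; the notes table contents and the function name are hypothetical.

```python
import sqlite3

ALLOWED_RANKINGS = {"row_number", "dense_rank"}

def second_latest(rows, ranking="row_number"):
    """Return rows ranked 2nd per account_id under the chosen function."""
    if ranking not in ALLOWED_RANKINGS:
        raise ValueError(f"unsupported ranking function: {ranking}")
    con = sqlite3.connect(":memory:")
    con.execute("create table notes (account_id integer, note_entry_dt_time text)")
    con.executemany("insert into notes values (?, ?)", rows)
    # The function name is interpolated only after the whitelist check above.
    query = f"""
        with cte as (
            select account_id, note_entry_dt_time,
                   {ranking}() over (partition by account_id
                                     order by note_entry_dt_time desc) as rn
            from notes
        )
        select account_id, note_entry_dt_time
        from cte
        where rn = 2
        order by account_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical data: account 1 has two notes with the same latest timestamp,
# account 2 has a single note.
notes = [
    (1, "2021-01-03 10:00"),
    (1, "2021-01-03 10:00"),
    (1, "2021-01-01 09:00"),
    (2, "2021-02-01 08:00"),
]

print(second_latest(notes, "row_number"))
print(second_latest(notes, "dense_rank"))
```

With ROW_NUMBER the tied timestamp takes positions 1 and 2, so the "second" row still carries the latest time. DENSE_RANK gives both tied rows rank 1 and returns the next distinct time. In both cases account 2 is missing from the result because it has only one row.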
You can use window functions:
select *
from (
select
n.*,
row_number() over(partition by account_id order by note_entry_dt_time desc) rn
from notes n
) t
where rn = 2