Query uses rank() needs optimization - sql

SELECT *
FROM (
    SELECT DISTINCT
        DocManREPORT_View.DOCINPUTDATE,
        DocManREPORT_View.REACTIVATEDATE,
        DocManREPORT_View.TRACENO,
        DocManREPORT_View.CLIENTNAME,
        DocManREPORT_View.DOCUMENTID,
        DocManREPORT_View.BARCODEID,
        DocManREPORT_View.INPUTMODE,
        DocManREPORT_View.INPUTSOURCE,
        PI.start_time,
        RANK() OVER (PARTITION BY process_instance_id
                     ORDER BY last_modified_date DESC) rank,
        PI.STATUS AS PROCESSSTATUS
    FROM DocManREPORT_View
    INNER JOIN PROCESS_INSTANCE PI
        ON (pi.instance_id = DocManREPORT_View.process_instance_id)
)
WHERE rank = 1;

I suspect the DISTINCT clause is hurting performance. I would recommend getting rid of it by including the columns it deduplicates on in the PARTITION BY clause, and then checking what you get.

If you can, try to compute
RANK() OVER (PARTITION BY process_instance_id
ORDER BY last_modified_date desc) rank,
inside the view itself, since I think the view already has all the data needed for this step.
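
A minimal sketch of the "rank inside the view" idea, run against SQLite through Python's sqlite3 module as a stand-in for the real database (window functions need SQLite 3.25 or later). The table doc_events, the view doc_report_ranked, and the sample rows are hypothetical; the alias rnk is used instead of rank to avoid clashing with the function name. Whether this actually helps performance depends on the real view and the optimizer, so treat it as something to try rather than a guaranteed fix.

```python
import sqlite3

# Hypothetical stand-in for the data behind DocManREPORT_View.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc_events (
        process_instance_id INTEGER,
        traceno             TEXT,
        last_modified_date  TEXT
    );
    INSERT INTO doc_events VALUES
        (1, 'T-100', '2020-01-01'),
        (1, 'T-100', '2020-01-05'),
        (2, 'T-200', '2020-02-01'),
        (2, 'T-200', '2020-02-03'),
        (2, 'T-200', '2020-02-02');

    -- The ranking is computed once, inside the view.
    CREATE VIEW doc_report_ranked AS
    SELECT process_instance_id,
           traceno,
           last_modified_date,
           RANK() OVER (PARTITION BY process_instance_id
                        ORDER BY last_modified_date DESC) AS rnk
    FROM doc_events;
""")

# Callers only filter on the precomputed rank; no DISTINCT is needed here
# because each instance has a single most recent row in the sample data.
latest_rows = conn.execute("""
    SELECT process_instance_id, traceno, last_modified_date
    FROM doc_report_ranked
    WHERE rnk = 1
    ORDER BY process_instance_id
""").fetchall()

print(latest_rows)
```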

Related

Get last two rows from a row_number() window function in snowflake

Hopefully, someone can help me...
I'm trying to get the last two values from a row_number() window function. Let's say my results contain row numbers up to 6, for example. How would it be possible to get the rows where the row number is 5 and 6?
Let me know if it can be done with another window function or in another way.
Kind regards,
Using QUALIFY:
SELECT *
FROM tab
QUALIFY ROW_NUMBER() OVER(ORDER BY ... DESC) <= 2;
This approach can be extended to get the last two rows of each partition:
SELECT *
FROM tab
QUALIFY ROW_NUMBER() OVER(PARTITION BY ... ORDER BY ... DESC) <= 2;
You can use top with order by desc like:
select top 2 row_number() over([partition by] [order by]) as rn
from table
order by rn desc
I'd say @Shmiel's answer is the formal and elegant way; just in case, this would be equivalent:
WITH CTE AS (
    SELECT product,
           user_id,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY product DESC) AS RN
    FROM Mytable
)
SELECT product, user_id
FROM CTE
WHERE RN < 3;
Use ORDER BY [order_condition] with DESC so that the last rows get the lowest row numbers, then filter on RN (the row number) to select as many rows as you want.
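
For engines without QUALIFY, the same "number the rows in descending order, keep the first two" idea can be written with a subquery. The sketch below runs it on SQLite through Python's sqlite3 module (SQLite 3.25 or later), which is only a stand-in for Snowflake here; the table tab and its columns id and label are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (id INTEGER, label TEXT);
    INSERT INTO tab VALUES
        (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f');
""")

# Number the rows from the end (ORDER BY id DESC), keep numbers 1 and 2,
# then present them in the original ascending order.
last_two = conn.execute("""
    SELECT id, label
    FROM (
        SELECT id, label,
               ROW_NUMBER() OVER (ORDER BY id DESC) AS rn_desc
        FROM tab
    )
    WHERE rn_desc <= 2
    ORDER BY id
""").fetchall()

print(last_two)
```

With six rows, the rows whose ascending row numbers would be 5 and 6 are the ones kept.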

ROW_NUMBER() with Qualify clause in Vertica

select a.pm_id, a.pm_name
from loc_table a
qualify row_number() over(partition by pm_id order by pm_name asc) =1;
Can we write it this way in Vertica? I tried, but Vertica does not accept the qualify keyword and the FROM clause has to be at the end.
Can anybody explain what the above query does and how we can achieve the same in Vertica?
Vertica does not have the QUALIFY clause.
What it does have is the analytic LIMIT clause.
Rewrite your query as below, and run a simple global search-and-replace if you need this often:
SELECT
a.pm_id
, a.pm_name
FROM loc_table a
LIMIT 1 OVER(PARTITION BY pm_id ORDER BY pm_name ASC);
I think you need a subquery in Vertica:
select pm_id, pm_name
from (select l.pm_id, l.pm_name,
             row_number() over (partition by pm_id order by pm_name asc) as seqnum
      from loc_table l
     ) l
where seqnum = 1;
This is pretty much exactly what QUALIFY does. Just as HAVING filters on aggregated values, QUALIFY filters on the results of window functions.

Row Number() Order Issue

Apologies in advance if this specific scenario has been asked previously, but I can't seem to get these to order properly (which is probably from staring at it for too long).
I'm using Netezza/Oracle. In the data set below, I need order_num to result in 1,2,2,2,2,3,4, i.e. grouping by Department and Desc1. (Desc1 is not unique, as there are different codes for each year, but I'm only interested in the type, not the year.) Among other attempts, I've tried:
row_number () over (partition by a.department order by desc1) order_num
Which orders it alphabetically. I've also ordered by seq_no and desc1 - but that only works if I needed it alphabetically.
Thanks in advance.
Assuming that Country is consistent with the grouping you have shown: if you get the minimum seq_no per country in either a CTE or a subquery, you can use this value in your dense_rank function, e.g.
SELECT
    m.Department,
    m.Desc1,
    m.seq_no,
    m.Country,
    m.beg_date,
    m.end_date,
    dense_rank() OVER(PARTITION BY m.Department ORDER BY mintbl.MinSeq)
FROM dbo.mytable AS m
JOIN ( SELECT min(m.seq_no) AS MinSeq,
              m.Department,
              m.Country
       FROM dbo.mytable AS m
       GROUP BY m.Department, m.Country
     ) AS mintbl ON mintbl.Department = m.Department AND mintbl.Country = m.Country
ORDER BY m.seq_no
You want dense_rank() rather than row_number():
dense_rank() over (partition by a.department order by desc1) order_num
If you want to maintain the seq_no order, you can use a subquery to calculate:
min(seq_no) over (partition by department, desc1) as min_seqnum
Then in the outer query use min_seqnum in the order by of the dense_rank().
Can you not use
dense_rank() over(partition by department, desc1 order by beg_date)
Or...
dense_rank() over(partition by department,desc1 order by seq_no)
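
A rough end-to-end sketch of the "minimum seq_no per group, then dense_rank()" suggestion, run on SQLite through Python's sqlite3 module (SQLite 3.25 or later) rather than Netezza or Oracle. The table mytable and its sample rows are invented so that alphabetical order and seq_no order disagree; it also assumes each desc1 value occupies one contiguous run of seq_no values within a department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (department TEXT, desc1 TEXT, seq_no INTEGER);
    INSERT INTO mytable VALUES
        ('D1', 'Zeta',  1),
        ('D1', 'Alpha', 2),
        ('D1', 'Alpha', 3),
        ('D1', 'Alpha', 4),
        ('D1', 'Alpha', 5),
        ('D1', 'Mid',   6),
        ('D1', 'Beta',  7);
""")

# Inner query: tag every row with the smallest seq_no of its desc1 group.
# Outer query: dense_rank() over that value, so groups are numbered in
# order of first appearance instead of alphabetically.
rows = conn.execute("""
    SELECT seq_no, desc1,
           DENSE_RANK() OVER (PARTITION BY department
                              ORDER BY min_seq_no) AS order_num
    FROM (
        SELECT department, desc1, seq_no,
               MIN(seq_no) OVER (PARTITION BY department, desc1) AS min_seq_no
        FROM mytable
    )
    ORDER BY seq_no
""").fetchall()

order_nums = [order_num for _, _, order_num in rows]
print(order_nums)
```

Ordering the dense_rank() by desc1 directly would instead number 'Alpha' first, which is the alphabetical behaviour described in the question.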

Windowed functions cannot be used in the context of another windowed function or aggregate. SQL Server error

I am trying to get the row number for the rank. Below is the query,
SELECT *
FROM (
SELECT DISTINCT TOP 100 PERCENT rank() OVER (
PARTITION BY o.panel_id
,o.combo_type_code ORDER BY row_number() OVER (
ORDER BY o.panel_id
)
) AS rank
,panel_code
FROM tbk_offer_head o
,tbk_combo_type ct
,tbk_panel p
WHERE o.panel_id = p.panel_id
AND o.combo_type_code = ct.combo_type_code
AND o.panel_id IN (
SELECT p.panel_id
FROM tbk_panel p
WHERE p.campaign_id = 7392
)
) A
WHERE A.rank = 1
ORDER BY panel_code
I get the error "Windowed functions cannot be used in the context of another windowed function or aggregate." How can I solve this problem?
I have no idea what you are really trying to do. But you should definitely learn to use proper, explicit JOIN syntax.
In any case, there is no need to nest the functions. Your logic should be equivalent to:
row_number() over (partition by o.panel_id, o.combo_type_code
order by o.panel_id
) as rank
Why does this use row_number() instead of rank()? Your original order by used row_number() which never has duplicates. Hence, if rank() could use it, the values would all be distinct, and the rank() would be equivalent to row_number() -- even when panel_id is duplicated.
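
To illustrate that argument, here is a small sketch on SQLite through Python's sqlite3 module (SQLite 3.25 or later; a stand-in for SQL Server). The table offers and its rows are hypothetical. It computes the row number in a subquery first, ranks by it in the outer query, and compares the result with a plain row_number() over the same partitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offers (panel_id INTEGER, combo_type_code TEXT, offer TEXT);
    INSERT INTO offers VALUES
        (1, 'A', 'o1'), (1, 'A', 'o2'), (1, 'B', 'o3'),
        (2, 'A', 'o4'), (2, 'A', 'o5'), (2, 'A', 'o6');
""")

# Two-step form of the nested expression: number the rows first, then
# rank within each partition by that (always unique) number.
two_step = conn.execute("""
    SELECT panel_id, combo_type_code,
           RANK() OVER (PARTITION BY panel_id, combo_type_code
                        ORDER BY rn) AS rnk
    FROM (
        SELECT panel_id, combo_type_code,
               ROW_NUMBER() OVER (ORDER BY panel_id) AS rn
        FROM offers
    )
""").fetchall()

# Direct form suggested in the answer.
direct = conn.execute("""
    SELECT panel_id, combo_type_code,
           ROW_NUMBER() OVER (PARTITION BY panel_id, combo_type_code
                              ORDER BY panel_id) AS rnk
    FROM offers
""").fetchall()

# Compare as sorted lists: which row gets which number inside a partition
# is arbitrary in both forms, but the set of numbers handed out is the same.
same_numbering = sorted(two_step) == sorted(direct)
print(same_numbering)
```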

How to write a derived query in Netezza SQL?

I need to query the data by inviteid. For each inviteid I need the top 5 IDs and ID descriptions.
I see that the query I wrote is taking all the time in the world to fetch. I didn't notice an error or anything wrong with it.
The code is:
SELECT count(distinct ID),
IDdesc,
inviteid,
A
FROM (
SELECT
ID,
IDdesc,
inviteid,
RANK() OVER(order by invtypeid asc ) A
FROM Fact_s
--WHERE dateid ='26012013'
GROUP BY invteid,IDdesc,ID
ORDER BY invteid,IDdesc,ID
) B
WHERE A <=5
GROUP BY A, IDDESC, inviteid
ORDER BY A
I'm not sure I understood your requirement completely, but as far as I can tell the group by in the derived table is not necessary (and neither is the order by, as Mark mentioned) because you are using a window function.
And you probably want row_number() instead of rank() in there.
Including the result of rank() in the outer query seems dubious as well.
So this leads to the following statement:
SELECT count(distinct ID),
       IDdesc,
       inviteid
FROM (
    SELECT ID,
           IDdesc,
           inviteid,
           row_number() OVER (order by invtypeid asc) as rn
    FROM Fact_s
) B
WHERE rn <= 5
GROUP BY IDDESC, inviteid;
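
Note that the window above has no PARTITION BY, so rn numbers the whole table. If the goal really is "the top 5 per inviteid" (my reading of the question, not something the answer states), the window would presumably be partitioned by inviteid. A minimal sketch of that variant, on SQLite through Python's sqlite3 module (SQLite 3.25 or later) with invented sample rows and a smaller N:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_s (inviteid INTEGER, id INTEGER,
                         iddesc TEXT, invtypeid INTEGER);
    INSERT INTO fact_s VALUES
        (10, 1, 'one',   3),
        (10, 2, 'two',   1),
        (10, 3, 'three', 2),
        (20, 4, 'four',  2),
        (20, 5, 'five',  1);
""")

TOP_N = 2  # the question asks for 5; 2 keeps the sample data small

# Restart the numbering for every inviteid, then keep the first TOP_N rows.
top_per_invite = conn.execute("""
    SELECT inviteid, id, iddesc
    FROM (
        SELECT inviteid, id, iddesc,
               ROW_NUMBER() OVER (PARTITION BY inviteid
                                  ORDER BY invtypeid ASC) AS rn
        FROM fact_s
    )
    WHERE rn <= ?
    ORDER BY inviteid, rn
""", (TOP_N,)).fetchall()

print(top_per_invite)
```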